Annotating to Extract FAQs

You may have all the FAQs related to your business in a PDF file, but not in the format that SearchAssist requires. In such cases, annotate a few pages of the document to identify the key sections of its content. SearchAssist uses the pattern identified from these annotations to extract the FAQs from the document.

Note: This feature is applicable only when extracting FAQs from PDF documents.

  1. Select a PDF file for extraction.
  2. Select the Annotate & Extract option and click Proceed.
  3. The PDF document is loaded into the Annotation Tool, where you can annotate the various sections of the document.
  4. To annotate, select the text and tag it as follows:
    • Heading: Apply the Heading tag to train the App so that it can identify the questions. The content between any two consecutive headings is extracted as the answer to the preceding heading.
    • Header: Apply the Header tag to train the App so that it can identify and ignore the headers. Avoid marking text as a header at random. Marking other text, such as a footer or a paragraph, as a header produces invalid results.
    • Footer: Apply the Footer tag to train the App so that it can identify and ignore the footers. Avoid marking text as a footer at random. Marking other text, such as a header or a paragraph, as a footer produces invalid results.
    • Exclude: Apply the Exclude tag to a section to prevent it from being extracted.
    • Ignore Page: Apply the Ignore Page tag to pages that should be excluded from extraction.
    • Remove Annotation: Use this option to undo incorrect annotations and start annotating afresh.
  5. The App learns from the headings, headers, and footers you annotate and uses them in the extraction process, so you need not annotate the entire document. Annotate a couple of pages with headings, headers, and footers, then extract and review the questions.
  6. The feature also generates additional document information:
    • Document Info includes the Name, Size, and Number of Pages of the document.
    • Annotation Summary includes the number of annotations marked in each category, for the current page and for the entire document.
  7. After you annotate, click Extract to apply the annotation to the entire document and extract FAQs from it in bulk.
  8. The extracted FAQs are listed under Drafts, which marks the beginning of the FAQ review workflow. Refer to How to use an FAQ Workflow.
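
The tagging rules in step 4 can be pictured with a small Python sketch. This is only an illustration of the rule that the content between two consecutive headings becomes the answer to the preceding heading. It is not SearchAssist's actual extraction logic, which learns a pattern from a few annotated pages. The `Line` class, the tag names, and both function names are hypothetical, and the sketch assumes every line already carries a tag and that the last heading's answer runs to the end of the document.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Line:
    """One line of document text with a hypothetical annotation tag."""
    text: str
    tag: str = "body"  # assumed tag names: heading, header, footer, exclude, body


def extract_faqs(lines):
    """Pair each heading with the body text that follows it, up to the next heading."""
    faqs = []
    question = None
    answer_parts = []
    for line in lines:
        # Headers, footers, and excluded sections never reach the answer text.
        if line.tag in ("header", "footer", "exclude"):
            continue
        if line.tag == "heading":
            # A new heading closes the answer of the preceding heading.
            if question is not None:
                faqs.append((question, " ".join(answer_parts)))
            question = line.text
            answer_parts = []
        elif question is not None:
            answer_parts.append(line.text)
    # Assumption: the last heading's answer extends to the end of the document.
    if question is not None:
        faqs.append((question, " ".join(answer_parts)))
    return faqs


def annotation_summary(lines):
    """Count annotations per category, similar in spirit to the Annotation Summary."""
    return Counter(line.tag for line in lines if line.tag != "body")


sample = [
    Line("Acme Bank Help", "header"),
    Line("How do I reset my password?", "heading"),
    Line("Open Settings and choose Reset Password."),
    Line("A reset link is emailed to you."),
    Line("Page 1", "footer"),
    Line("What are the branch hours?", "heading"),
    Line("Internal note: verify with HR.", "exclude"),
    Line("Branches are open 9 am to 5 pm on weekdays."),
]

faqs = extract_faqs(sample)
summary = annotation_summary(sample)

for question, answer in faqs:
    print(f"Q: {question}\nA: {answer}\n")
```

In this sample, the header, the footer, and the excluded note are dropped, and each question keeps only the body text that sits between it and the next heading. This also shows why tagging a paragraph as a header or footer gives invalid results: that text would silently disappear from the answer.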