You may have all the FAQs related to your business in a PDF file, but not in the format that SearchAssist requires. In that case, annotate a few pages of the document to identify its key sections. SearchAssist uses the pattern identified from these annotations to extract the FAQs from the document.
Note: This feature is applicable only when extracting FAQs from PDF documents.
- Select a PDF file for extraction.
- Select the Annotate & Extract option. Click Proceed.
- The PDF document is loaded into the Annotation Tool, where you can annotate the various sections of the document.
- To annotate, select the text and tag it as follows:
  - Heading: Apply the Heading tag to train the App so that it can identify the questions. The content between any two consecutive headings is extracted as the answer for the preceding heading.
  - Header: Apply the Header tag to train the App so that it can identify and ignore the headers. Avoid random marking of text as headers. Marking text such as a footer or a paragraph as a header produces invalid results.
  - Footer: Apply the Footer tag to train the App so that it can identify and ignore the footers. Avoid random marking of text as footers. Marking text such as a header or a paragraph as a footer produces invalid results.
  - Exclude: Apply the Exclude tag to a section to prevent it from being extracted.
  - Ignore Page: Apply the Ignore Page tag to pages that must be excluded from extraction.
  - Remove Annotation: Use this option to undo incorrect annotations and start annotating afresh.
- The App learns from the headings, headers, and footers you mark and uses them in the extraction process, so you need not annotate the entire document. Annotate a couple of pages with headings, headers, and footers, then extract and review the questions.
- The feature also provides additional document information:
  - Document Info includes the Name, Size, and Number of Pages of the document.
  - Annotation Summary includes the number of annotations marked in each category, for the current page and for the entire document.
- After you annotate, click Extract to apply the annotation to the entire document and extract FAQs from it in bulk.
- The extracted FAQs are listed under Drafts, which marks the beginning of the FAQ review workflow. Refer to How to use an FAQ Workflow.
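
To make the Heading rule concrete, the following is a minimal Python sketch of how heading-based extraction could work in principle. It is not SearchAssist's actual implementation. The `Line` class, the tag names, and the `extract_faqs` function are hypothetical and exist only for this illustration. The sketch assumes each line of text already carries a tag: it skips anything tagged as header, footer, or exclude, treats each heading as a question, and joins the text up to the next heading into the answer.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """One line of document text with a hypothetical annotation tag."""
    text: str
    tag: str = "body"  # assumed tags: heading, header, footer, exclude, body


def extract_faqs(lines):
    """Pair each heading with the body text that follows it,
    up to (but not including) the next heading."""
    faqs = []
    question = None
    answer_parts = []
    for line in lines:
        if line.tag in ("header", "footer", "exclude"):
            continue  # annotated as content to ignore
        if line.tag == "heading":
            if question is not None:
                # Close the previous question before starting a new one.
                faqs.append((question, " ".join(answer_parts)))
            question = line.text
            answer_parts = []
        elif question is not None:
            # Body text belongs to the most recent heading.
            answer_parts.append(line.text)
    if question is not None:
        faqs.append((question, " ".join(answer_parts)))
    return faqs


# Made-up sample content, for illustration only.
sample = [
    Line("Acme Bank FAQ", "header"),
    Line("How do I reset my password?", "heading"),
    Line("Open Settings and choose Reset Password."),
    Line("Page 1", "footer"),
    Line("A reset link is sent to your email."),
    Line("Internal note: do not publish", "exclude"),
    Line("How do I close my account?", "heading"),
    Line("Contact support to close your account."),
]

for question, answer in extract_faqs(sample):
    print(f"Q: {question}\nA: {answer}\n")
```

The sketch also shows why incorrect tagging produces invalid results: a paragraph wrongly tagged as a footer would be dropped from its answer, and a footer left untagged would be merged into it.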