Knowi Documents

Knowi Documents lets you use Knowi's AI capabilities to extract answers to questions from your documents (PDF, DOCX, etc.) and store them in a structured format as a dataset.

This tool connects to the files you've uploaded via the Document AI, so you can store questions and answers from your documents (Ask Question mode) or extract the values of specific fields (Get Data from Documents mode) and save them in a tabular format. Once the dataset is created, it can be used across Knowi like any other dataset.

Add Knowi Documents as a Datasource

  1. Upload your documents in the Document AI before you connect your documents as a datasource.

  2. Select Queries from the left sidebar menu.

  3. Click on the NEW DATASOURCE+ button from the top right corner of the interface.

  4. Select Knowi Documents.

  5. Name the document datasource.

  6. (Optional) Use the Access Control List (ACL) button to specify which documents are included in the dataset.

    Separate datasets can be created for different document groupings. It is possible to explicitly include or exclude specific documents, or apply a regex pattern to match the desired file names.

    For example, the ACL can be used to include only contracts in a "Contracts" datasource.

  7. Click Test Connection to confirm that the connection to the datasource succeeds, click Save, and start querying.
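
The regex option in step 6 can be tried out on a list of file names before entering it in the ACL. The following is a minimal sketch in Python, not Knowi's own matching logic: the file names and the pattern are hypothetical, and the regex flavor Knowi applies may differ from Python's `re` module.

```python
import re

# Hypothetical file names, used only for illustration.
file_names = [
    "contract_acme_2023.pdf",
    "Contract-Globex.docx",
    "syllabus_calculus.pdf",
    "invoice_0042.pdf",
]

# Assumed pattern: names that start with "contract" (any case)
# and end in .pdf or .docx.
pattern = re.compile(r"^contract.*\.(pdf|docx)$", re.IGNORECASE)

# Keep only the file names the pattern matches.
included = [name for name in file_names if pattern.search(name)]
print(included)
```

Checking a pattern this way makes it easier to see which files an include or exclude rule would pick up before the datasource is saved.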

Querying Your Documents

Start a New Query

  1. Select NEW QUERY+ in the top right corner of the queries page.
  2. Choose the Datasource: Select your newly created Knowi Documents datasource from the drop-down menu. It may have a different name if you renamed the datasource upon creation.
  3. Select a Mode

    Knowi Documents provides three modes for querying your documents:

    • Ask Question: This mode lets you ask a question about the contents of your documents, similar to the Document AI chatbot. Here, however, the answer is recorded into a table. The question, answer, and source are all included in a single row of a dataset.

    • Get Data from Documents: This mode extracts specific field values from your documents. You can prompt the AI to search for specific values and input those values into a column of your choosing.

    • Get Batch Data: This mode bulk-extracts prompted data from multiple documents and records the results into a table. Batch mode processes documents one by one, making multiple LLM calls as needed, so context size is not a limiting factor. This allows you to extract larger amounts of tabular/structured data across many files in a single run.

Mode: Ask Question

The Ask Question mode allows you to ask a question of your Knowi Documents using natural language. It operates similarly to the AI assistant chatbot in the Document AI, but the main difference is that the response is recorded into a Knowi dataset. Ask a question related to your document, and the AI provides an answer based on the content it extracts. This response is delivered in a single row, regardless of how many questions you ask.

Inputs:

Steps:

Step 1: Fill in the required input parameters in the query builder.

In this example, we've asked "What is the date of the midterm exam for the calculus class?" of a Knowi Documents datasource that includes course syllabi.

Step 2: Select Preview to review the output and ensure it meets your needs. If necessary, go back to Step 1 to adjust the inputs and refine the results.

The output of "Ask Question" mode will always include three columns (question, answer, and sources) and one row. If you ask multiple questions, they will all be answered together.
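
The shape of this output can be pictured as a one-row table. The sketch below is only an illustration of that shape in Python: the column names come from the description above, while the class, the helper, the way the questions are joined, and the placeholder values are assumptions rather than Knowi's actual behavior.

```python
from dataclasses import dataclass, asdict

@dataclass
class AskQuestionRow:
    # The three columns described for Ask Question mode.
    question: str
    answer: str
    sources: str

def build_result(questions, answer, sources):
    """Return a one-row table, however many questions were asked.

    Joining the questions with spaces and the sources with commas
    is an assumption made for this illustration.
    """
    row = AskQuestionRow(
        question=" ".join(questions),
        answer=answer,
        sources=", ".join(sources),
    )
    return [asdict(row)]

# Placeholder values; a real answer would come from the AI.
rows = build_result(
    ["What is the date of the midterm exam for the calculus class?",
     "Where is the exam held?"],
    answer="<combined answer text>",
    sources=["<source document>"],
)
print(len(rows), list(rows[0].keys()))
```

Two questions go in, yet the result is still a single row with the same three columns.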

Mode: Get Data from Documents

In the Get Data from Documents mode, the AI extracts structured data from your documents and organizes it into a table with custom column names. Simply ask questions and specify the field names for the table output, and the AI will generate the requested data.

Inputs:

Steps:

Step 1: Fill in the required input parameters in the query builder.

Step 2: Select Preview to review the output and ensure it meets your needs. If necessary, go back to Step 1 to adjust the inputs and refine the results.

Step 3: Your data might need additional post-processing. This can be done using the drag-and-drop editor in the Preview view, or in the Cloud9QL Transformations section.

Mode: Get Batch Data

In the Get Batch Data mode, the AI extracts structured data across many documents at once and records the results into a dataset. Get Batch Data mode processes documents one by one and can make multiple AI calls per run. This makes it a better fit for larger document sets and broader extraction, while still keeping the results organized in a table. Batch jobs are designed to scale, but they are constrained by runtime limits (for example, a ~30-minute timeout). If your extraction doesn't finish within the runtime limit or you need to process more documents, you can refine your document selection and run the query again to continue loading additional results.
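
The one-by-one processing and the runtime limit can be pictured with a small loop. This is a conceptual sketch in Python under stated assumptions, not Knowi's implementation: `run_batch`, `fake_extract`, and the fake clock are hypothetical names, and the time budget stands in for the runtime limit mentioned above.

```python
import time

def run_batch(documents, extract, budget_seconds, clock=time.monotonic):
    """Process documents one by one until the time budget runs out.

    Returns the rows collected so far and the documents not yet processed.
    The budget is checked before each document, so a single long
    extraction can still run past it.
    """
    start = clock()
    rows = []
    for index, doc in enumerate(documents):
        if clock() - start >= budget_seconds:
            return rows, documents[index:]
        rows.extend(extract(doc))
    return rows, []

def fake_extract(doc):
    # Hypothetical stand-in for the per-document AI calls.
    return [{"document": doc, "company_name": "<extracted value>"}]

def make_fake_clock(step):
    # Deterministic clock that advances by `step` seconds on every call.
    state = {"now": -step}
    def clock():
        state["now"] += step
        return state["now"]
    return clock

documents = ["a.pdf", "b.pdf", "c.pdf"]
rows, remaining = run_batch(
    documents, fake_extract, budget_seconds=25, clock=make_fake_clock(10)
)
print(len(rows), remaining)
```

The `remaining` list corresponds to the documents you would select for the next run when a job stops before covering everything.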

Inputs:

Steps:

Step 1: Fill in the required input parameters in the query builder.

In this example, we've asked "Get the following information from all documents: Company name, Site Name, Date of survey, Report prepared by, Recommendation number, Recommendation title, Recommendation description, Recommendation aim, Recommendation status" of a Knowi Documents datasource.

Step 2: Select Preview to review the output and ensure it meets your needs. If necessary, go back to Step 1 to adjust the inputs and refine the results.

Step 3: Your data might need additional post-processing. This can be done using the drag-and-drop editor in the Preview view, or in the Cloud9QL Transformations section.