Datasets

A Dataset is the central concept behind everything you do in Knowi. At its simplest, a Dataset is the result of a Query, or of a Query with multiple joins.

Datasets allow you to:

  • Reuse Queries

  • Abstract underlying Query execution details (Direct vs. Non-Direct, Insert/Update strategies, etc.)

  • Define Security/Governance/Row-Level Security

  • Create multiple Visualizations/Transformations from the same Dataset

  • Run Natural Language Queries (Search-Based Analytics) on top of the Dataset

  • Define Alerts and scheduled actions (Send Email/Webhook/Slack/Teams)

  • Access data through APIs: Datasets can be queried via APIs or pushed via Webhooks into your API
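
To make the Webhook option concrete, here is a minimal sketch of how a receiving service might parse a Dataset push. The payload shape (a JSON array of rows, or an object with the rows under a `data` key) and the function name `parse_dataset_push` are assumptions made for illustration, not Knowi's documented format; check the actual payload before relying on it.

```python
import json


def parse_dataset_push(body: str) -> list:
    """Parse a webhook request body into a list of row dictionaries.

    Assumption: the body is either a JSON array of rows or a JSON object
    that carries the rows under a "data" key (hypothetical shape).
    """
    payload = json.loads(body)
    if isinstance(payload, dict):
        payload = payload.get("data", [])
    if not isinstance(payload, list):
        raise ValueError("expected a list of rows")
    # Keep only well-formed rows; anything that is not an object is dropped.
    return [row for row in payload if isinstance(row, dict)]


sample = '{"data": [{"region": "EMEA", "sales": 120}, {"region": "APAC", "sales": 95}]}'
rows = parse_dataset_push(sample)
```

Accepting both shapes keeps the receiver tolerant of small differences in how the push is configured.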

A Dataset is created when you save and run one or more queries. The Dataset page lets you manage the following:

  • Overall Summary

  • Dataset Lineage (Query/Dataset/Visualizations flow)

  • Query Execution Details

  • Search-Based Analytics configuration

  • Data Types

  • Query History

  • Usage

Here's a high-level diagram of Knowi that shows how Datasets fit into the overall Knowi architecture:

[Diagram: Datasets in the overall Knowi architecture]

Overview

The Overview tab on the Dataset page provides you with a control room showing the following:

  • Number of visualizations running on the Dataset

  • Number of datasources used by the Dataset

  • Number of users with whom the Dataset has been shared

  • Number of Reports scheduled on the Dataset

  • Number of Trigger notification alerts created on the Dataset

Apart from the above, the Dataset page also provides insights into:

  1. Status of Query

  2. Datasource Name

  3. Number of Columns and Rows

  4. Creator's Name

Dataset Diagram

The Dataset diagram gives a high-level view of the data lineage: it shows the Query, the Dataset, and the Widgets built on the current Dataset.

You can also edit your Query, Dataset, and Widgets from the Dataset diagram. Hover your cursor over the Query, Dataset, or Widget and click Edit to start making modifications.

Data Types

You can modify automatically detected data types by clicking on the Edit button.

Once you have finished modifying the data types, click the Save Changes button.
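
The idea of automatic detection followed by a manual correction can be sketched as below. This is a deliberately naive heuristic written for illustration, not Knowi's detection logic; the type names (`integer`, `number`, `date`, `string`) and the `overrides` parameter are assumptions.

```python
from datetime import datetime


def detect_type(values):
    """Guess a column type from string values (naive, illustrative heuristic)."""
    def all_parse(parser):
        try:
            for value in values:
                parser(value)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "number"
    if all_parse(datetime.fromisoformat):
        return "date"
    return "string"


def column_types(rows, overrides=None):
    """Detect a type per column, letting explicit overrides win."""
    overrides = overrides or {}
    columns = list(rows[0].keys()) if rows else []
    return {
        column: overrides.get(column) or detect_type([row[column] for row in rows])
        for column in columns
    }


rows = [
    {"zip": "10001", "amount": "12.5", "day": "2024-01-05", "region": "EMEA"},
    {"zip": "02139", "amount": "7", "day": "2024-01-06", "region": "APAC"},
]
detected = column_types(rows)
corrected = column_types(rows, overrides={"zip": "string"})
```

The `zip` column shows why a manual edit can be needed: the values look numeric, but treating them as integers would drop the leading zero.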

Search-Based Analytics

You can manage and edit the following Search-Based Analytics settings:

  • Turn off the NLP Indexer

  • Turn off the NLP Slack Indexer

  • Specify one or more synonyms that can be used in NLP for the Dataset

  • Limit the number of records to process for NLP

  • Specify one or more fields that will be skipped in NLP for the Dataset

  • Specify the default NLP date to fetch in NLP processing for the Dataset
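
As a rough mental model of how these settings interact, the sketch below represents them as a small Python structure and resolves a search term to a column. All names here (`NlpSettings`, `resolve_field`, the field names) are hypothetical and only mirror the options listed above; they are not Knowi configuration keys.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class NlpSettings:
    """Hypothetical model of the Search-Based Analytics settings."""
    indexer_enabled: bool = True
    slack_indexer_enabled: bool = True
    synonyms: Dict[str, str] = field(default_factory=dict)  # synonym -> column name
    record_limit: Optional[int] = None
    skipped_fields: Set[str] = field(default_factory=set)
    default_date: Optional[str] = None  # meaning and format assumed


def resolve_field(term: str, columns: List[str], settings: NlpSettings) -> Optional[str]:
    """Map a search term to a column, honoring synonyms and skipped fields."""
    name = settings.synonyms.get(term.lower(), term)
    for column in columns:
        if column.lower() == name.lower() and column not in settings.skipped_fields:
            return column
    return None


settings = NlpSettings(synonyms={"revenue": "sales"}, skipped_fields={"internal_id"})
columns = ["sales", "region", "internal_id"]
match = resolve_field("Revenue", columns, settings)
```

In this sketch a synonym widens what a user can type, while a skipped field is never matched even when it is named exactly.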

Once you have finished modifying the Search-Based Analytics settings, click the Save Changes button.

Perform Actions

You can perform a wide range of actions on your Dataset, including:

  • Creating visualizations

  • Scheduling reports

  • Sharing to individual users or groups

Click the More button to perform additional actions on your Dataset, including:

  • Adding a Trigger Alert

  • Adding a derived Dataset

  • Cloning the Query

Note: If your query is Non-direct, you'll also see:

  • Run the Query (to run the query now)

  • Add Data Update Alert (to generate an alert when a scheduled query run fails)

  • Add Query Error Alert (to generate an alert when reports fail to run, or when a query run or dataset update fails)

Indexes (For Non-Direct Queries)

For non-direct queries with large datasets, indexes on commonly used filter fields can speed up data retrieval.

You can build and edit the indexes by clicking on the Edit button.
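
To illustrate why an index on a commonly filtered field helps, the sketch below compares a full scan with a lookup through a prebuilt index. It is a generic in-memory illustration with hypothetical function names, not a description of how Knowi stores or indexes data.

```python
from collections import defaultdict


def build_index(rows, column):
    """Map each distinct value of `column` to the positions of matching rows."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index


def filter_with_scan(rows, column, value):
    """Without an index: every row is inspected on every filter."""
    return [row for row in rows if row[column] == value]


def filter_with_index(rows, index, value):
    """With an index: jump straight to the matching row positions."""
    return [rows[position] for position in index.get(value, [])]


rows = [
    {"region": "EMEA", "sales": 120},
    {"region": "APAC", "sales": 95},
    {"region": "EMEA", "sales": 60},
]
region_index = build_index(rows, "region")
emea_rows = filter_with_index(rows, region_index, "EMEA")
```

Both functions return the same rows; the index trades a one-time build cost and some extra storage for faster repeated filters on that field.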

Data

You can view your Dataset's data from the Data tab.

Log

Click the Log tab to view entries for all past Query execution events. The entries appear in chronological order, and for each entry you can view the Query status, the time the Query took to execute, and the number of records it processed.

The records in the log help you analyze and optimize how your Queries are executed.
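
As one example of this kind of analysis, the sketch below summarizes a list of log entries that have been copied out of the Log tab by hand. The entry fields (`status`, `seconds`, `records`) and the status values are assumed names chosen for illustration; they are not an export format provided by Knowi.

```python
from statistics import mean


def summarize_runs(entries):
    """Summarize execution entries: run count, failure rate, average time, slowest run."""
    succeeded = [entry for entry in entries if entry["status"] == "success"]
    failed = len(entries) - len(succeeded)
    return {
        "runs": len(entries),
        "failure_rate": failed / len(entries) if entries else 0.0,
        "avg_seconds": mean(entry["seconds"] for entry in succeeded) if succeeded else None,
        "slowest": max(entries, key=lambda entry: entry["seconds"], default=None),
    }


runs = [
    {"status": "success", "seconds": 2.0, "records": 1000},
    {"status": "failed", "seconds": 9.5, "records": 0},
    {"status": "success", "seconds": 4.0, "records": 1200},
]
summary = summarize_runs(runs)
```

A rising failure rate or a slowest run far above the average is usually the first place to look when tuning a Query.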

You can view further details for any Query Execution event by clicking on the caret:

  • Report Code

  • Message

  • Query Execution

  • Saving Result

  • Identifier

  • Agent ID

  • API Key

Query History

Query History gives you a view of all historical changes made to the Query, as well as the ability to revert to a previous version.

You can restore the Query to any of its previous versions by clicking Revert.
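
Conceptually, version history with revert behaves like the small model below. The class `QueryHistory` is hypothetical, and the choice to record a revert as a new version (so that nothing is lost) is an assumption of this sketch; the source does not say how Knowi stores versions internally.

```python
class QueryHistory:
    """Toy model of a query's version history (illustrative only)."""

    def __init__(self, initial_text):
        self.versions = [initial_text]

    def save(self, text):
        """Record a new version and return its version number."""
        self.versions.append(text)
        return len(self.versions) - 1

    @property
    def current(self):
        return self.versions[-1]

    def revert(self, version):
        """Restore an earlier version by appending a copy of it as the newest one."""
        if not 0 <= version < len(self.versions):
            raise IndexError(f"no such version: {version}")
        restored = self.versions[version]
        self.versions.append(restored)
        return restored


history = QueryHistory("select * from orders")
history.save("select id, total from orders")
history.revert(0)
```

After the revert, the current text matches version 0 again, while the intermediate edit remains in the history.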

Usage

You can click on the Usage tab to view all the widgets derived from the Query and which dashboards contain them.
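
The relationship shown on this tab can be thought of as a mapping from each widget to the dashboards that include it. The sketch below builds that mapping from two hypothetical inputs, a list of widget names and a dictionary of dashboard contents; both structures are assumptions for illustration.

```python
def usage_report(widgets, dashboards):
    """Return {widget: [dashboards containing it]} for widgets built on one Query.

    widgets:    names of widgets derived from the Query (hypothetical input)
    dashboards: mapping of dashboard name -> list of widget names it contains
    """
    report = {name: [] for name in widgets}
    for dashboard, members in dashboards.items():
        for name in members:
            if name in report:
                report[name].append(dashboard)
    return report


widgets = ["Sales by Region", "Monthly Trend", "Top Customers"]
dashboards = {
    "Executive Overview": ["Sales by Region", "Monthly Trend"],
    "Regional Deep Dive": ["Sales by Region", "Inventory Levels"],
}
report = usage_report(widgets, dashboards)
```

A widget with an empty list is not placed on any dashboard, which is useful to know before changing or removing the Query.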
