Datasets

In Knowi, a Dataset is the result of a Query, including a Query with multiple joins. Datasets can be previewed, scheduled, and reused. They can also be used to create visualizations and NLP Queries.

Dataset features:

  • Previewing - A Dataset can be previewed before saving and running it.
  • Sorting and Filtering - Datasets can be sorted and filtered as needed. Sorting can be ascending or descending, and filtering can be applied to fields using operations such as equals, not equals, greater than, and less than.
  • Reusing Queries - Datasets allow you to reuse queries by saving and cloning.
  • Abstracting Query execution details - The underlying execution details of the queries are hidden from users.
  • Creating Multiple Visualizations/Transformations - Datasets can be reused to create multiple visualizations and transformations.
  • Scheduling - Datasets can be scheduled for Alerts and Actions (Send Email/Webhook/Slack/Teams).
  • API enabled - Datasets can be queried via APIs or pushed via Webhooks into your API.
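To make the sorting and filtering operations listed above concrete, here is a minimal Python sketch of what they do to a set of rows. It is an illustration only, not Knowi's implementation: the operator names in `OPS`, the helper functions, and the sample rows are all hypothetical.

```python
import operator

# Hypothetical operator names; the labels in the Knowi UI may differ.
OPS = {
    "equals": operator.eq,
    "not equals": operator.ne,
    "greater than": operator.gt,
    "less than": operator.lt,
}

def filter_rows(rows, field, op, value):
    """Keep rows whose `field` satisfies the comparison `op` against `value`."""
    compare = OPS[op]
    return [row for row in rows if compare(row[field], value)]

def sort_rows(rows, field, descending=False):
    """Return rows ordered by `field`, ascending unless `descending` is set."""
    return sorted(rows, key=lambda row: row[field], reverse=descending)

# Made-up sample rows standing in for a Dataset.
rows = [
    {"campaign": "spring", "sent": 1200},
    {"campaign": "summer", "sent": 800},
    {"campaign": "autumn", "sent": 1500},
]

# Filter first, then sort the surviving rows in descending order.
large = filter_rows(rows, "sent", "greater than", 1000)
ranked = sort_rows(large, "sent", descending=True)
print([row["campaign"] for row in ranked])  # prints ['autumn', 'spring']
```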

Creating Datasets

Datasets are created when you create a query on the Queries page. There are multiple ways to create or navigate to a dataset.

From the Queries Page

To create a new Dataset, navigate to Queries in the left side panel.

navigate to queries

On the Queries page, click New Query+.

new query

On the Queries page, choose a datasource, choose a collection, and give your dataset a name. Click Create and Run to run the query, or click Save Dataset to save the dataset without running it.

creating a dataset

You can also open an existing query. On the Queries page, click the edit button beside a Query. This lets you edit the query and save the updated dataset.

edit existing queries

From the Dashboard

To navigate to Datasets from the Dashboard, click the three-dot menu on a widget and select Query.

Accessing query from the dashboard

The following high-level diagram shows how Datasets fit into the overall Knowi architecture:

Knowi architecture diagram

Dataset Management

The Dataset Management page provides an overview and allows users to manage Datasets and Queries. To navigate to the Dataset Management page, click Query Details on the Queries page.

Query Details

The Dataset Management page has the following tabs:

  • Overview
  • Data
  • Log
  • Query History
  • Usage

Dataset Management Page

Overview

The Overview tab on the Dataset Management page shows the following information:

  • Visualizations running on the Dataset
  • Datasources employed on the Dataset
  • Users with whom the Dataset has been shared
  • Reports scheduled on the Dataset
  • Trigger notification alerts created on the Dataset

Overview options

On the left-hand side, the page has more information and options:

  1. Status of Query - Shows the Query status (success, failed, unknown, etc.). Clicking it redirects you to the Query Log.
  2. Datasource Name - Datasource name given at the time of creation
  3. Number of columns and rows - Number of columns and rows in the dataset
  4. Creator's name - User name and email
  5. Last Dashboard - This takes you to the last used Dashboard
  6. Edit Query - This takes you to the queries page
  7. Delete - Delete the query.

Note: The Delete option also asks whether you want to delete the associated widgets.

Additional options

Actions

You can perform the following actions on your Dataset:

  • Creating visualization
  • Scheduling reports
  • Sharing to individual users or groups

Dataset Actions

Click the More... button to perform additional actions on your Dataset, including:

  • Adding derived Dataset (adds a derived dataset)
  • Cloning Query (creates a new copy of the query)
  • Adding Trigger Alert (creates and schedules alerts)

More...

Note: If your query is Non-direct, you'll also see the following options:

  • Run the Query (runs the query now)
  • Add Data Update Alert (generates an alert when a scheduled query run fails)
  • Add Query Error Alert (generates an alert when reports fail to run because of a failed query run or a failed dataset update)

Non-Direct Query Options

Create AI Dashboard

The AI Dashboard feature generates a dashboard with a single click and populates it with widgets based on the dataset. The generated Dashboard is named automatically and auto-saved in your workspace under the "AI" category. It behaves like any other Dashboard, and all the usual settings and functionality apply.

Click here to learn how to create a dashboard using AI.

Note: Knowi uses its own AI service, and no data is sent outside of Knowi.

Dataset Diagram

The dataset diagram gives a high-level view of the dataset and its associated queries, widgets, and drilldowns.

Dataset Diagram

Hover your cursor over a Query, Dataset, or Widget in the diagram to reveal its view, edit, and settings options.

  • Queries - The Queries have an edit option that takes you to the Query page.
  • Datasets - Datasets have the edit and view option. The edit option takes you to the dataset edit page, and the view takes you to the Dataset Management page. Here, you can export the Dataset data.
  • Widgets - The Widgets have the settings option that takes you to the Query Settings page.

Explore Dataset Diagram

Data Types

You can modify automatically detected data types by clicking on the Edit button.

Data Types

Once you have finished modifying the data types, click the Save Changes button.

Data Types - Save Changes

Search Based Analytics

The Search Based Analytics section lets you control the NLP settings for the Dataset. To change them, click Edit.

Search based Analytics

You can manage and edit the Search-Based Analytics settings for the following:

  • NLP Indexer - Enables NLP Indexing for search based analytics. It is turned on by default.
  • NLP Slack Indexer - Enables NLP for Slack.
  • Synonyms - Add/remove NLP synonyms. Multiple keywords can be associated with a field. Instead of a field, you can also specify calculations or Cloud9QL functions. You can add/remove fields using the +/- buttons. For example, if the bounced rate per million is defined as (bounced/sent)*1000000, enter that definition as the field name and enter "bounced rate per million" as the synonym.
  • Data Limit - Limits the number of records to process for NLP. Defaults to 200K when empty. Set to 0 for unlimited.
  • Skip NLP processing - Specify one or more fields that will be skipped in NLP for the Dataset.
  • Exact Match Fields - Specify one or more text fields that must match exactly, including punctuation and spaces.
  • Default NLP Date - Specify the default date field to use in natural language processing for the dataset. A random date field is selected for NLP when left empty.
  • Indexible Fields - Gives you control over setting unique values on String fields. These fields are often used as part of a condition. You can add/remove fields using the +/- buttons.
  • Index values from other Datasets - By default, the values are sampled from the dataset itself if it is not direct. Pointing to another dataset gives you fine-grained control by letting the values be driven from that dataset. The values are taken from the first column of the other dataset or, if it has multiple columns, from the column whose name matches the field to index. You can add/remove fields using the +/- buttons.
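Two of the settings above carry small rules that are easy to misread, so here is a minimal Python sketch of them. It is only an illustration of the rules as stated in this section, not Knowi code: the function names, the `synonyms` dictionary layout, and the sample numbers are hypothetical.

```python
DEFAULT_NLP_LIMIT = 200_000  # "Defaults to 200K when empty"

def effective_data_limit(setting):
    """Resolve the Data Limit field to a record cap; None means unlimited."""
    if setting is None or str(setting).strip() == "":
        return DEFAULT_NLP_LIMIT       # empty field falls back to the default
    limit = int(setting)
    return None if limit == 0 else limit  # 0 means unlimited

def bounce_rate_per_million(bounced, sent):
    """(bounced / sent) * 1000000, the calculation from the Synonyms example."""
    return (bounced / sent) * 1_000_000

# Hypothetical representation of a synonym entry: the calculation plays the
# role of the field name, and the natural-language phrase is its synonym.
synonyms = {"(bounced/sent)*1000000": ["bounced rate per million"]}

print(effective_data_limit(""))            # prints 200000
print(effective_data_limit("0"))           # prints None
print(bounce_rate_per_million(250, 1000))  # prints 250000.0
```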

Once you have finished modifying the Search-Based Analytics settings, click the Save Changes button.

Search Based Analytics - Save Changes

Visualization Templates

This section allows users to set up and manage pre-defined Widgets for this dataset.

Visualization templates

Indexes (For Non-Direct Queries)

For non-direct queries with large datasets, indexes on commonly used filter fields can speed up data retrieval.

You can build and edit the indexes by clicking on the Edit button.

Indexes
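As a rough intuition for why an index on a frequently filtered field helps, the following Python sketch compares a full scan with a lookup through a prebuilt index. It is a generic illustration of the idea, not a description of how Knowi stores or indexes data; the function names and sample rows are hypothetical.

```python
from collections import defaultdict

def scan_filter(rows, field, value):
    """Without an index: inspect every row on every filter."""
    return [row for row in rows if row[field] == value]

def build_index(rows, field):
    """Map each distinct value of `field` to the positions of rows holding it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[field]].append(position)
    return index

def indexed_filter(rows, index, value):
    """With an index: jump straight to the matching rows."""
    return [rows[position] for position in index.get(value, [])]

# Made-up rows; every third order belongs to the "EU" region.
rows = [{"region": "EU" if i % 3 == 0 else "US", "order_id": i} for i in range(9)]
region_index = build_index(rows, "region")

# Both approaches return the same rows; the index avoids rescanning.
assert indexed_filter(rows, region_index, "EU") == scan_filter(rows, "region", "EU")
```

The index costs one pass over the data to build, which pays off when the same field is filtered repeatedly.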

Data

Your Dataset data can be viewed from the Data tab. You can also search using the global search bar to filter records and export the data into a CSV file.

View and Export Dataset
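The search-then-export flow described above can be pictured with a short Python sketch. It is an illustration under assumptions: it treats the search as a case-insensitive substring match across all cells, which may not match the exact behavior of the search bar, and the function names and sample rows are hypothetical.

```python
import csv
import io

def search_rows(rows, term):
    """Keep rows where any cell contains `term`, ignoring case (assumed behavior)."""
    needle = term.lower()
    return [
        row for row in rows
        if any(needle in str(cell).lower() for cell in row.values())
    ]

def to_csv(rows):
    """Serialize rows (a list of dicts with identical keys) to CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up rows standing in for the contents of the Data tab.
rows = [
    {"region": "EU", "total": 10},
    {"region": "US", "total": 20},
]

print(to_csv(search_rows(rows, "eu")))
```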

Log

Click the Log tab to view entries for all the Query execution events you have performed in the past. Entries appear in chronological order, and each entry shows the Query status, the time the Query took to execute, and the number of records it processed.

Dataset Management Log

The records in the log help you perform internal analysis and optimization of how the Queries are executed.
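As one example of the kind of analysis these records support, the Python sketch below summarizes a handful of log entries. The dictionary keys are hypothetical stand-ins for the status, time taken, and record count shown in the Log tab, and the entries themselves are made up.

```python
from statistics import mean

# Made-up entries; the keys are hypothetical names for the Log tab columns.
log_entries = [
    {"status": "success", "seconds": 2.0, "records": 1000},
    {"status": "success", "seconds": 4.0, "records": 3000},
    {"status": "failed", "seconds": 9.0, "records": 0},
]

def summarize(entries):
    """Compute failure rate, average duration, and throughput of successful runs."""
    ok = [e for e in entries if e["status"] == "success"]
    total_seconds = sum(e["seconds"] for e in ok)
    return {
        "runs": len(entries),
        "failure_rate": (len(entries) - len(ok)) / len(entries),
        "avg_seconds_ok": mean(e["seconds"] for e in ok) if ok else None,
        "records_per_second_ok": (
            sum(e["records"] for e in ok) / total_seconds if total_seconds else None
        ),
    }

summary = summarize(log_entries)
print(summary["runs"], summary["avg_seconds_ok"])  # prints 3 3.0
```

A rising average duration or failure rate across runs is a hint that the Query may need tuning.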

Note: You can click the Refresh button to update the entries in the log.

Click the caret on an event to view further details for that Query execution event:

  • Report Code
  • Message
  • Query Execution
  • Saving Result
  • Identifier
  • Agent ID
  • API Key

Log Details

Query History

Query History gives you a view of all historical changes made to the Query, as well as the ability to revert to a previous version.

Query History

You can restore the Query to any previous version by clicking Revert.

Restore Query

Usage

You can click on the Usage tab to view all the widgets derived from the Query and which dashboards contain them.

Dataset