asdfasdfsadf

Queries & Data Exploration

The Query section provides a powerful set of features that helps you build visual insights quickly:

  • Data exploration & query generation capabilities
  • Execute native queries on your database
  • Run Cloud9QL queries on top of the returned result set
  • Ability to join multiple datasources and query against them
  • Ability to save query results into the ElasticStore or run directly against the database in real-time
  • Option to schedule query executions
  • Drag & Drop analytics on the results

Datasource Setup

  1. Click on Datasources on the left toolbar. Choose the datasource you want to work with. By default, Knowi provides live databases for demo purposes to test end to end flow.

image alt text

  1. Connectivity: We provide multiple connectivity options

image alt text

  1. Once the datasource is saved, configure queries by clicking on the link to enter the query page.

image alt text

Building Queries

You have the option to either write the queries directly into the given query box or use the Query Generator to generate queries for you.

Schema on Read and Field Exploration

Depending on the datasource, a list of collections/tables along with field samples (or column names in case of relational databases) will be automatically detected. There are no data models to define or schema definitions required upfront.

For unstructured/semi-structured sources, Knowi uses native APIs for the database to infer fields where supported by the database, or fall back to sampling the data to determine fields.

Query Builder & Writing Queries

Choose Filters, Metrics and dimensions to build your query. The queries are automatically generated on the right as your filters/metrics/dimensions are applied. Filters provide the ability to build a complex query by using logic operators 'And', 'Or'. Also, the 'In' filter condition allows to use several values for each filter. Click on Preview button to preview the data.

image alt text

Instead of using the query generator, queries can directly be created via the Query box on the right.

Cloud9QL can be used to post process/transform the return data, to complement native queries. This is not a replacement to the underlying query, but offers powerful analytics functions on the results returned.

Preview

Use the Preview button to preview the results

image alt text image alt text

To change the visualization settings, click on Modify Visualization Settings on the bottom right

image alt text

Joining Queries

Click on Join Query button (or the icon) to join queries from different datasources.

  1. Choose a datasource to join with
  2. Generate or Write Queries
  3. Specify Join Keys and the type of Join. If there are no join keys applicable if you are blending in data across two datasources, leave it empty. Otherwise, specify the keys to join with. Join Keys must be a field returned in the queries.

Examples:

customer=customer AND joinAfield=joinBField

The default join type is INNER JOIN when a join field is present. Other Join types supported: LEFT OUTER JOIN, FULL OUTER JOIN, RIGHT OUTER JOIN. To specify a different join type, add the join type into the Join Fields. Example: FULL OUTER JOIN customer=customer. See Multi-Datasource Joins for more details.

image alt text

Direct Query vs ElasticStore

Queries can either be "Direct" or the results of a query can be cached. Direct mode will always execute the query directly against the database live when an associated visualization is displayed. The associated load time is directly proportional to the time it takes for the query to execute.

Alternatively, the query results can be stored into the Knowi ElasticStore seamlessly. This is particularly useful in the following cases:

  • Long running queries.
  • Reduce load on your database for reporting workloads by offloading it to the ElasticStore.
  • The results can be used as a parent dataset to other derived queries off this dataset. Useful in cases where the resulting dataset tracked in the ElasticStore runs into the millions of records.
  • In cases with large raw datasets in the underlying database, this can be used to incrementally update the ElasticStore for that dataset without running the entire historical query.

In Warehousing Strategy section, check Direct Query for direct. By default, the results will be stored into the ElasticStore. If ElasticStore is use, specify a schedule for asynchronous query execution.

For more details, see Elastic Store documentation.

image alt text

Once ready, click on Save to just save the query or hit Save & Run Now button to immediately execute the query.

Overwrite Strategy

With ElasticStore usage, Overwrite strategy provides you powerful control on how the data is updated for the dataset when the query is run. The values can be one of the following:

  1. All will replace the existing values in the store for this dataset with new incoming values during query execution.

  2. Empty will leave the existing results as is and append incoming data to it.

  3. One or more field names, comma separated: Will update the existing dataset with new values for the same values of the fields, insert new records when a match is not found and leave existing recrods as is. This is useful for incrementally updating the dataset. For example, let's say the raw data that a query operates on has 100's of millions of events, and say the query is doing a grouping by count, date and event type. In this case, trolling through the entire 100's of millions of records each time the query runs can induce loads into your system. You can structure the query in the to something like the following:

        select count(*) as Totals, date(eventTime) as Date, EventType
        where eventTime >= $c9_yesterday
        group by EventType, date(eventTime)
    

    with Overwrite Strategy of Date, EventType and scheduled every few hours. In this example, it'll query only data from yesterday and the results are then upserted into the dataset.

  4. <date field name>-<retention period>: Retention overwrite strategy for a date type field can be defined by specifying how far back in time we would like to keep our data. The retention period can be defined in the Time Unit format. For example, Date-1m will remove all existing rows in our elastic-store which have Date field's value less than 1 month before the MIN value of the incoming data's Date field. This can also be used in conjunction with our existing field value overwrite (#3) above. For example, to add retention of 3 months to the example in #3 above, we just need to make the Overwrite Strategy to be Date, EventType, Date-3m

Query Listing

Once report is saved, it will be listed in the Queries section.

image alt text

Each query listing comes with several options on the right:

image alt textRun Now: This option runs the query. By default, all the queries are saved in the ElasticStore so every time, this option is chosen, the query is executed against the warehouse. If chosen otherwise, the query is executed real-time against the database.

image alt textEdit: This option lets you edit the queries and save them again.

image alt textData Config: This option tells you about the underlying structure of the query. For example, the figure below gives the following information:

Query Report named TestReport is executed on the dataset named TestReport and is represented by the widget called TestReport image alt text

image alt textAdd a Derived Dataset: This option allows you to create new derived queries and datasets from the data. The original dataset will be the parent and the new one will be derived from it. Applicable for data stored in ElasticStore.

image alt textShare: Lets you share the query report with other user groups or users.

image alt textClone Lets you build a query, using the existing query as a baseline.

image alt textDelete: Deletes a query. You can opt to remove all widgets/visualizations associated with the particular query.

image alt textHistory: This option provides you with the list of all historical changes made for this query. All the changes are tracked and you can easily get back to them in case of necessity.

Notes: Additional revert button is available as soon as any historical item is expanded. You can use this button in order to revert the query state to the one that you selected.

Click on Download Queries button if you want to download the queries.

image alt textCancel (Beta): Cancel a currently running query. The behavior of this action depends on datasource type of the running query.

Widgets/Visualizations/Adding to Dashboards

When a report is created, a visualization is automatically created for it.

To add the visualization into to a dashboard as a widget, go to a dashboard (or create a new one), drag and drop the widget into the dashboard.

image alt text