Snowflake Data Analytics Tutorial

Snowflake Data Analytics

Table of Contents

Introduction

Snowflake is a purpose-built SQL cloud data platform that has grown at a nearly unprecedented rate since launching in 2014; Okta Inc.’s 2020 Businesses @ Work report found that Snowflake was the world’s fastest-growing app. Snowflake’s growth is easy to reconcile given their unmatched flexibility, top-of-the-line security, automatic scaling of storage, and seamless integration with various BI tools. 

Among the BI tools that offer Snowflake integration, only one is fully native to Snowflake with support for nested objects and arrays: Knowi. This allows users to simultaneously analyze their data while reaping the benefits of Snowflake’s scaling and security. If you’d like to learn more about using Knowi to analyze your Snowflake data, this tutorial is for you.  

Creating a Snowflake Datasource

Once you’ve set up your free Knowi trial account and logged in, follow these steps:

1.      Locate and click on “Data sources” on the panel on the left side of your screen. 

2.      Scroll down to “Data Warehouses” and click on Snowflake.

3.      Right now, the default Schema Name is set to TPC-DS which contains data on products, orders, and customers. We’re going to change the schema to TPC-H, which contains data on decision support systems. In order to do this, change Schema Name from TPCDS_SF100TCL to TPCH_SF1 and click “Test Connection.” 

4.      In a few moments, Knowi should tell you that your connection was successful; click “Save” once it does. 

Congratulations on setting up your first Snowflake datasource!

Querying Your Datasource

Now that you’ve created a datasource, you can run queries on your data by following these steps:

1.      As soon as you saved your datasource, you should’ve received a “Datasource Added. Configure Queries.” alert at the top of your page. Click on the word Queries. (You can also just go back to the panel on the left side of your screen, go right below “Data Sources,” click on “Queries,” and select “New Query +” from the top right.)

2.      Name your report inside the “Report Name*” bar on the very top left of your screen. The query that we’re using here will be closely modeled off of the default functional query that is provided in the TPC-H schema, so let’s name this one “Functional Query.” 

3.      This default query schema lists the totals and averages for extended price, discounted extended price, and discount extended price plus tax, as well as total charge and a count of the number of line items, and groups this data by return flag and line status. In order to enter this query, head over to “Snowflake Query” in your Query Builder and enter the following syntax: 

select
       l_returnflag,
       l_linestatus,
       sum(l_quantity) as sum_qty,
       sum(l_extendedprice) as sum_base_price,
       sum(l_extendedprice * (1-l_discount)) as sum_disc_price,
       sum(l_extendedprice * (1-l_discount) * (1+l_tax)) as sum_charge,
       avg(l_quantity) as avg_qty,
       avg(l_extendedprice) as avg_price,
       avg(l_discount) as avg_disc,
       count(*) as count_order
 from
       lineitem
 where
       l_shipdate <= dateadd(day, -90, to_date('1998-12-01'))
 group by
       l_returnflag,
       l_linestatus
 order by
       l_returnflag,
       l_linestatus;

4.      Before you run this query, we want to add one more column that concatenates return flag and line status. This will make it much easier to visualize our data. In order to do this post-processing with Cloud9QL – Knowi’s powerful SQL style language – enter the following syntax into “Cloud9QL Query:” 

select concat(l_returnflag, " - ",  l_linestatus) as Flag - Status, *

5.      Head to the bottom of your screen and click the blue “Preview” button. This should return four rows and eleven columns worth of data. Once you’ve confirmed that it does, click the green “Save & Run Now” button on the bottom right corner of your screen. 

When you ran your query, Knowi automatically saved your results as a virtual dataset and stored those results as a dataset in its elastic data warehouse. Every time you successfully run a query, Knowi does this. 

Analyzing Your Data and Adding Visualizations

As you saw in the data preview, there are 4 different combinations for return flag and line status: A-F, N-F, N-O, and R-F. Let’s say we want to visualize various metrics, such as the total sum of order quantity, grouped by these separate flag – status combinations. Knowi allows us to efficiently visualize this in just a few steps: 

1.      Return to the top of the panel on the left side of your screen and click on “Dashboards.” Click the orange plus icon and name your dashboard “TPC-H Visualizations.” 

2.      Go just below “Dashboards” on your panel and click “Widgets.” Drag your “Functional Query” widget onto your new dashboard. 

3.      Right now, your widget is just a data grid containing the results from our query. We want to keep this data grid, but to add a more compelling visualization that more quickly conveys a message. Head over to the top right corner of your widget, click on the 3 dot icon and click on “Analyze.” 

4.      Drag the “Flag – Status” and “Sum_QTY” bars over from the left side of your screen into the “Fields/Metrics:” box. This shows us the total quantity for each combination. Now, head to the top of your screen and click on “Visualization.” Scroll over to “Visualization Type” in the top left corner of your screen and change this from “Data Grid” to “Donut.” 

5.      This donut chart already rather clearly conveys that roughly one quarter of the total quantity falls into each of R-F and A-F, roughly falls into N-F, and roughly half falls into N-O, but it may still help to add value labels and percentages to the chart. To do this, scroll down to “Options” and click on it, then check the boxes underneath “Display as Percent” and “Label-Value.” 

6. Now, head back to the top right corner of your screen and click the “Clone” icon that looks like two small pieces of paper. When you do this, you’ll be asked to name your cloned widget; name it “Total Quantity by Flag – Status.” 

7. Click “Clone” and then click “Add to Dashboard” to add your new widget to your dashboard.

There is a clear conclusion that comes from this visualization: roughly one quarter of the total quantity falls into each of R-F and A-F, roughly one half falls into N-O, and roughly one percent falls into N-F. 

Using Drilldowns on Your Visualization

Drilldowns add an interactive component to your visualizations that allows you to dive into a filtered section of your data with just one click. Follow this process to add a drilldown: 

1.      Click on the 3 dot icon in the top right corner of the “Total Quantity by Flag – Status” widget that you just created, then scroll down and click on “Drilldowns.” 

2.      Set “Widget” as your drilldown type, set it to drill into “Functional Query” when “SUM_QTY” is clicked. Then add an optional drilldown filter that sets “SUM_QTY” equal to “SUM_QTY” and click “SAVE” and “Close.” 

3.      Test your drilldown by clicking on N-O, the Flag – Status combination with the largest share of the total quantity. This shows you all of the raw data for that combination. To return to your original visualization, go to the top right corner of your widget and click on the left arrow icon in the middle. 

Using Search-Based Analytics to Query Your Data

Now that you’ve set up a dashboard and become familiar with creating visualizations in Knowi, the next step is to start querying your data using search-based analytics. This feature makes your dashboard accessible to all English speakers – even those who aren’t data savvy or familiar with Knowi. Here’s how to query your data with search-based analytics:

1.      Head to the top right corner of your original “Functional Query” widget and click on the 3 dot icon. Scroll down and click on “Analyze.”

2.      Let’s say that you don’t care about the return flag for any of this information; you just want to group your data by line status, and look at the total amount that has been charged for those with a line status of F and O. Just head to the search bar at the top of your screen that currently says “Ask a question of your data” and type “total sum charge by line status.” 

3.      Now, to visualize this data, we’re going to follow the exact same process that we did before. Return to “Visualization” and set the visualization type to “Donut.” As you did before, scroll down and click on “Options,” then check the boxes underneath “Display as Percent” and “Label – Value.” 

4.      Return to the top right and click on the “Clone” icon again. Name this widget “Charge by Status” and add it to your dashboard. 

This visualization clearly conveys that orders with a status of F and orders with a status of O both share roughly 50% of the total charge. 

Summary

To review, we started off by connecting to a sample Snowflake Datasource and ran a functional query on it. The results of this query were stored as a dataset in Knowi’s elastic data warehouse. Afterwards, we asked a question about our data, created a visualization that clearly answered it, and added drilldowns to our visualization that let the user dive deeper into the data behind the visualization. Finally, we asked our data another question in plain English using search-based analytics, and created another visualization which conveyed our answer. 

Image From Ios
Hey, I hope my article was useful.

I’m Patrick, a solutions engineer here at Knowi.

If you’re looking for an analytics or BI solution I’d be happy to hop on a 15 min call and chat about your use-case.

Share This Post

Share on facebook
Share on linkedin
Share on twitter
Share on email
Written by

More To Explore