Head of Architecture – Modernization Practice, Infosys
Chief Operating Officer, Knowi
Cloud adoption is accelerating across the industry, and the concept of everything as a service (XaaS) is becoming a reality. Until a few years ago, security was one of the major blockers for cloud adoption, but that apprehension has eased as cloud platforms, including their security features, have matured. It is now possible to implement security at each layer of the cloud architecture. Adoption started with infrastructure moving to the cloud, and applications followed suit. Now data and analytics are rapidly moving to the cloud as well. This accelerated adoption is largely driven by the need for better availability, speed, and scalability. But with innovation come new types of risk, and those risks, if not managed well, can nullify the advantages of the innovation. With cloud adoption, it becomes extremely important to manage and optimize cloud resource usage, and thereby reduce its cost.
In this blog post, we look specifically at Snowflake, a data warehouse built for the cloud, and at how to increase observability into its usage so that we can monitor, manage, and optimize its cost. Infosys and Knowi have worked together to develop a cloud usage reporting solution that provides this observability layer. So, what is the blueprint of this solution? Let us take a deep dive.
The monitoring platform
The monitoring platform offers role-based access control (RBAC) to manage and monitor Snowflake usage on the cloud. It leverages Snowflake’s native role-based access approach, meaning a user can monitor and manage only the warehouses attached to their role. The monitoring platform also defines three core monitoring profiles:
- Cost performance monitoring profile
- Query performance monitoring profile
- Usage monitoring profile
The cost performance monitoring profile helps answer important questions related to Snowflake usage cost, including:
- What is my annual, monthly, weekly, and daily credit usage and cost trend?
- Which warehouses are burning the greatest number of credits?
- Which are the most expensive queries within the warehouses?
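To make the cost questions above concrete, here is a minimal sketch of the kind of aggregation involved. It is not the platform's implementation. It assumes rows shaped like the output of Snowflake's `WAREHOUSE_METERING_HISTORY` view (start time, warehouse name, credits used); the sample rows, the warehouse names, and the `PRICE_PER_CREDIT` value are all made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (start_time, warehouse_name, credits_used).
# In practice these would come from a metering view; the values are invented.
METERING_ROWS = [
    (datetime(2023, 1, 5, 10), "ETL_WH", 12.0),
    (datetime(2023, 1, 5, 11), "BI_WH", 3.5),
    (datetime(2023, 2, 1, 9), "ETL_WH", 8.0),
    (datetime(2023, 2, 2, 9), "BI_WH", 1.5),
]

# Assumed price; the real figure depends on edition, region, and contract.
PRICE_PER_CREDIT = 3.0


def credits_by_warehouse(rows):
    """Total credits per warehouse, highest consumer first."""
    totals = defaultdict(float)
    for _, warehouse, credits in rows:
        totals[warehouse] += credits
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def monthly_trend(rows, price_per_credit):
    """Map 'YYYY-MM' to (credits, cost), in chronological order."""
    totals = defaultdict(float)
    for start_time, _, credits in rows:
        totals[start_time.strftime("%Y-%m")] += credits
    return {
        month: (credits, credits * price_per_credit)
        for month, credits in sorted(totals.items())
    }


if __name__ == "__main__":
    print(credits_by_warehouse(METERING_ROWS))
    print(monthly_trend(METERING_ROWS, PRICE_PER_CREDIT))
```

The same grouping can be repeated with a daily, weekly, or yearly key to produce the other trend granularities.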
The query performance monitoring profile leverages our Snowflake query optimization experience from various implementations within our customer base. The solution codifies the optimization best practices and provides answers to the following questions:
- Are there any queries that are causing a high transaction blocking time?
- Is there a skew in the warehouse usage?
- Which tables do not have optimized pruning?
- Which tables may benefit from clustering?
- Do I need to change the sequencing of my ETL jobs to reduce warehouse queueing?
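As an illustration of how checks like these might be codified, the sketch below flags queries by blocking time, queueing time, and partition pruning. It assumes per-query records with fields named after columns in Snowflake's `QUERY_HISTORY` view (`transaction_blocked_time` and `queued_overload_time` in milliseconds, `partitions_scanned`, `partitions_total`). The thresholds and sample records are hypothetical and are not the rules the platform actually applies.

```python
# Hypothetical per-query records; field names mirror QUERY_HISTORY columns,
# values are invented for illustration.
SAMPLE_QUERIES = [
    {"query_id": "q1", "transaction_blocked_time": 120_000,
     "queued_overload_time": 0, "partitions_scanned": 10,
     "partitions_total": 1000},
    {"query_id": "q2", "transaction_blocked_time": 0,
     "queued_overload_time": 45_000, "partitions_scanned": 950,
     "partitions_total": 1000},
    {"query_id": "q3", "transaction_blocked_time": 0,
     "queued_overload_time": 0, "partitions_scanned": 0,
     "partitions_total": 0},
]


def scan_ratio(row):
    """Fraction of partitions scanned; None when no partitions are involved."""
    total = row["partitions_total"]
    if total == 0:
        return None
    return row["partitions_scanned"] / total


def flag_queries(rows, max_blocked_ms=60_000, max_queued_ms=30_000,
                 max_scan_ratio=0.8):
    """Return {query_id: [issues]} for queries that exceed any threshold.

    The default thresholds are assumptions chosen for this example.
    """
    findings = {}
    for row in rows:
        issues = []
        if row["transaction_blocked_time"] > max_blocked_ms:
            issues.append("blocked")
        if row["queued_overload_time"] > max_queued_ms:
            issues.append("queued")
        ratio = scan_ratio(row)
        if ratio is not None and ratio > max_scan_ratio:
            issues.append("poor_pruning")
        if issues:
            findings[row["query_id"]] = issues
    return findings


if __name__ == "__main__":
    print(flag_queries(SAMPLE_QUERIES))
```

A query that scans most of a table's partitions is a hint that the table may benefit from clustering, while sustained queueing suggests revisiting job sequencing or warehouse sizing.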
The usage monitoring profile provides a view of how the data is being used in the organization. It provides answers to questions like:
- What type of data is most frequently used?
- Which users/roles are most active?
- How many users do I have for each warehouse/database?
- Are the users running any rogue queries?
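The usage questions above largely reduce to counting over a query log. The sketch below shows one way to do that, assuming a simplified log of (user, role, warehouse) tuples; the log entries, user names, and role names are invented, and a real log would carry many more fields.

```python
from collections import Counter, defaultdict

# Hypothetical query log: one (user, role, warehouse) tuple per query run.
QUERY_LOG = [
    ("alice", "ANALYST", "BI_WH"),
    ("alice", "ANALYST", "BI_WH"),
    ("bob", "ENGINEER", "ETL_WH"),
    ("carol", "ANALYST", "BI_WH"),
    ("alice", "ANALYST", "ETL_WH"),
]


def most_active_users(log, top_n=3):
    """Users ranked by number of queries run."""
    return Counter(user for user, _, _ in log).most_common(top_n)


def most_active_roles(log, top_n=3):
    """Roles ranked by number of queries run."""
    return Counter(role for _, role, _ in log).most_common(top_n)


def users_per_warehouse(log):
    """Number of distinct users seen on each warehouse."""
    users = defaultdict(set)
    for user, _, warehouse in log:
        users[warehouse].add(user)
    return {warehouse: len(names) for warehouse, names in users.items()}


if __name__ == "__main__":
    print(most_active_users(QUERY_LOG))
    print(most_active_roles(QUERY_LOG))
    print(users_per_warehouse(QUERY_LOG))
```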
Visualizations from Cost Performance Monitoring Profile
Summary view of yearly credit usage
Credit Usage view by different dimensions (warehouse, query and year) & Credit Usage Trend (using a sliding window chart)
Summary view of monthly credit usage
Monthly credit usage view by various dimensions
Optimization opportunity view of top expensive queries
The monitoring platform has been designed to provide a bird's-eye view of Snowflake usage across the organization, in order to manage and optimize usage and cost. The platform comes with some ready-to-use views, and it has been designed to accommodate any new views that may be required in the future.