Snowflake Cookbook

By: Hamid Mahmood Qureshi, Hammad Sharif

Overview of this book

Snowflake is a unique cloud-based data warehousing platform built from scratch to perform data management on the cloud. This book introduces you to Snowflake's unique architecture, which places it at the forefront of cloud data warehouses. You'll explore the compute model available with Snowflake, and find out how Snowflake allows extensive scaling through the virtual warehouses. You will then learn how to configure a virtual warehouse for optimizing cost and performance. Moving on, you'll get to grips with the data ecosystem and discover how Snowflake integrates with other technologies for staging and loading data. As you progress through the chapters, you will leverage Snowflake's capabilities to process a series of SQL statements using tasks to build data pipelines and find out how you can create modern data solutions and pipelines designed to provide high performance and scalability. You will also get to grips with creating role hierarchies, adding custom roles, and setting default roles for users before covering advanced topics such as data sharing, cloning, and performance optimization. By the end of this Snowflake book, you will be well-versed in Snowflake's architecture for building modern analytical solutions and understand best practices for solving commonly faced problems using practical recipes.

Weeding out inefficient queries through analysis

In this recipe, we will learn techniques for identifying potentially inefficient queries. Once identified, these queries can be redesigned to run more efficiently.

Getting ready

You will need to be connected to your Snowflake instance via the web UI or the SnowSQL client to execute this recipe.

How to do it…

We will query the QUERY_HISTORY view in the ACCOUNT_USAGE schema of the SNOWFLAKE database to find queries that took a long time to run or scanned a large amount of data. From that result set, we can identify which queries are potentially inefficient. The steps for this recipe are as follows:

  1. We will start by selecting all rows from the QUERY_HISTORY view and ordering them by execution time, longest first:
    USE ROLE ACCOUNTADMIN;
    USE SNOWFLAKE;
    -- EXECUTION_TIME is reported in milliseconds; the slowest queries come first.
    SELECT QUERY_ID, QUERY_TEXT, EXECUTION_TIME, USER_NAME
    FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
    ORDER BY EXECUTION_TIME DESC;

    You...
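
The recipe's introduction also mentions looking for queries that scanned a lot of data. A minimal sketch of that variation follows, assuming the BYTES_SCANNED, START_TIME, and WAREHOUSE_NAME columns of the same QUERY_HISTORY view. The seven-day window and the limit of 20 rows are illustrative choices, not values from the recipe, and should be adjusted to your workload.

```sql
USE ROLE ACCOUNTADMIN;

-- Sketch: the 20 queries from the last 7 days that scanned the most data.
-- The 7-day window and LIMIT 20 are assumptions made for illustration.
SELECT QUERY_ID,
       QUERY_TEXT,
       USER_NAME,
       WAREHOUSE_NAME,
       EXECUTION_TIME / 1000          AS EXECUTION_SECONDS,  -- source column is in milliseconds
       BYTES_SCANNED / POWER(1024, 3) AS GB_SCANNED          -- bytes converted to gibibytes
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE START_TIME >= DATEADD(day, -7, CURRENT_TIMESTAMP())
ORDER BY BYTES_SCANNED DESC NULLS LAST                       -- keep rows without scan data at the end
LIMIT 20;
```

Restricting the time window keeps the result set small and focuses attention on recent workloads. Queries that appear near the top of both this list and the execution-time list are usually the first candidates for redesign.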