- Displays query history for the last 14 days - Users can view other users queries but cannot view their query results

- highly available service residing in the cloud services layer - maintains metadata on object information and statistics for each cache type - sometimes referred to as the metadata store cloud services layer or just services layer

Performance Concepts: Query Optimization Flashcards by Chris Schwarz

History Tab

Displays query history for the last 14 days
Users can view other users queries but cannot view their query results

How well did you know this?

Not at all

Perfectly

Analysis of Query Duration: Compilation Time

Query engine performing operations like cost based optimization and micropartition pruning

How well did you know this?

Not at all

Perfectly

Analysis of Query Duration: Execution Time

Steps to process the data, processing time on the CPU

How well did you know this?

Not at all

Perfectly

Query Analysis: Profile Overview: Processing

The percentage of the total execution time spent on data processing by the CPU of the virtual warehouse

How well did you know this?

Not at all

Perfectly

Query Analysis: Profile Overview: Local Disk I/O

-The percentage of time processing was blocked by local disk access
- In other words, when the virtual warehouse had to read or write data from the local SSD storage

How well did you know this?

Not at all

Perfectly

Query Analysis: Profile Overview: Remote Disk I/O

When the virtual warehouse had to read or write data from the remote storage layer

How well did you know this?

Not at all

Perfectly

Database Order of Execution

Rows->Groups->Result

How well did you know this?

Not at all

Perfectly

Rows statements

FROM
JOIN
WHERE

How well did you know this?

Not at all

Perfectly

Group statements

GROUP BY
HAVING

How well did you know this?

Not at all

Perfectly

Result statements

SELECT
DISTINCT
ORDER BY
LIMIT

How well did you know this?

Not at all

Perfectly

Bytes spilled to local storage

Volume of data spilled to virtual warehouse local disk

How well did you know this?

Not at all

Perfectly

Bytes spilled to remote storage

Volume of data spilled to remote disk.

How well did you know this?

Not at all

Perfectly

Caching Layers

Services Layer
Warehouse Layer
Storage Layer

How well did you know this?

Not at all

Perfectly

Services Layer Cache Categories

Metadata Cache
Results Cache

How well did you know this?

Not at all

Perfectly

Metadata Cache

highly available service residing in the cloud services layer
maintains metadata on object information and statistics for each cache type
sometimes referred to as the metadata store cloud services layer or just services layer

How well did you know this?

Not at all

Perfectly

What does the metadata cache store?

Study These Flashcards

-maintains understanding of what tables create, what databases those are in, what are the micro partitions that make up those tables, clustering information and more
- enables queries to be executed without the need of a virtual warehouse
- For example, the row count of each table is stored in a cache allowing for executing something such as a select count without using the virtual warehouse
- Functions, context functions, describe commands, and show commands make use of metadata

Results Cache

Study These Flashcards

Also referred to as Result 24 hour
Each time a persistent result is reused, the 24 hour retention period is extended up to a maximum of 31, after which the result is purged

To reuse a result, ‘Result Cache’ Rule

Study These Flashcards

New query exactly matches previous query
The underlying table data has not changed
The same role is used as the previous query
If time context functions are used, such as CURRENT_TIME(), the result cache will not be used
Result reuse is disabled using the session parameter USE_CACHED_RESULT

Warehouse Cache

Study These Flashcards

Virtual warehouses have local SSD storage which maintains raw table data used for processing a query
The larger the virtual warehouse the greater the local cache
It is purged when the virtual warehouse is resized, suspended, or dropped
Can be used partially, retrieving the rest of the data required for a query from remote storage

Materialized View

Study These Flashcards

a pre-computed and persisted data set derived from a SELECT query
are updated via a background process ensuring data is current and consistent with the base table
improve query performance by making complex queries that are commonly executed, readily available
an enterprise edition and above serverless feature
use compute resources to perform automatic background maintenance
use storage to store query results, adding to the monthly storage usage for an account

Materialized Views can be created on top of ____________ _____________ to improve their query performance.

Study These Flashcards

external tables

Materialized Views are limited in the following ways:

Study These Flashcards

Can only query a single table
Cannot use JOIN
Cannot use UDF, HAVING, ORDER BY, LIMIT, WINDOW FUNCTIONS

Clustering

Study These Flashcards

a way describing the distribution of a column’s values

Automatic Clustering

Study These Flashcards

Snowflake supports specifying one or more table columns/expressions as a clustering key for a table
Clustering aims to co-locate data of the clustering key in the same micro-partitions
Clustering improves performance of queries that frequently filter or sort on the clustered keys
Clustering should be reserved for large tables in the multi-terabyte range

When choosing a clustering key, Snowflake recommends a maximum of __ or __ columns (or expressions) per key.

3, 4

A __________ cardinality column would be something like a unique ID or a timestamp down to a millisecond.

high

A ________ cardinality would be something like country or gender.

Low

When choosing a clustering key, a cardinality that is too _____ will negatively impact pruning because the values are too unique to be grouped in micro-partitions.

high

If multiple columns are chosen to be the clustering key, it is suggested that the ______ cardinality is selected first followed by the ________ cardinality.

lowest, highest

Clustering Key Code Examples

As _____ operations are performed on a clustered table, the data in the table might become less clustered.

DML

________________ is a background process which transparently reorganizes data in the micro-partitions by the clustering key.

reclustering

Initial clustering and subsequent reclustering operations consume __________ and ________ credits.

clustering, reclustering, compute, storage

Clustering is recommended for large tables which do not frequently _________ and are frequently ___________.

change, queried

__________ ____________ ___________ is a table level property aimed at improving the performance of selective point lookup queries.

Search optimization service

Performance Concepts: Query Optimization Flashcards

(35 cards)