D4.1 - OUTLINE BEST PRACTICES FOR SNOWFLAKE PERFORMANCE MANAGEMENT ON STORAGE Flashcards
1
Q
What is micro-partitioning?
A
- The physical tables that comprise Snowflakes logical tables
- Snowflake automatically partitions the data so that users do not need to define a a partition schema
- Micro partitions are immutable and each micro-partition obtains between 50-500 MB on uncompressed data
- Snowflake stores metadata about all rows stored in a micro-partition
2
Q
How does micro-partitioning enable clustering?
A
- By creating a metadata unique to the clusters created by micro-partitioning
3
Q
What metadata is saved for clustered data?
A
- The total number of micro-partitions that comprise the table.
- The number of micro-partitions containing values that overlap with each other (in a specified subset of table columns).
- The depth of the overlapping micro-partitions.
- The unique values contained within partitions
4
Q
What is a clustering key?
A
- A clustering key is a subset of columns in a table (or expressions on a table) that are explicitly designated to co-locate the data in the table in the same micro-partitions.
- This is useful for very large tables where the ordering was not ideal (at the time the data was inserted/loaded) or extensive DML has caused the table’s natural clustering to degrade.
5
Q
What is a materialized view?
A
- A materialized view if a pre-computed data set derived from a query specification (the SELECT in the view definition) and stored for later use
- because the data is pre-computed querying the view is faster, which can be significant when a query is being run frequently` or is sufficiently complex.
- Materialized views can speed up expansive aggregation, projection, and selection operations on large datasets
6
Q
What is the benefit and cost of a materialized view?
A
- Materialized views are designed to improve query performance for workloads composed of common, repeated query patterns.
- However, materializing intermediate results incur additional costs. As such, before creating a materialized view, you should consider whether the costs are offset by the savings from re-using these results frequently enough.
7
Q
What is the ‘Search Optimization Service’?
A
- The search optimization service aims to significantly improve the performance of selective point lookup queries on tables
- To improve performance for point lookups, the search optimization service relies on a persistent data structure that serves as an optimized search access path