Performance Management Flashcards

Question

What (quantifiable) factors can impact query queueing?

Answer 1

- QUEUED_LOAD
- QUEUED_RESUMING
- QUEUED_REPAIR

Answer 2

INFORMATION_SCHEMA.QUERY_HISTORY view

Answer 3

Reflects the current load on a warehouse: whether there is capacity for a new query or whether the warehouse is overloaded. (This is the most common condition.)

Answer 4

Details whether or not the warehouse is currently resuming, say from auto-suspension or a restart. A query will execute once the warehouse is available.

Answer 5

A faulty server within your warehouse is being replaced by a healthy one. This is very rare and typically only happens in the event of hardware failure.

Answer 6

WAREHOUSE_LOAD_HISTORY() function

Answer 7

When Snowflake cannot fit an operation into memory (for example, the tables/data being used to compute a query result), the data spills to local disk. If local disk also becomes full, the data spills to remote storage. Each layer further downstream is slower, so the more the data spills, the slower the query becomes.
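
The cascade from memory to local disk to remote storage can be pictured with a minimal sketch. This is a toy model only: the function name `place_working_set` and the capacities are hypothetical, and it does not reflect how Snowflake actually manages memory.

```python
def place_working_set(size_bytes, memory_bytes, local_disk_bytes):
    """Toy model: fill memory first, then local disk, then remote storage."""
    in_memory = min(size_bytes, memory_bytes)
    remaining = size_bytes - in_memory
    on_local_disk = min(remaining, local_disk_bytes)
    on_remote = remaining - on_local_disk
    return {"memory": in_memory, "local_disk": on_local_disk, "remote": on_remote}


# Hypothetical sizes: the working set exceeds memory and local disk combined,
# so part of it ends up in the slowest tier (remote storage).
placement = place_working_set(size_bytes=500, memory_bytes=100, local_disk_bytes=300)
print(placement)
```

Any non-zero value in the `local_disk` or `remote` entries corresponds to spilling, and the `remote` portion is the most expensive.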

Answer 8

Descending by speed:
- Memory
- Local disk (SSD)
- Remote storage (S3/Blob storage)

Answer 9

Profile tab in the web interface: look for the "Spilling" section, which reports the number of bytes spilled to local and remote storage.

Answer 10

In general, the SQL query optimizer chooses the correct order for joining tables. In the rare cases where it does not, it may be advisable to specify or correct the order in which the query joins tables, because a suboptimal join order can hurt performance.

Answer 11

Specifying the join order in the query itself won't change anything, because the query optimizer decides the order in the backend regardless of how the query is written. To force a particular order, you must materialize intermediate results in several temp tables and join them in that order.

Answer 12

This is when unnecessary partitions are being scanned to compute a query, resulting in slower execution times.

Answer 13

Improve clustering (clustering keys, etc.), and pruning will improve with it.
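
To make the link between clustering and pruning concrete, here is a minimal sketch that assumes each partition records the minimum and maximum value of the filtered column. The `Partition` class, the `partitions_to_scan` helper, and the sample ranges are all hypothetical illustrations, not Snowflake internals.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    min_value: int
    max_value: int


def partitions_to_scan(partitions, lo, hi):
    """Keep only partitions whose [min, max] range overlaps the filter range [lo, hi]."""
    return [p.name for p in partitions if p.max_value >= lo and p.min_value <= hi]


# Well clustered: each partition covers a narrow, non-overlapping range.
well_clustered = [Partition("p1", 1, 10), Partition("p2", 11, 20), Partition("p3", 21, 30)]
# Poorly clustered: every partition covers almost the whole value range.
poorly_clustered = [Partition("p1", 1, 30), Partition("p2", 2, 29), Partition("p3", 1, 28)]

# Same filter (values between 12 and 15) against both layouts.
scanned_good = partitions_to_scan(well_clustered, 12, 15)
scanned_bad = partitions_to_scan(poorly_clustered, 12, 15)
print(scanned_good, scanned_bad)
```

With the well-clustered layout only one partition overlaps the filter, while the poorly clustered layout forces every partition to be scanned.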

Answer 14

Joins can duplicate data unexpectedly, so a query may produce ("explode" into) far more rows than anticipated. This can greatly increase the compute and storage resources, and the time, needed to complete the query.
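
A small sketch of how this happens when the join key is not unique on either side. The `inner_join` helper and the sample rows are hypothetical and exist only to show the row count multiplying.

```python
def inner_join(left, right, key):
    """Naive inner join: emit one combined row per matching (left, right) pair."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


# Hypothetical data: customer 1 has two orders and three addresses.
orders = [
    {"customer_id": 1, "order_id": "A"},
    {"customer_id": 1, "order_id": "B"},
]
addresses = [
    {"customer_id": 1, "city": "Oslo"},
    {"customer_id": 1, "city": "Bergen"},
    {"customer_id": 1, "city": "Tromso"},
]

joined = inner_join(orders, addresses, "customer_id")
print(len(orders), len(addresses), len(joined))
```

Each of the two orders matches each of the three addresses, so the result has more rows than either input.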

Answer 15

Double-check that your join conditions are correct. If the joins are genuinely necessary as written, there may be no way to avoid this.

Answer 16

The Query Profile tool provides execution details for a query as well as a graphical representation of the main components of the plan to process a query along with statistics for each component. This tool helps identify common query performance issues mentioned in prior cards.

Answer 17

A query is divided into steps, with each step containing an Operator Tree. An Operator Tree contains the main operator nodes and the relationships between them that make up a step within a query.

Answer 18

Because Snowflake makes it simple to create and replicate warehouses, it is best to segment query workloads by use case. This allows for optimal caching, clustering, and concurrency. For example, the marketing department should have its own warehouse separate from finance, and each data science team should have its own warehouse specific to its objective or subject matter.

(42 cards)