Snowflake SnowPro Core: Certification Exam Flashcards
Search optimization is a Database-level property applied to all the tables within the database with supported data types. (True/False)
- TRUE
- FALSE
FALSE
Search optimization is a table-level property and applies to all columns with supported data types. The search optimization service aims to significantly improve the performance of selective point lookup queries on tables. A point lookup query returns only one or a small number of distinct rows. A user can register one or more tables to the search optimization service.
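As a sketch, enabling search optimization at the table level looks like this (the table name is hypothetical):
ALTER TABLE employee ADD SEARCH OPTIMIZATION;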
Which of these system-defined roles can manage operations at the organization level?
- USERADMIN
- SYSADMIN
- ACCOUNTADMIN
- ORGADMIN
- SECURITYADMIN
ORGADMIN
The ORGADMIN role manages operations at the organization level. More specifically, this role:
- Can create accounts in the organization.
- Can view all accounts in the organization (using SHOW ORGANIZATION ACCOUNTS) and all regions enabled for the organization (using SHOW REGIONS).
- Can view usage information across the organization.
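For illustration, an organization administrator might run (assuming the ORGADMIN role has been granted to the user):
USE ROLE ORGADMIN;
SHOW ORGANIZATION ACCOUNTS;
SHOW REGIONS;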
If you recreate a pipe using the CREATE OR REPLACE PIPE command, what happens to the load history?
- The pipe can not be recreated
- Snowflake still keeps load history
- The load history gets reset to empty
- The recreated Pipe still has tracks of the files loaded by the old Pipe
The load history gets reset to empty
When you recreate a pipe with CREATE OR REPLACE PIPE, the load history is reset to empty, so Snowflake no longer knows which files have already been loaded.
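A minimal sketch of recreating a pipe, with hypothetical pipe, table, and stage names:
CREATE OR REPLACE PIPE my_pipe AS
  COPY INTO my_table FROM @my_stage FILE_FORMAT = (TYPE = 'CSV');
-- the recreated pipe starts with an empty load history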
Which command will list the pipes for which you have access privileges?
- LIST PIPES();
- SHOW PIPES;
- SHOW PIPES();
- DESCRIBE PIPES;
- LIST PIPES;
SHOW PIPES
The SHOW PIPES command lists the pipes for which you have access privileges. This command can list the pipes for a specified database or schema (or the current database/schema for the session), or your entire account.
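For example (database and schema names are hypothetical):
SHOW PIPES;                                    -- current database/schema
SHOW PIPES IN DATABASE my_db;
SHOW PIPES LIKE 'sales%' IN SCHEMA my_db.my_schema;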
The user access history can be found by querying the
- ACCOUNT_USAGE.ACCESS_HISTORY view
- INFORMATION_SCHEMA.ACCESS_REPORT view
- INFORMATION_SCHEMA.ACCESS_HISTORY view
- ACCOUNT_USAGE.ACCESS_REPORT view
snowflake.ACCOUNT_USAGE.ACCESS_HISTORY view
Access History in Snowflake refers to when the user query reads column data and when the SQL statement performs a data write operation, such as INSERT, UPDATE, and DELETE, along with variations of the COPY command, from the source data object to the target data object. The user access history can be found by querying the snowflake.ACCOUNT_USAGE.ACCESS_HISTORY view.
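A sketch of such a query (the selected columns follow the documented view definition; adjust as needed):
SELECT query_id, user_name, query_start_time, direct_objects_accessed
FROM snowflake.ACCOUNT_USAGE.ACCESS_HISTORY
ORDER BY query_start_time DESC
LIMIT 10;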
Which system functions can help find the overlap depth of a table’s micro-partitions? (Select 2)
- SYSTEM$CLUSTERING_WEIGHT
- SYSTEM$CLUSTERING_INFO
- SYSTEM$CLUSTERING_DEPTH
- SYSTEM$CLUSTERING_ALL
- SYSTEM$CLUSTERING_INFORMATION
SYSTEM$CLUSTERING_INFORMATION
SYSTEM$CLUSTERING_DEPTH
For example, if you have an EMPLOYEE table, you can run either of these queries to find the depth:
SELECT SYSTEM$CLUSTERING_INFORMATION('EMPLOYEE');
SELECT SYSTEM$CLUSTERING_DEPTH('EMPLOYEE');
What would you create (UDF or Stored procedure) if you need a function that can be called as part of a SQL statement and must return a value that will be used in the statement?
- Stored Procedure
- UDF
A UDF evaluates to a value and can be used in contexts in which a general expression can be used (e.g. SELECT my_function() …).
A stored procedure does not evaluate to a value and cannot be used in all contexts in which a general expression can be used. For example, you cannot execute SELECT my_stored_procedure() ….
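To illustrate the difference, here is a minimal SQL UDF sketch (the function name and body are hypothetical):
CREATE OR REPLACE FUNCTION area_of_circle(radius FLOAT)
  RETURNS FLOAT
  AS 'PI() * radius * radius';
SELECT area_of_circle(2.0);          -- a UDF works inside a SELECT
-- a stored procedure would instead be invoked with CALL my_stored_procedure();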
The VALIDATION_MODE parameter does not support COPY statements that transform data during a load. (True / False)
- TRUE
- FALSE
TRUE
VALIDATION_MODE instructs the COPY command to validate the data files instead of loading them into the specified table; i.e., the COPY command tests the files for errors but does not load them.
The command validates the data to be loaded and returns results based on the validation option specified:
Syntax: VALIDATION_MODE = RETURN_n_ROWS | RETURN_ERRORS | RETURN_ALL_ERRORS
- RETURN_n_ROWS (e.g. RETURN_10_ROWS) - Validates the specified number of rows, if no errors are encountered; otherwise, fails at the first error encountered in the rows.
- RETURN_ERRORS - Returns all errors (parsing, conversion, etc.) across all files specified in the COPY statement.
- RETURN_ALL_ERRORS - Returns all errors across all files specified in the COPY statement, including files with errors that were partially loaded during an earlier load because the ON_ERROR copy option was set to CONTINUE during the load.
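A hedged example of validating staged files without loading them (stage and table names are hypothetical):
COPY INTO my_table
FROM @my_stage/data/
FILE_FORMAT = (TYPE = 'CSV')
VALIDATION_MODE = RETURN_ERRORS;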
Which primary tool loads data to Snowflake from a local file system?
- Snowflake UI
- External Stage
- SnowSQL
- ETL tools
SnowSQL is the primary tool used to load data to Snowflake from a local file system. You can run it in either interactive shell or batch mode.
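A minimal SnowSQL sketch, assuming a hypothetical local file and target table:
PUT file:///tmp/data.csv @%my_table;   -- upload to the table stage
COPY INTO my_table FROM @%my_table FILE_FORMAT = (TYPE = 'CSV');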
Which objects does the Kafka connector create for each topic?
- An internal stage to temporarily store data files for each topic
- A pipe to ingest the data files for each topic partition
- All of these
- A table for each topic, if the table specified for the topic does not exist
All of these
The connector creates the following objects for each topic:
- One internal stage to temporarily store data files for each topic.
- One pipe to ingest the data files for each topic partition.
- One table for each topic. If the table specified for a topic does not exist, the connector creates it; otherwise, the connector creates the RECORD_CONTENT and RECORD_METADATA columns in the existing table and verifies that the other columns are nullable (and produces an error if they are not).
How can we turn off the query result cache? The query result cache can be turned off by:
- Setting the parameter USE_QUERY_CACHED to FALSE
- Setting the parameter USE_CACHED_INFO to FALSE
- Setting the parameter USE_CACHED_RESULT to FALSE
Setting the parameter USE_CACHED_RESULT to FALSE
We can turn off the query result cache by setting the parameter USE_CACHED_RESULT to FALSE, though the only reason we would really want to do this is for performance testing.
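For example, at the session or account level:
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
ALTER ACCOUNT SET USE_CACHED_RESULT = FALSE;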
Which of these are kinds of cache in Snowflake?
- Metadata Cache
- Query Result Cache
- All of these
- Data/Local Disk Cache
All of these
Snowflake has three types of cache.
- The Metadata Cache that lives in the cloud services layer.
- The Data/Local Disk Cache that lives on the SSD drives in the virtual warehouses, and
- The Query Result Cache. If a result is small, it will be stored in the cloud services layer, but larger results are going to be stored in the storage layer.
Which options are available for data transformation while loading data into a table using the COPY command? (Select all that apply)
- Truncation of Text Strings
- Join
- Column reordering
- Column omission
- Casts
Snowflake supports transforming data while loading it into a table using the COPY command. Options include:
- Column reordering
- Column omission
- Casts
- Truncating text strings that exceed the target column length
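A sketch combining these options in one COPY statement (stage, table, and column names are hypothetical):
COPY INTO my_table (id, name_short)
FROM (
  SELECT t.$2::NUMBER,        -- cast, and reorder relative to the file
         SUBSTR(t.$1, 1, 20)  -- truncate a long text string
  FROM @my_stage t            -- omitted file columns ($3, $4, ...) are simply not selected
)
FILE_FORMAT = (TYPE = 'CSV');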
What are the supported file formats for data unloading in Snowflake?
- Avro
- ORC
- JSON
- Parquet
- XML
JSON and Parquet for semi-structured data; delimited files (CSV, TSV, etc.) for structured data. Avro, ORC, and XML are supported for loading only, not for unloading.
A user can be assigned multiple roles. (True / False)
- FALSE
- TRUE
TRUE
Roles are the entities to which privileges on securable objects can be granted and revoked. Roles are assigned to users to allow them to perform actions required for business functions in their organization. A user can be assigned multiple roles. It allows users to switch roles (i.e., choose which role is active in the current Snowflake session) to perform different actions using separate sets of privileges.
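For instance (role and user names are hypothetical):
GRANT ROLE analyst TO USER monica;
GRANT ROLE developer TO USER monica;
USE ROLE analyst;   -- switches the active role in the current session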
Which copy options can help load a file whose metadata has expired (i.e., the LAST_MODIFIED date is older than 64 days, the initial set of data was loaded into the table more than 64 days ago, and, if the file was already loaded into the table, that also occurred more than 64 days ago)? (Select 2)
- LOAD_FILES = TRUE
- LOAD_CERTAIN_FILES = TRUE
- FORCE = FALSE
- LOAD_UNCERTAIN_FILES = TRUE
- FORCE = TRUE
- ON_ERROR = CONTINUE
FORCE = TRUE
LOAD_UNCERTAIN_FILES = TRUE
To load files whose metadata has expired, set the LOAD_UNCERTAIN_FILES copy option to TRUE. This copy option references load metadata, if available, to avoid data duplication, but also attempts to load files with expired load metadata. Alternatively, set the FORCE option to load all files, ignoring load metadata if it exists. Note that this option reloads files, potentially duplicating data in a table.
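Hedged sketches of both options (table and stage names are hypothetical):
COPY INTO my_table FROM @my_stage
  FILE_FORMAT = (TYPE = 'CSV')
  LOAD_UNCERTAIN_FILES = TRUE;  -- consults load metadata where it still exists

COPY INTO my_table FROM @my_stage
  FILE_FORMAT = (TYPE = 'CSV')
  FORCE = TRUE;                 -- ignores load metadata entirely; may duplicate rows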
Which object parameter can users with the ACCOUNTADMIN role use to set the default retention period for their account?
- DATA_RETENTION_TIME_IN_HOURS
- DATA_RETENTION_TIME_IN_DAYS
- DATA_RETENTION_IN_TIME_TRAVEL
- DATA_RETENTION_TIME_MAX
Users with the ACCOUNTADMIN role can use the DATA_RETENTION_TIME_IN_DAYS object parameter to set the default retention period for their account.
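For example:
ALTER ACCOUNT SET DATA_RETENTION_TIME_IN_DAYS = 90;  -- requires the ACCOUNTADMIN role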
Snowflake automatically and transparently maintains materialized views. (True/False)
- TRUE
- FALSE
TRUE
Snowflake automatically and transparently maintains materialized views. A background service updates the materialized view after changes to the base table. This is more efficient and less error-prone than manually maintaining the equivalent of a materialized view at the application level.
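A minimal sketch (table, view, and column names are hypothetical):
CREATE MATERIALIZED VIEW mv_daily_totals AS
  SELECT order_date, SUM(amount) AS total
  FROM orders
  GROUP BY order_date;
-- Snowflake keeps mv_daily_totals current as orders changes; no manual refresh is needed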
Which of these SQL functions does Snowflake support? (Select all that apply)
- Table
- Scalar
- Window
- System
- User-Defined
- Aggregate
Snowflake Supports all these SQL functions.
- Table
- Scalar
- Window
- System
- User-Defined
- Aggregate
Monica ran a SELECT query on a large table t1. The query took longer than expected. She looked into the query profile and found that ‘Bytes spilled to local storage’ and ‘Bytes spilled to remote storage’ are very high. What advice will you give her to improve the query performance? (Select 3)
- Processing data in larger batches
- Increasing the number of parallel queries running in the warehouse
- Trying to split the processing into several steps
- Using a larger warehouse (effectively increasing the available memory/local disk space for the operation)
- Processing data in smaller batches
- Trying to split the processing into several steps
- Using a larger warehouse (effectively increasing the available memory/local disk space for the operation)
- Processing data in smaller batches
When a Snowflake warehouse cannot fit an operation in memory, it starts spilling (storing) data, first to the local disk of a warehouse node and then to remote storage. Because spilling means extra IO operations, any query that requires spilling will take longer than a similar query running on similar data whose operations fit in memory. If the local disk is insufficient to hold the spilled data, Snowflake further tries to write to remote cloud storage, which will be shown in the query profile as “Bytes spilled to remote storage”.
The spilling can’t always be avoided, especially for large batches of data, but it can be decreased by:
- Reducing the amount of data processed. For example, by trying to improve partition pruning or projecting only the columns that are needed in the output.
- Decreasing the number of parallel queries running in the warehouse.
- Trying to split the processing into several steps (for example, by replacing the CTEs with temporary tables).
- Using a larger warehouse - effectively means more memory and more local disk space.
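For instance, the split-into-steps and larger-warehouse advice might look like this sketch (warehouse, table, and column names are hypothetical):
CREATE TEMPORARY TABLE filtered_t1 AS      -- materialize an intermediate step instead of a CTE
  SELECT id, amount FROM t1 WHERE region = 'EMEA';
SELECT id, SUM(amount) FROM filtered_t1 GROUP BY id;
ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE';  -- more memory and local disk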
Which locations does Snowflake support for staging data? (Select all that apply)
- Amazon S3
- Oracle Cloud Storage
- Google Cloud Storage
- Microsoft Azure Blob Storage
- Snowflake Internal - Stages
Snowflake supports loading data from files staged in any of the following locations, regardless of the cloud platform for your Snowflake account (Oracle Cloud Storage is not supported):
- Internal (i.e. Snowflake) stages
- Amazon S3
- Google Cloud Storage
- Microsoft Azure blob storage
An account-level resource monitor overrides the resource monitor assignment for individual warehouses. (True/False)
- TRUE
- FALSE
FALSE
An account-level resource monitor does not override resource monitor assignments for individual warehouses. If either the account resource monitor or the warehouse resource monitor reaches its defined threshold and a suspend action has been defined, the warehouse is suspended.
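A sketch of both levels of assignment (monitor and warehouse names are hypothetical):
CREATE RESOURCE MONITOR account_rm WITH CREDIT_QUOTA = 1000
  TRIGGERS ON 100 PERCENT DO SUSPEND;
ALTER ACCOUNT SET RESOURCE_MONITOR = account_rm;

CREATE RESOURCE MONITOR wh_rm WITH CREDIT_QUOTA = 100
  TRIGGERS ON 100 PERCENT DO SUSPEND;
ALTER WAREHOUSE my_wh SET RESOURCE_MONITOR = wh_rm;  -- either monitor can suspend the warehouse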
The major benefits of defining Clustering Keys: (Select 2)
- To help in organizing small tables (<1 GB)
- To help optimize table maintenance
- To help in faster data sharing
- To help improve query performance
- To help optimize table maintenance
- To help improve query performance
Defining clustering keys for very large tables (in the multi-terabyte range) helps optimize table maintenance and query performance. Small tables are not a good candidate for clustering.
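For example (table and column names are hypothetical):
ALTER TABLE events CLUSTER BY (event_date, customer_id);  -- choose columns commonly used in filters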
Which of these are not supported by the Search Optimization Service? (Select all that apply)
- External Tables
- Casts on table columns
- Columns defined with a COLLATE clause
- Column Concatenation
- Analytical Expressions
- Materialized Views
- External Tables
- Casts on table columns
- Columns defined with a COLLATE clause
- Column Concatenation
- Analytical Expressions
- Materialized Views
None of these are currently supported by the Search Optimization Service. Additionally, tables and views protected by row access policies cannot be used with the Search Optimization Service.