2025 Data Unloading Flashcards
What is the two step process for unloading data from Snowflake
- Unload to a stage using a COPY INTO statement
- Move from there to where it needs to go. For an internal stage, a GET command can be used. For an external stage, use the cloud provider options
How is encryption handled when unloading files
Data is encrypted while unloading to a staging area and decrypts it when downloading to a user’s local system
What encoding is used for unloaded files
UTF-8 irrespective of file format
Where can file formats be specified when unloading data
Table definition
Names stage definition
In COPY command
What is the default compression algorithm used in a COPY statement
gzip
What is an option for string conversion from SQL NULL in a COPY statement
NULL_IF = ( ‘s1’, ‘s2’…)
Will convert to first value in the mentioned list
On a COPY statement, what is EMPTY_FIELD_AS_NULL
You can convert empty string to NULL values. When set to FALSE, it can be used to unload empty strings without enclosing quotes
On a COPY statement, what does OVERWRITE= TRUE do?
Overwrites files with the same name
How do you signal you want to unload your data into a single file on a COPY statement?
SINGLE = TRUE
What is the default size for files when unloading data?
16 MB to maximize parallel processing. This can be changed by setting MAX_FILE_SIZE
When unloading data, what is the default prefix for the filenames?
data_
Then naming convention is data_<file_number>_<random_uuid>_<partition_id>.gz</partition_id></random_uuid></file_number>
When setting MAX_FILE_SIZE, what is the max option
5GB
Why would you use PARTITION BY <expression> in a COPY statement</expression>
Partitioning unloaded data into a directory structure in cloud storage can increase the efficiency for third-party tool consumption.