2025 Data Unloading Flashcards

1
Q

What is the two step process for unloading data from Snowflake

A
  1. Unload to a stage using a COPY INTO statement
  2. Move from there to where it needs to go. For an internal stage, a GET command can be used. For an external stage, use the cloud provider options
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How is encryption handled when unloading files

A

Data is encrypted while unloading to a staging area and decrypts it when downloading to a user’s local system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What encoding is used for unloaded files

A

UTF-8 irrespective of file format

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Where can file formats be specified when unloading data

A

Table definition
Names stage definition
In COPY command

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is the default compression algorithm used in a COPY statement

A

gzip

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is an option for string conversion from SQL NULL in a COPY statement

A

NULL_IF = ( ‘s1’, ‘s2’…)
Will convert to first value in the mentioned list

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

On a COPY statement, what is EMPTY_FIELD_AS_NULL

A

You can convert empty string to NULL values. When set to FALSE, it can be used to unload empty strings without enclosing quotes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

On a COPY statement, what does OVERWRITE= TRUE do?

A

Overwrites files with the same name

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How do you signal you want to unload your data into a single file on a COPY statement?

A

SINGLE = TRUE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the default size for files when unloading data?

A

16 MB to maximize parallel processing. This can be changed by setting MAX_FILE_SIZE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

When unloading data, what is the default prefix for the filenames?

A

data_
Then naming convention is data_<file_number>_<random_uuid>_<partition_id>.gz</partition_id></random_uuid></file_number>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

When setting MAX_FILE_SIZE, what is the max option

A

5GB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Why would you use PARTITION BY <expression> in a COPY statement</expression>

A

Partitioning unloaded data into a directory structure in cloud storage can increase the efficiency for third-party tool consumption.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly