D3.1 - OUTLINE DIFFERENT COMMANDS USED TO LOAD DATA AND WHEN THEY SHOULD BE USED Flashcards

1
Q

COPY INTO

A
  • COPY INTO <table> loads data from staged files into an existing table.
  • COPY INTO <location> unloads data from a table (or query) into one or more files in one of the following locations:
  • Named internal stage (files can then be downloaded using the GET command)
  • Named external stage (Amazon S3, Google Cloud Storage, or Azure)
  • External location (a cloud storage URI)
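As a sketch (the table, stage, and path names below are hypothetical), the two directions of COPY INTO look like:

```sql
-- Load staged files into a table (COPY INTO <table>)
COPY INTO my_table
  FROM @my_internal_stage
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

-- Unload a query result into files on a stage (COPY INTO <location>)
COPY INTO @my_internal_stage/unload/
  FROM (SELECT * FROM my_table)
  FILE_FORMAT = (TYPE = CSV);
```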
2
Q

INSERT

A
  • Updates a table by inserting one or more rows into the table.
  • The values inserted into each column can be explicitly specified or the results of a query.
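Both forms can be sketched as follows (table and column names are hypothetical):

```sql
-- Insert explicitly specified values
INSERT INTO my_table (id, name)
  VALUES (1, 'alpha'), (2, 'beta');

-- Insert the results of a query
INSERT INTO my_table (id, name)
  SELECT id, name FROM staging_table;
```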
3
Q

PUT

A
  • Uploads data files from a local directory/folder on a client machine to one of the following Snowflake internal stages:
  • Named internal stage
  • Internal stage for a specified table
  • Internal stage for the current user
  • Once files are staged, the data in the files can be loaded into a table using the COPY INTO <table> command
  • PUT does not support uploading files to external stages. To upload files to external stages, use the utilities provided by the cloud service
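A sketch of the three internal-stage targets (file path, stage, and table names are hypothetical; PUT is run from a client such as SnowSQL, not from the web UI):

```sql
-- Upload a local file to a named internal stage
PUT file:///tmp/data/events.csv @my_internal_stage;

-- Upload to the table stage for my_table
PUT file:///tmp/data/events.csv @%my_table;

-- Upload to the current user's stage
PUT file:///tmp/data/events.csv @~;
```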
4
Q

GET

A
  • Downloads data files from a Snowflake stage to a local directory/folder on a client machine
  • It is typically executed after a COPY INTO <location> command has unloaded data to the stage
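A minimal sketch (stage path and local directory are hypothetical; like PUT, GET runs from a client such as SnowSQL):

```sql
-- Download unloaded files from a named internal stage to a local directory
GET @my_internal_stage/unload/ file:///tmp/downloads/;
```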
5
Q

VALIDATE

A
  • Validates the files loaded in a past execution of the COPY INTO command and returns all the errors encountered during the load, rather than just the first error
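VALIDATE is invoked as a table function; a sketch against a hypothetical table, using the special `_last` job ID for the most recent COPY INTO execution:

```sql
-- Return all load errors from the most recent COPY INTO on my_table
SELECT * FROM TABLE(VALIDATE(my_table, JOB_ID => '_last'));
```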
6
Q

What is Snowpipe?

A
  • Snowpipe loads data from files as soon as they are available in a stage.
  • The data is loaded according to the COPY statement defined in a referenced pipe.
  • A pipe is a named, first-class Snowflake object that contains a COPY statement used by Snowpipe.
  • The COPY statement identifies the source location for the data files (i.e. a stage) and a target table.
  • All data types are supported, including semi-structured data types such as JSON and Avro.
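A sketch of a pipe wrapping its COPY statement (pipe, table, and stage names are hypothetical):

```sql
-- The pipe stores the COPY statement that Snowpipe executes for new files
CREATE PIPE my_pipe AS
  COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (TYPE = JSON);
```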
7
Q

When using Snowpipe, what mechanisms are available for detecting staged files?

A
  • Automating Snowpipe using cloud messaging
  • Calling Snowpipe REST endpoints
8
Q

Snowpipe: Automated Cloud Messaging

A
  • Automated data loads leverage event notifications for cloud storage to inform Snowpipe of the arrival of new data files to load.
  • Snowpipe copies the files into a queue, from which they are loaded into the target table in a continuous, serverless fashion based on parameters defined in a specified pipe object
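A sketch of an auto-ingest pipe (names are hypothetical; this assumes an external stage whose cloud storage location has event notifications configured):

```sql
-- AUTO_INGEST = TRUE tells Snowpipe to load files announced
-- by cloud storage event notifications
CREATE PIPE my_auto_pipe
  AUTO_INGEST = TRUE
  AS
  COPY INTO my_table
  FROM @my_external_stage
  FILE_FORMAT = (TYPE = JSON);
```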
9
Q

Snowpipe: Calling Snowpipe REST endpoints

A
  • Your client application calls a public REST endpoint with the name of a pipe object and a list of data filenames.
  • If new data files matching the list are discovered in the stage referenced by the pipe object, they are queued for loading.
  • Snowflake-provided compute resources load data from the queue into a Snowflake table based on parameters defined in the pipe.
11
Q

What are the primary differences between Snowpipe and a bulk data load workflow using the COPY command?

A
  • Authentication
  • Load History
  • Transactions
  • Compute Resources
  • Cost
12
Q

Snowpipe vs Bulk Data Load: Authentication

A
  • Bulk Data Load: relies on the security options supported by the client for authenticating and initiating a user session
  • Snowpipe: calling the REST endpoints requires key pair authentication with a JSON Web Token (JWT). JWTs are signed using a public/private key pair with RSA encryption
13
Q

Snowpipe vs Bulk Data Load: Load History

A
  • Bulk Data Load: stored in the metadata of the target table for 64 days. Available upon completion of the COPY statement as the statement output
  • Snowpipe: stored in the metadata of the pipe for 14 days. Must be requested from Snowflake via a REST endpoint, SQL table function, or ACCOUNT_USAGE view.
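Both histories can be queried in SQL; a sketch with hypothetical table and pipe names:

```sql
-- Bulk load history for a table over the last hour, from table metadata
SELECT * FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'MY_TABLE',
  START_TIME => DATEADD(HOUR, -1, CURRENT_TIMESTAMP())));

-- Snowpipe load history via the ACCOUNT_USAGE view
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.COPY_HISTORY
WHERE pipe_name = 'MY_PIPE';
```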
14
Q

Snowpipe vs Bulk Data Load: Transactions

A
  • Bulk Data Load: loads are always performed in a single transaction. Data is inserted into a table alongside any other SQL statements submitted manually by users.
  • Snowpipe: loads are combined or split into a single transaction or multiple transactions based on the number and size of the rows in each data file. Rows of partially loaded files (based on the ON_ERROR copy option setting) can also be combined or split into one or more transactions.
15
Q

Snowpipe vs Bulk Data Load: Compute Resources

A
  • Bulk Data Load: requires a user-specified warehouse to execute COPY statements
  • Snowpipe: Uses Snowflake-supplied compute resources
16
Q

Snowpipe vs Bulk Data Load: Cost

A
  • Bulk Data Load: billed for the amount of time each virtual warehouse is active
  • Snowpipe: billed according to the compute resources used in the Snowpipe warehouse while loading the files