2025 Continuous Data Pipelines Flashcards

1
Q

In general, what are the three ways to load data?

A

Batch
Micro-batch
Near real time

2
Q

Which Snowflake features enable a continuous data pipeline?

A

Continuous data loading
Change data tracking
Recurring Tasks

3
Q

On which objects can streams be created to query change data?

A

Tables
Views, including Secure Views
Directory Tables
External Tables

4
Q

What additional columns appear when querying a stream?

A

METADATA$ACTION (INSERT or DELETE)
METADATA$ISUPDATE (TRUE or FALSE)
METADATA$ROW_ID

5
Q

What advances the offset value in a stream?

A

When the stream is used in a DML transaction that commits. Querying the stream with a SELECT alone does not advance the offset.
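
As a rough illustration of the offset behaviour, here is a toy Python model. It is not Snowflake code: the `ToyStream` class and its method names are hypothetical, and the change log is just a list.

```python
class ToyStream:
    """Hypothetical toy model of a stream offset (not a Snowflake API)."""

    def __init__(self):
        # Position in the source table's change log already consumed.
        self.offset = 0

    def select(self, change_log):
        # Reading the stream returns pending changes but leaves the offset alone.
        return change_log[self.offset:]

    def consume_in_dml(self, change_log):
        # Using the stream in a committed DML transaction returns the pending
        # changes and moves the offset to the end of the change log.
        pending = change_log[self.offset:]
        self.offset = len(change_log)
        return pending


change_log = ["insert row 1", "insert row 2"]
stream = ToyStream()
first_read = stream.select(change_log)       # offset unchanged
consumed = stream.consume_in_dml(change_log)  # offset advances
after_consume = stream.select(change_log)     # nothing left to read
```

In this sketch, repeated calls to `select` keep returning the same rows until `consume_in_dml` runs.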

6
Q

When a record is updated, how is it reflected in the stream?

A

Two records for the change:
One where METADATA$ACTION is DELETE and METADATA$ISUPDATE is TRUE
One where METADATA$ACTION is INSERT and METADATA$ISUPDATE is TRUE
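
The pair of records can be sketched in Python. This is only a model of the shape of the output, not a query against Snowflake; the function name and the sample row are made up for illustration.

```python
def stream_rows_for_update(old_row, new_row):
    """Model how one UPDATE shows up in a standard stream: a DELETE/INSERT pair."""
    return [
        # The old version of the row, flagged as a delete caused by an update.
        {**old_row, "METADATA$ACTION": "DELETE", "METADATA$ISUPDATE": True},
        # The new version of the row, flagged as an insert caused by an update.
        {**new_row, "METADATA$ACTION": "INSERT", "METADATA$ISUPDATE": True},
    ]


# Hypothetical sample row: the name column changes from "old" to "new".
rows = stream_rows_for_update({"id": 1, "name": "old"}, {"id": 1, "name": "new"})
for row in rows:
    print(row)
```

A plain insert or delete would instead produce a single record with METADATA$ISUPDATE set to FALSE.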

7
Q

Where does change tracking need to be enabled to put a stream on a view?

A

The view and the underlying tables

8
Q

What are the three types of streams?

A

Standard
Append-only (records inserts only; updates and deletes are not recorded)
Insert-only (external tables only; records inserts only)

9
Q

What does it mean if a stream is stale?

A

The offset of the stream is outside the data retention period of the source table. The historical change data can no longer be accessed, so you need to create a new stream.

10
Q

If a stream is not consumed, up to how many days (by default) does Snowflake extend the table's data retention period to cover the stream offset?

A

14

11
Q

If both parameters DATA_RETENTION_TIME_IN_DAYS and MAX_DATA_EXTENSION_TIME_IN_DAYS are defined, which is used?

A

Whichever value is greater, so the stream gets the longer retention period
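
A one-line Python sketch of that rule, as a simplified model rather than Snowflake's implementation. The function name is hypothetical, and the default of 14 assumes MAX_DATA_EXTENSION_TIME_IN_DAYS has not been changed.

```python
def effective_stream_retention_days(data_retention_days, max_extension_days=14):
    """Simplified model: an unconsumed stream is covered for the greater value."""
    return max(data_retention_days, max_extension_days)


short_table = effective_stream_retention_days(1)        # extension wins
long_table = effective_stream_retention_days(30)        # table retention wins
no_extension = effective_stream_retention_days(7, 0)    # extension disabled
```

Here a table with 1 day of retention is still covered for 14 days, while a table with 30 days of retention keeps its own longer period.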

12
Q

Streams on which Snowflake objects do not have a retention period?

A

Directory tables
External tables

13
Q

How long after a query runs can you use RESULT_SCAN on its results?

A

24 hours

14
Q

What is the default state when a task is created?

A

Suspended

15
Q

What are the two types of tasks?

A

Serverless (Snowflake manages the compute)
User-managed (runs on a warehouse you specify)

16
Q

How do you make a task serverless?

A

Omit the WAREHOUSE parameter when creating the task

17
Q

What function indicates whether a stream has data?

A

SYSTEM$STREAM_HAS_DATA