SnowFlake University - Snowflake Intro Flashcards

1
Q

Databases with tables made up of rows and columns are called

A

RDBMS’s

Relational
Database
Management
Systems

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Tables are situated within

A

Schemas

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are schemas?

A

Schemas are logical and organizational buckets/objects; a schema can have multiple tables and views.

Schemas are listed in parentheses after the table name.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Common RDBMS objects:

A

Tables

Views

Schemas

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are the 3 parts of Snowflake Worksheet?

A
  1. Navigational Tree Area (half of it turns to a properties panel when a table is chosen)
  2. SQL Entry Pane
  3. Result/Preview Pane
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Warehouse provide…

A

the processing power to execute a command

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

To set a worksheet’s context using code, type

A

USE WAREHOUSE [warehouse name];

into the SQL Pane

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Turning on/off the Code Highlight shortcut is….

and what does Code Highlight do?

A

ctrl + shift + k

It highlights the statement you click on, and only executes it.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Shortcut to running the whole SQL code is

A

ctrl + shift + Enter

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Which editions of Snowflake has multi-cluster available?

A

Snowflake’s Enterprise Edition product (or above)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Sometimes a data file has values that contain quotes, apostrophes and/or commas.
These can cause problems when ingesting data. For that reason, some files will be…

A

formated so that fields that have string values are enclosed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

When creating a file format, where can you indicate what encloses some of the values that contain quotes, apostrophes and/or commas?

A

The option is called:

Field optionally enclosed by…

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

When creating a file format, and some of the values contain quotes, apostrophes and/or commas: what are the options to enclosing them?

A
  1. None
  2. Double Quote
  3. Single Quote

Snowflake doesn’t support other control (enclosing) character, so you must deal with them differently.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is an ELT Process

A

Extract, Load, Transformation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is considered part of the final step in the ELT Process?

A

Updating/changing a table( i.e. Transforming it)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is an ETL Process

A

Extract, Transform, and Load

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What are the names given to the two tables in an ETL process?

A
1st table (Extracted): Source
2nd table (Loaded): Target
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What command to use in order LOAD a table?

A

INSERT INTO

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Cleaning the data up where it already sits without moving it is called:

A

UPDATE in PLACE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What command to use in order EXTRACT a table?

A

FROM

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What command to use in order TRANSFORM a table?

A

REPLACE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What does SQL stand for?

A

Structured Query Language

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What SQL version Snowflake uses?

A

Standard SQL: ANSI 1999 and SQL: 2003 extensions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What does DML stand for and what does it allow us to do?

A

Data Management Language

- let users retrieve and edit data in databases

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What does DDL stand for and what does it allow us to do?

A

Data Definition Language

- let users creat and edi database objects such as tables, indexes, and uses

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Typical clauses for DML

A

SELECT, UPDATE, INSERT, DELETE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Typical clauses for DDL

A

CREATE, ALTER, DROP

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

What are the three layers of Snowflake architecture?

A

Cloud Services
Query Processing
Database Storage

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

What does Snowflake utilize to offer on-demand computing power?

A

MPP (Massively Parallel Processing) compute clusters where each node in the cluster store a portion of the entire data locally

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

Billing is based on

A

Credits (servers/cluster) consumed per second

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

Data loading efficiency does/doesn’t scale w/ WH size, while query efficiency does/doesn’t?

A

doesn’t

does

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

3 Layers in which data exist:

A
  1. Organizing/removing Data (DDL)
  2. Storing/working with Data (DML)
  3. Querying Data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

What is a subquery?

A

A query within another query - could be used in statements such as SELECT, FROM, AND WHERE

34
Q

Types of subqueries?

A
  1. Correlated
  2. Uncorrelated
  3. Scalar
  4. Non-Scalar
35
Q

What does correlated subquery mean?

A

Refers to one or more columns from the outer query

36
Q

What does uncorrelated subquery mean?

A

Has no reference to the outer query

37
Q

What does scalar subquery mean?

A

returns a single value, it is not a correlated subquery, because it can be executed independently

38
Q

What does non-scalar subquery mean?

A

Return 0 or more values (multiple rows and columns)

39
Q

In the warehouse analogy what are the following called and what are they called in Snowflake?
Workers
Goods

A

Workers - Servers - Compute

Goods - Data - Storage

40
Q

What problems are caused by separating data into data marts?

A

Getting integrated data (from another department or data mart) is inefficient

41
Q

What is the temporary solution used to solve data marts issue?

A

Keeping copies of subsets of data in the various data mart– but that leads to numbers that does not match across departments

42
Q

Snowflake section off and silos data and thus there is a need to replicate the data?

A

NO

43
Q

How is Snowflake warehouse not the same as data marts?

A

Every wawrhouse has access to all you data, all the time

44
Q

How many clusters of servers are there in a Snowflake warehouse?

A

1

45
Q

When changing warehouse sizes from small to medium you are…

A

adding servers, not clusters

46
Q

Scaling up/down means

A

Changing the size of a warehouse

47
Q

Scaling out is sometimes called

A

Elastic data warehousing

48
Q

Why is calling out sometimes called elastic data warehousing?

A

because the warehouse will stretch out to take extra work and then snap back to its original size once the work is done

49
Q

A warehouse can be stretched to up to

A

10 clusters during peak workloads

50
Q

Elastic warehouses are also called

A

Multi-cluster warehouses

51
Q

What are the three things warehouse power is used for:

A

loading data
uploading data
querying data

52
Q

What does the warehouse not do:

A

contain or store data

53
Q

What is a Stage or Staging Area?

A

A place to put things temporarily before moving them to a more stable location

54
Q

What are stages in Old School Data Warehouses?

A

A middle stop bw transactions (OLTP) and Reporting (OLAP)

55
Q

What are stages in Snowflake?

A

Cloud folders or directories where you place files so that Snowflake services can pick them up and pull them up in to a database.

56
Q

What is FTP?

A

File Transfer Protocol

57
Q

What are the two types of stages in Snowflake?

A

Internal stages

external stages

58
Q

What are inside stages in Snowflake?

A

they act like directories inside a Snowflake account’s local storage

59
Q

What are external stages in Snowflake?

A

Act like a secure gateway between cloud storage services and Snowflake services

60
Q

What are the big three cloud services?

A

Amazon S3
Google Cloud Platform (GCP)
Microsoft Azure BLOB Storage

61
Q

What are the three needed components of an external stage?

A
  1. Cloud storage location
  2. Cloud storage access credentials (ex. IAM User & Policy for S3)
  3. Stage Definition (a stage object that contains references to those two things)
62
Q

What does pre-stage data movement:

A

where data is coming from sources external to Snowflake (or cloud service)

63
Q

What are the 4 applications provided by Amazon for pre-stage data movement?

A
  1. FTP tool called Amazon Transfer for SFTP
  2. Command-line Interface: Amazon CLI
  3. Website Interface
  4. Could also load data pragmatically (ex. Python)
64
Q

T/F: Snowflake tracks whether a file has been loaded and doesn’t let you load it twice by accident.

A

True

65
Q

What code to use if one wants to force a file to reload, which was already loaded?

A

FORCE = TURE
at the end of the COPY INTO statement

But this will result in having double the number of rows

66
Q

How to query a data that doesn’t have column name?

A

We select them using a dollar sign and their sequence number

67
Q

List data type trend levels from smaller to larger data quantity supported + examples

A

Structured Data - Tables
Semi-Structured Data
Quasi-Structured Data - Clickstream
Unstructured Data - Images, PDFs, Videos

68
Q

Entity Relationship Diagrams (ERD)

A

They show relationship between different entities/tables, such as “authors” and “books” (logical infrastructure)

69
Q

What are the following match to in a database:
Entity
Attribute
Value

A

Table
Column
Rows

70
Q

Normalizing the data means (3 things):

A
  1. Different entities in separate tables
  2. Same info not repeated unnecessarily
  3. Unique IDs for each row
71
Q

Creating a sequence to

A

act as a counter that give unique ID to each record

72
Q

How to create a sequence

A

Through Wizard Web UI:

Go to Databases and click on Sequences in ribbon

73
Q

What code line would show you the next value in a sequence?

A

SELECT sequence_name.nextval;

Every time you refer to the Sequence’s Next Value in a select statement, it will increment. And thus, it is recommended that you reset the counter before you use the value in a table

74
Q

the 5 semi-structured data supported by Snowflake:

A
  1. JSON - JavaScript Object Notation
  2. XML
  3. Parquet
  4. Avro
  5. ORC
75
Q

Name the data type used when setting up a table column that will be home to semi-structured data?

A

Variant

76
Q

What is an Object Model?

A

It is the nesting pattern of any dataset

77
Q

In JASON, attributes are called:

A

Keys

78
Q

Key-Value Pair in JASON syntax is

A

A term used when referring to a key and its value

79
Q

In JASON, every entity is surrounded by…

A

curly braces

80
Q

The key and the value in a key-value pair are separated by…

A

a colon :

81
Q

Between each key-value pairs there is a…

A

a comma

82
Q

JASON syntax to retrieve a nested value

A

main_table:nested_table[sequence number of key].key

Sequence of keys start by 0, aka the first key in a nested table has the value of 0