D3.5 - EXPLAIN HOW TO WORK AND LOAD SEMI-STRUCTURED DATA Flashcards
1
Q
What are the supported file formats, data types and sizes in Snowflake?
A
- Semi-Structured Data:
- JSON
- ORC
- AVRO
- XML
- Parquet
2
Q
What is the VARIANT column?
A
- Has a 16MB limit on each individual row
- Useful when you are unsure of what data type to assign semi-structured data
- Data only uses native str and integer types
3
Q
What are the steps for loading semi-structured data into relational tables?
A
- Load semi-structured data into a single VARIANT column
- Extract and transform columns from semi-structured data into separate columns in target tables(s)
- Detect and retrieve the column definitions from staged and semi-structured data files. Create Snowflake tables, external tables, or views from the column definitions. To save time, create tables with the columns definitions automatically retrieved from the staged file
4
Q
What is the FLATTEN function?
A
- The FLATTEN function explodes nested values into separate columns.
- You can use the function to filter query results in a WHERE clause
5
Q
How to use FLATTEN when working with unfamiliar semi-structured data?
A
- You can use the FLTTEN function with the RECURSIVE argument to return the list of distinct key names nested in all nested elements in an object