Data Storage Flashcards

1
Q

Awesome Company wants to create a folder structure in their landing zone that will simplify permission maintenance as much as possible. Which of the following formats accomplishes that?
A) \YY\MM\DD\DataSource\Landing
B) \Landing\DataSource\YY\MM\DD
C) \Landing
D) \YY\MM\DD

A

B) \Landing\DataSource\YY\MM\DD

Putting the zone at the top level and the data sources directly below it allows permissions to be set the fewest times with the greatest impact: a permission granted on a data source folder covers every date folder beneath it.
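As a rough illustration of option B, the sketch below builds a landing path for one data source and one ingestion date. The function name `landing_path` and the sample source name are hypothetical and chosen only for this example.

```python
from datetime import date


def landing_path(data_source: str, day: date) -> str:
    """Build a \\Landing\\<DataSource>\\YY\\MM\\DD path for one ingestion date."""
    parts = [
        "Landing",        # zone first: one place to grant zone-wide access
        data_source,      # then the source: one place to grant per-source access
        f"{day:%y}",      # two-digit year
        f"{day:%m}",      # two-digit month
        f"{day:%d}",      # two-digit day
    ]
    return "\\" + "\\".join(parts)


if __name__ == "__main__":
    # "Sales" is a made-up data source name.
    print(landing_path("Sales", date(2024, 3, 7)))
```

Because the date folders sit at the bottom, new folders created each day inherit whatever was granted on the data source folder above them.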

2
Q

Which Azure Storage redundancy option creates synchronous copies across 3 availability zones in the primary region?

A

ZRS

Zone-redundant storage (ZRS) replicates data synchronously across three Azure availability zones in the primary region.

3
Q

What Databricks feature allows you to skip files within a partition based on query filters?

A

Dynamic File Pruning

Dynamic file pruning uses data skipping and Z-Ordering to skip files within a partition based on query filters.
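The data skipping idea behind this can be sketched with per-file minimum and maximum values: a file is read only if its value range can overlap the filter. This is a simplified illustration, not how Databricks implements it; `FileStats` and `files_to_read` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileStats:
    """Hypothetical per-file statistics for one column."""
    path: str
    min_value: int
    max_value: int


def files_to_read(files: List[FileStats], lo: int, hi: int) -> List[str]:
    """Return only the files whose [min, max] range overlaps the filter [lo, hi]."""
    return [f.path for f in files if f.max_value >= lo and f.min_value <= hi]


if __name__ == "__main__":
    stats = [
        FileStats("part-0", 1, 100),
        FileStats("part-1", 101, 200),
        FileStats("part-2", 201, 300),
    ]
    # A filter such as "value BETWEEN 150 AND 180" only needs part-1.
    print(files_to_read(stats, 150, 180))
```

Clustering related values into the same files, which is what Z-Ordering aims for, keeps these ranges narrow so that more files can be skipped.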

4
Q

Which of the following file types will allow you to store nested data in a columnar format?

A) Parquet
B) Avro
C) JSON
D) ORC

A

A) Parquet

Parquet supports nested data structures and uses columnar storage, where the values of each column are stored together.

5
Q

What is the storage solution for unstructured and semi-structured files that is most often used in Azure analytics solutions?

A

Azure Data Lake Storage Gen2

Azure Data Lake Storage Gen2 builds on Azure Blob Storage and adds a hierarchical namespace, which makes it well suited to storing unstructured and semi-structured files for analytics workloads.

6
Q

Which Azure Storage access tier is intended for long-term backup and regulatory compliance?

A

Archive

The Archive access tier is for rarely accessed data that will be stored for at least 180 days.

7
Q

Which sharding strategy utilizes a map to direct queries to the appropriate shard?

A

The Lookup Strategy

The lookup strategy uses a map to route each request to the appropriate shard based on the shard key.
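A minimal sketch of the lookup idea, assuming an in-memory dictionary as the map. The tenant keys and shard names are invented for illustration, and a real system would typically keep the map in a shared store.

```python
# Hypothetical shard map: shard key -> shard name.
SHARD_MAP = {
    "tenant-a": "shard-1",
    "tenant-b": "shard-1",
    "tenant-c": "shard-2",
}


def shard_for(shard_key: str) -> str:
    """Look up which shard holds the data for the given shard key."""
    try:
        return SHARD_MAP[shard_key]
    except KeyError:
        raise KeyError(f"no shard registered for key {shard_key!r}") from None


if __name__ == "__main__":
    print(shard_for("tenant-c"))
```

Since routing goes through the map, a key can be moved to another shard by updating its map entry rather than changing the routing logic.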

8
Q

What type of compression allows you to reduce the size of rowstore objects by looking for patterns in the data and making replacements with smaller values?

A

Page compression.

Page compression looks for patterns in the data and makes replacements with smaller values. This is especially useful for duplicate data.
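The replacement idea can be illustrated with a toy dictionary encoder that stores each distinct value once and replaces every occurrence with a small integer. This is only a sketch of the general principle, not the database engine's actual page compression algorithm; the function names are hypothetical.

```python
from typing import List, Tuple


def dictionary_compress(values: List[str]) -> Tuple[List[str], List[int]]:
    """Store each distinct value once and replace occurrences with its index."""
    dictionary: List[str] = []
    index = {}
    encoded: List[int] = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        encoded.append(index[value])
    return dictionary, encoded


def dictionary_decompress(dictionary: List[str], encoded: List[int]) -> List[str]:
    """Rebuild the original values from the dictionary and the index list."""
    return [dictionary[i] for i in encoded]


if __name__ == "__main__":
    cities = ["Seattle", "Seattle", "Redmond", "Seattle"]
    dictionary, encoded = dictionary_compress(cities)
    print(dictionary, encoded)
```

The more duplicates a column holds, the more long values are replaced by short references, which is why duplicate-heavy data benefits most.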

9
Q

Which partitioning method allows you to improve isolation and data access performance by identifying a bounded context for a distinct business area?

A

Functional Partitioning

Functional partitioning is based on identifying bounded contexts and allows you to use these for improving isolation and performance.

10
Q

In Azure Synapse Analytics, how many underlying distributions are automatically assigned?

A

60

A dedicated SQL pool in Azure Synapse Analytics automatically creates and manages 60 distributions and spreads your data across them.
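To picture how rows could land in one of the 60 distributions under hash distribution, the sketch below maps a distribution key to a bucket number. The hash function used here (SHA-256) is an assumption made for illustration and is not the function Synapse uses internally; `distribution_for` is a hypothetical name.

```python
import hashlib

DISTRIBUTION_COUNT = 60  # fixed number of distributions stated on this card


def distribution_for(key: str) -> int:
    """Map a distribution key to a bucket in the range 0..59 (illustrative hash only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTION_COUNT


if __name__ == "__main__":
    for customer_id in ("C001", "C002", "C003"):
        print(customer_id, distribution_for(customer_id))
```

The same key always maps to the same distribution, so rows that share a key value are stored together.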
