examtopics_1 Flashcards
C. [ManagerEmployeeKey] [int] NULL
B. an error (dbo schema error)
The answer should be D –> A –> C.
Step 1:
Create an empty table SalesFact_Work with same schema as SalesFact.
Step 2:
Switch the partition (to be removed) from SalesFact to SalesFact_Work. The syntax is:
ALTER TABLE <source table> SWITCH PARTITION <partition number> TO <destination table>
Step 3:
Delete the SalesFact_Work table.
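A minimal T-SQL sketch of the three steps, assuming a hypothetical partition column OrderDateKey, hash distribution on CustomerKey, and that partition 1 is the one being removed (none of these names come from the question):

-- Step 1: empty work table with the same schema, distribution and partition boundaries as SalesFact
CREATE TABLE dbo.SalesFact_Work
WITH
(
    DISTRIBUTION = HASH (CustomerKey),                                    -- assumed distribution column
    PARTITION (OrderDateKey RANGE RIGHT FOR VALUES (20200101, 20210101))  -- assumed boundaries, must match SalesFact
)
AS SELECT * FROM dbo.SalesFact WHERE 1 = 2;                               -- copies the schema only, loads no rows

-- Step 2: switch the old partition into the work table (a metadata-only operation)
ALTER TABLE dbo.SalesFact SWITCH PARTITION 1 TO dbo.SalesFact_Work PARTITION 1;

-- Step 3: drop the work table, discarding the switched-out data
DROP TABLE dbo.SalesFact_Work;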
B. File1.csv and File4.csv only
1: Parquet - column-oriented binary file format
2: AVRO - Row-based format that supports the logical type timestamp
D. /{SubjectArea}/{DataSource}/{YYYY}/{MM}/{DD}/{FileData}{YYYY}{MM}_{DD}.csv
1: PARQUET
Because Parquet is a columnar file format.
2: AVRO
Because Avro is a row-based file format (like JSON) and it supports the logical type timestamp
- Merge files
- Parquet
All the Dim tables –> Replicated
Fact Tables –> Hash Distributed
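A hedged T-SQL illustration of the two choices; the table and column names are made up for the example:

-- Dimension table: REPLICATE keeps a full copy on every compute node, avoiding data movement on joins
CREATE TABLE dbo.DimProduct
(
    ProductKey  INT           NOT NULL,
    ProductName NVARCHAR(100) NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);

-- Fact table: HASH on a high-cardinality key spreads the rows evenly across the distributions
CREATE TABLE dbo.FactSales
(
    SalesKey   BIGINT         NOT NULL,
    ProductKey INT            NOT NULL,
    Amount     DECIMAL(18, 2) NULL
)
WITH (DISTRIBUTION = HASH (ProductKey), CLUSTERED COLUMNSTORE INDEX);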
- Cool –> You will access the data infrequently, but it must still be available quickly whenever you do access it
- Archive –> You will never access the data, but you need to configure a data archiving solution, so the blobs must be retained and never deleted
DISTRIBUTION = HASH (id)
PARTITION (ID RANGE LEFT FOR VALUES (1, 1000000, 2000000))
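For context, a sketch of how those two options sit inside a full CREATE TABLE statement; the table name and the Amount column are assumptions:

CREATE TABLE dbo.FactTransactions
(
    ID     INT            NOT NULL,
    Amount DECIMAL(18, 2) NULL   -- assumed extra column
)
WITH
(
    DISTRIBUTION = HASH (ID),
    -- RANGE LEFT: each boundary value (1, 1000000, 2000000) belongs to the partition on its left
    PARTITION (ID RANGE LEFT FOR VALUES (1, 1000000, 2000000))
);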
D. as a Type 2 slowly changing dimension (SCD) table
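A minimal sketch of what a Type 2 SCD table usually looks like; the surrogate key and validity columns below are illustrative, not taken from the question:

-- Type 2 SCD: a change closes the current row (ValidTo/IsCurrent) and inserts a new one, preserving history
CREATE TABLE dbo.DimCustomer
(
    CustomerSK INT IDENTITY(1, 1) NOT NULL,  -- surrogate key
    CustomerID INT                NOT NULL,  -- business key
    City       NVARCHAR(50)       NULL,      -- tracked attribute
    ValidFrom  DATE               NOT NULL,
    ValidTo    DATE               NULL,      -- NULL = current version
    IsCurrent  BIT                NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);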
F. Create a managed identity.
A. Add the managed identity to the Sales group.
B. Use the managed identity as the credentials for the data load process.
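One way step B could look in T-SQL, assuming a Synapse COPY INTO load; the storage URL and table name are placeholders:

-- The managed identity (made a member of the Sales group) is used as the load credential
COPY INTO dbo.SalesStaging
FROM 'https://mystorageaccount.dfs.core.windows.net/sales/2021/*.csv'   -- placeholder path
WITH
(
    FILE_TYPE  = 'CSV',
    CREDENTIAL = (IDENTITY = 'Managed Identity')
);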
- 0
- Value stored in database
Answer is C. Drop the external table and recreate it.
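A rough sketch of option C; the column list, data source, and file format names are assumptions for illustration:

-- External table definitions cannot be altered in place, so drop and recreate with the new definition
DROP EXTERNAL TABLE dbo.ExtSales;

CREATE EXTERNAL TABLE dbo.ExtSales
(
    SaleID INT,
    Amount DECIMAL(18, 2)
)
WITH
(
    LOCATION    = '/sales/',               -- placeholder folder path
    DATA_SOURCE = MyAzureDataLakeStore,    -- placeholder external data source
    FILE_FORMAT = MyParquetFormat          -- placeholder external file format
);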
- Binary
- PreserveHierarchy
B. read-access geo-redundant storage (RA-GRS)
ZRS: “…copies your data synchronously across three Azure availability zones in the primary region” (meaning, in different data centers; in our scenario this meets the requirements).
GRS/GZRS: like LRS/ZRS, but with the data centers in different Azure regions. This would also work, but is more expensive than ZRS, so ZRS is the right answer.
-> D is right
Round-robin - the simplest distribution model; not great for querying, but fast to load
Heap - a no-brainer when creating staging tables
No partitions - this is a staging table; why add the effort of partitioning when it is truncated daily? (See the sketch below.)
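Putting those three choices together, a staging-table sketch (schema and names assumed):

-- Fast-load staging table: round-robin distribution, heap, no partitions
CREATE TABLE stg.DailySales
(
    SaleID INT            NULL,
    Amount DECIMAL(18, 2) NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);

-- Emptied before each daily load
TRUNCATE TABLE stg.DailySales;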
B. hash-distributed on PurchaseKey. (Hash-distributed tables improve query performance on large fact tables. The PurchaseKey has many unique values, does not have NULLs and is not a date column.)
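A short illustration of that choice; only PurchaseKey is named in the answer, the other columns are assumptions:

CREATE TABLE dbo.FactPurchase
(
    PurchaseKey BIGINT         NOT NULL,   -- many unique values, no NULLs, not a date: a good hash key
    DateKey     INT            NOT NULL,
    TotalAmount DECIMAL(18, 2) NULL
)
WITH (DISTRIBUTION = HASH (PurchaseKey), CLUSTERED COLUMNSTORE INDEX);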
EventCategory -> dimEvent
channelGrouping -> dimChannel
TotalEvents -> factEvent
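A minimal star-schema sketch of that mapping; the keys and data types are assumptions, only the three named columns come from the question:

-- EventCategory describes an event, so it belongs in a dimension table
CREATE TABLE dbo.dimEvent   (EventKey INT NOT NULL, EventCategory NVARCHAR(50) NULL);

-- channelGrouping describes a channel, so it also belongs in a dimension table
CREATE TABLE dbo.dimChannel (ChannelKey INT NOT NULL, channelGrouping NVARCHAR(50) NULL);

-- TotalEvents is an additive measure, so it sits in the fact table with keys to both dimensions
CREATE TABLE dbo.factEvent  (EventKey INT NOT NULL, ChannelKey INT NOT NULL, TotalEvents INT NULL);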
The answer is A
Compression not only helps reduce the size or space a file occupies in storage, it also increases the speed of file movement during transfer
Answer is no, use a HEAP index
No, rows need to be less than 1 MB in size. A batch size of 100 K to 1 M rows is the recommended baseline for determining optimal batch size capacity.