DP-600 Flashcards
What are Shortcuts in Fabric?
Shortcuts let you query remote data without moving or copying it. They can be created in lakehouses and KQL databases.
What is the recommended method for ingesting a large data source without applying transformations?
A pipeline with the Copy data activity. Notebooks are recommended for complex data transformations, whereas Dataflow Gen2 is suitable for smaller data and/or specific connectors.
What method should you use to ingest data when you want to copy data directly between a supported source and a destination without applying any transformations?
The Copy activity in a pipeline. Dataflow Gen2 or Spark notebooks should be used when you must apply data transformations.
What is the fastest refresh interval that can be configured for dataflows?
30 minutes (just like semantic models)
When is the PRIMARY KEY constraint supported in a Fabric warehouse?
Only when NONCLUSTERED and NOT ENFORCED are both used.
What does it mean to denormalize data?
A process of trying to improve the read performance of a database, at the expense of losing some write performance, by adding redundant copies of data or by grouping data.
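As a rough illustration of this trade-off, the sketch below uses plain Python dictionaries with made-up table and column names (`products`, `sales`, `category`), none of which come from the exam material. It copies dimension attributes onto every fact row so that reads no longer need a lookup, at the cost of storing those values redundantly.

```python
# Hypothetical normalized data: a product dimension and a sales fact table.
products = {
    1: {"name": "Bike", "category": "Outdoor"},
    2: {"name": "Lamp", "category": "Home"},
}
sales = [
    {"sale_id": 100, "product_id": 1, "amount": 250.0},
    {"sale_id": 101, "product_id": 2, "amount": 40.0},
    {"sale_id": 102, "product_id": 1, "amount": 300.0},
]


def denormalize(sales_rows, product_dim):
    """Copy product attributes onto each sales row (redundant but join-free)."""
    wide = []
    for row in sales_rows:
        product = product_dim[row["product_id"]]
        wide.append(
            {**row, "product_name": product["name"], "category": product["category"]}
        )
    return wide


sales_wide = denormalize(sales, products)

# Reads are now a single scan with no lookup into the dimension.
outdoor_total = sum(r["amount"] for r in sales_wide if r["category"] == "Outdoor")
```

The write cost shows up as soon as a category is renamed: every matching row in `sales_wide` has to be updated, instead of one entry in `products`.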
What code do you use in a PySpark notebook to add new columns to a DataFrame?
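withColumn, called on the DataFrame: df.withColumn("<new column name>", <column expression>). It returns a new DataFrame rather than modifying the original.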
What does using OPTIMIZE to apply V-Order do?
Each table in a lakehouse has a setting that must be turned on to optimize the table and apply V-Order, which greatly increases Direct Lake speeds when connecting to these tables.
What is a Parquet file?
Apache Parquet is an open source, column-oriented file format designed to support fast data processing for complex data.
What does the LAG function do?
It accesses data from a previous row in the same result set by using a given physical offset. The offset argument is the number of rows back from the current row from which to obtain a value. It defaults to 1 and cannot be negative.
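To make the offset semantics concrete, here is a minimal Python sketch that imitates LAG over a list that is already in the desired order. The helper name `lag` and the sample values are hypothetical. It ignores PARTITION BY and mirrors only the offset and default arguments, so treat it as an illustration rather than how the engine evaluates the function.

```python
def lag(values, offset=1, default=None):
    """Return, for each position, the value `offset` rows earlier in `values`."""
    if offset < 0:
        raise ValueError("offset cannot be negative")
    result = []
    for i in range(len(values)):
        j = i - offset
        # Rows without a predecessor at that distance get the default value.
        result.append(values[j] if j >= 0 else default)
    return result


monthly_sales = [100, 120, 90, 150]  # hypothetical values, already ordered

previous_month = lag(monthly_sales)         # offset defaults to 1
two_months_back = lag(monthly_sales, 2, 0)  # 0 fills rows with no predecessor
```

Here `previous_month` is `[None, 100, 120, 90]`: the first row has no earlier row, so it receives the default, much as LAG returns NULL when no default is supplied.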
What is the only tool you can use to add a calculation group to a lakehouse semantic model?
Tabular Editor 2/3
What tool can you use to conform date formats for a bunch of columns as quickly as possible?
Tabular Editor. It supports modifying a semantic model and saving the changes back to it, whereas DAX Studio and VertiPaq Analyzer support read-only operations only. ALM Toolkit is used for schema comparison between semantic models.
Which storage mode should you use for each type of table (aggregated table, detailed fact table, and dimension table)?
- Aggregated tables: Import mode
- Detailed fact tables: DirectQuery mode
- Dimension tables: Dual mode
What are calculation groups good for?
They group common measure expressions so the logic is defined once and applied to many measures. The main benefit is a reduction in the overall number of measures that must be created and maintained in the semantic model.
What tool can you use to create partitions in the Power BI service without processing them?
Tabular Editor. You can run an Apply Refresh Policy command on a table that has an incremental refresh policy defined in Power BI Desktop. This creates the partitions based on the policy but does not process them. This method is useful when working with very large datasets where the initial full load can take many hours.
Which external tool can you use to get information about the size of each table and column in a model?
DAX Studio. It can connect to a model in Microsoft Power BI Desktop or the Power BI service and provide size statistics for each table and column.
!! Explain incremental refresh and how it works
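Incremental refresh relies on two reserved, case-sensitive parameters: RangeStart and RangeEnd. Both must be of type Date/Time and are used to filter the table by date. The service then substitutes its own values for them to create and refresh partitions according to the refresh policy.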
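The date filter behind those two parameters can be sketched in a few lines. In Power BI the filter is written in Power Query, not Python, so the snippet below is only a language-neutral illustration. The function `filter_range`, the `OrderDate` column, and the sample rows are all assumptions. It uses a half-open interval (start inclusive, end exclusive) so that a row sitting exactly on a boundary lands in one partition only.

```python
from datetime import datetime


def filter_range(rows, range_start, range_end, date_key="OrderDate"):
    """Keep rows with range_start <= date < range_end (half-open interval)."""
    return [r for r in rows if range_start <= r[date_key] < range_end]


# Hypothetical source rows, including two that fall exactly on month boundaries.
rows = [
    {"OrderId": 1, "OrderDate": datetime(2024, 1, 31, 23, 59)},
    {"OrderId": 2, "OrderDate": datetime(2024, 2, 1, 0, 0)},
    {"OrderId": 3, "OrderDate": datetime(2024, 2, 15, 12, 0)},
    {"OrderId": 4, "OrderDate": datetime(2024, 3, 1, 0, 0)},
]

# Each call stands in for one partition's RangeStart/RangeEnd pair.
january = filter_range(rows, datetime(2024, 1, 1), datetime(2024, 2, 1))
february = filter_range(rows, datetime(2024, 2, 1), datetime(2024, 3, 1))
```

If both comparisons were inclusive, order 2 would be loaded into both the January and February partitions and counted twice.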
You need to load data to a table without rebuilding hierarchies and relationships and without recalculating calculated columns and measures. Which process mode should you use?
Process Data. It loads data to a table without rebuilding hierarchies or relationships or recalculating calculated columns and measures.
What does process mode Process Default do?
With Process Default, hierarchies, calculated columns, and relationships are built or rebuilt (recalculated).
What does process mode Process Defrag do?
Process Defrag defragments the auxiliary table indexes.
What does process mode Process Recalc do?
It recalculates hierarchies, relationships, and calculated columns on a database level.
What makes the F64 license cost efficient?
Starting with the F64 license, report consumers can use a free per-user license.
What does disabling the XMLA endpoints do?
It ensures that semantic models can be connected to, but not edited directly in, workspaces.
What does saving your work as a Power BI Project (PBIP) do?
It enables you to save the work as individual plain text files in a simple, intuitive folder structure, which can be checked into a source control system such as Git. This will enable multiple developers to work on different parts of the model simultaneously.