DP-600 Flashcards

1
Q

What are Shortcuts in Fabric?

A

Shortcuts let you query remote data without moving or copying it. They can be created in a Fabric lakehouse (and in a KQL database).

2
Q

What is the recommended method for ingesting a large data source without applying transformations?

A

A pipeline with the Copy data activity. Notebooks are recommended for complex data transformations, whereas Dataflow Gen2 is suitable for smaller data and/or specific connectors.

3
Q

What method should you use to ingest data when you want to copy data directly between a supported source and a destination without applying any transformations?

A

The Copy activity in a pipeline. Dataflow Gen2 or Spark notebooks should be used when you must apply data transformations.

4
Q

What is the fastest refresh interval that can be configured for dataflows?

A

30 minutes (just like semantic models)

5
Q

When is the PRIMARY KEY constraint supported in a Fabric warehouse?

A

Only when NONCLUSTERED and NOT ENFORCED are both used.

6
Q

What does it mean to denormalize data?

A

A process of trying to improve the read performance of a database, at the expense of losing some write performance, by adding redundant copies of data or by grouping data.
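
As a rough illustration of the trade-off, the sketch below copies customer attributes onto every order row so that reads need no lookup, at the cost of storing the same values repeatedly. The table contents and the `denormalize` helper are hypothetical and exist only for this example.

```python
# Normalized form: each order refers to its customer by key only.
customers = {
    1: {"name": "Contoso", "country": "NL"},
    2: {"name": "Fabrikam", "country": "US"},
}
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 250.0},
    {"order_id": 101, "customer_id": 2, "amount": 80.0},
    {"order_id": 102, "customer_id": 1, "amount": 40.0},
]


def denormalize(order_rows, customer_lookup):
    """Return order rows with the customer attributes copied onto each row."""
    flat = []
    for order in order_rows:
        customer = customer_lookup[order["customer_id"]]
        row = dict(order)
        # Redundant copies: the same customer values repeat on every order.
        row["customer_name"] = customer["name"]
        row["customer_country"] = customer["country"]
        flat.append(row)
    return flat


flat_orders = denormalize(orders, customers)
```

Reading `flat_orders` no longer needs a join, but renaming a customer now means updating every one of that customer's order rows.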

7
Q

What code do you use in a PySpark notebook to add new columns to a DataFrame?

A

withColumn

8
Q

What does using OPTIMIZE to apply V-Order do?

A

Each table in a lakehouse has a maintenance setting that must be turned on to optimize the table and apply V-Order, which greatly improves Direct Lake speeds when connecting to these tables.

9
Q

What is a Parquet file?

A

Apache Parquet is a file format designed to support fast data processing for complex data.

10
Q

What does the LAG function do?

A

It accesses data from a previous row in the same result set by using a given physical offset. The offset argument is the number of rows back from the current row from which to obtain a value. The offset cannot be negative.
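
The behavior can be mimicked in plain Python to make the offset rule concrete. This is only a sketch of the semantics over an already ordered list, not how a SQL engine implements it. The `lag` helper and its parameter names are made up for the example.

```python
def lag(values, offset=1, default=None):
    """For each position, return the value `offset` rows back, or `default`."""
    if offset < 0:
        # Mirrors the rule that the offset cannot be negative.
        raise ValueError("offset cannot be negative")
    result = []
    for i in range(len(values)):
        previous = i - offset
        result.append(values[previous] if previous >= 0 else default)
    return result


sales = [10, 20, 30]
previous_sale = lag(sales)                   # one row back, no default
two_back = lag(sales, offset=2, default=0)   # two rows back, 0 when missing
```

Rows that have no row far enough back receive the default value, which corresponds to NULL when no default is supplied in SQL.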

11
Q

What is the only tool you can use to add a calculation group to a lakehouse semantic model?

A

Tabular Editor 2/3

12
Q

What tool can you use to conform date formats for a bunch of columns as quickly as possible?

A

Tabular Editor. It supports modifying a semantic model and saving the changes back to it, whereas DAX Studio and VertiPaq Analyzer support only read-only operations. ALM Toolkit is used for schema comparison of semantic models.

13
Q

Which storage mode should you use for each type of table (aggregated table, detailed fact table, and dimension table)?

A

  1. Aggregated tables: Import mode
  2. Detailed fact tables: DirectQuery mode
  3. Dimension tables: Dual mode

14
Q

What are calculation groups good for?

A

They group common measure expressions, which reduces the overall number of measures that must be created and maintained in the semantic model.

15
Q

What tool can you use to create partitions in the Power BI service without processing them?

A

Tabular Editor. You can run the Apply Refresh Policy command on a table that has an incremental refresh policy defined in Power BI Desktop. This creates the partitions based on the policy but does not process them. This method is useful when working with very large datasets where the initial full load can take many hours.

16
Q

Which external tool can you use to get information about the size of each table and column in a model?

A

DAX Studio. It can connect to a model in Microsoft Power BI Desktop or the Power BI service and provide statistics on the size of each table and column.

17
Q

!! Explain incremental refresh and how it works

A

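Incremental refresh partitions a table by date and refreshes only the most recent partitions instead of reloading everything. It looks for two date/time parameters whose names are reserved keywords and case sensitive: RangeStart and RangeEnd.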
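
The role of the two parameters can be sketched in plain Python: each partition is loaded by filtering the source on a date column between RangeStart and RangeEnd. The `rows_in_range` helper, the `order_date` column, and the sample rows are hypothetical. The equality test is kept on one side only, so a row that falls exactly on a boundary lands in a single partition.

```python
from datetime import datetime


def rows_in_range(rows, range_start, range_end, date_key="order_date"):
    """Keep rows with range_start <= date < range_end (half-open interval)."""
    return [row for row in rows if range_start <= row[date_key] < range_end]


source_rows = [
    {"order_id": 1, "order_date": datetime(2024, 1, 1)},
    {"order_id": 2, "order_date": datetime(2024, 1, 31, 23, 0)},
    {"order_id": 3, "order_date": datetime(2024, 2, 1)},
]

# Two adjacent monthly partitions; the boundary row belongs only to February.
january = rows_in_range(source_rows, datetime(2024, 1, 1), datetime(2024, 2, 1))
february = rows_in_range(source_rows, datetime(2024, 2, 1), datetime(2024, 3, 1))
```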

18
Q

You need to load data to a table without rebuilding hierarchies and relationships and without recalculating calculated columns and measures. Which process mode should you use?

A

Process Data. It loads data to a table without rebuilding hierarchies or relationships or recalculating calculated columns and measures.

19
Q

What does process mode Process Default do?

A

With Process Default, hierarchies, calculated columns, and relationships are built or rebuilt (recalculated).

20
Q

What does process mode Process Defrag do?

A

Process Defrag defragments the auxiliary table indexes.

21
Q

What does process mode Process Recalc do?

A

It recalculates hierarchies, relationships, and calculated columns on a database level.

22
Q

What makes the F64 license cost efficient?

A

Starting with the F64 license, report consumers can use a free per-user license.

23
Q

What does disabling the XMLA endpoints do?

A

It ensures that semantic models can be connected to, but not edited directly in, workspaces.

24
Q

What does saving your work as a Power BI Project (PBIP) do?

A

It enables you to save the work as individual plain text files in a simple, intuitive folder structure, which can be checked into a source control system such as Git. This will enable multiple developers to work on different parts of the model simultaneously.

25
Q

What is the main limitation of using XMLA endpoints for the Microsoft Power BI deployment process?

A

Once a semantic model has been modified through the XMLA endpoint, a PBIX file can no longer be downloaded from the Power BI service.

26
Q

What does an incremental refresh do?

A

It provides an efficient way to process data dynamically and improve the performance of model refreshes. By automating the creation and management of partitions, incremental refresh reduces the amount of data that must be refreshed and makes it possible to include real-time data.

27
Q

What is a core semantic model?

A

Essentially a PBIX file without visuals that you publish to a workspace so that others can connect to it and everyone uses the same semantic model.

28
Q

What is lifecycle management?

A

Refers to the process of tracking and managing each stage of the development process. Includes creating reusable assets and deciding how to handle changes to your semantic models and reports.

29
Q

What do Power BI Project files allow you to do?

A

In combination with code editors and source control systems, they allow you to track versions and manage code deployments as part of a Continuous Integration and Continuous Delivery (CI/CD) system. They store your semantic model and report in individual plain text files. This also allows for collaboration.

30
Q

Continuous Integration and Continuous Delivery (CI/CD)

A

Refers to the systems used by many organizations to propose and validate changes during development before releasing them to production (DEV, TEST, and PROD workspaces).

31
Q

What can you do with Git integration?

A

Back up and version your work, revert to previous states, collaborate with others or work alone using Git branches, and use the capabilities of familiar source control tools, like Azure DevOps.

32
Q

What does a promoted endorsement mean?

A

Means that the item creators think the item is ready for sharing and reuse. Any user with write permissions on an item can promote it.

33
Q

What does certified endorsement mean?

A

Means that an organization-authorized reviewer has certified that the item meets the organization's quality standards, can be regarded as reliable and authoritative, and is ready for use across the organization.

34
Q

What does master data endorsement mean?

A

Means the data in the item is a core source of organizational data. The master data designation might be used to indicate a single source of truth for certain kinds of data, such as product codes or customer lists.

35
Q

What is an XMLA endpoint?

A

An API with a URL to the workspace or semantic model. By default it is read-only. It allows access to model data, metadata, events, and schemas.

36
Q

What is required in order to use the XMLA endpoint for read-write operations?

A

The semantic model must reside in a Premium or Fabric workspace.

37
Q

What are some common uses for XMLA endpoints?

A

Refreshing individual components of a data model, systematically exporting data from the data model, and automating the use of the Best Practice Analyzer.

38