PBI Flashcards
When can you import local files in Power BI
For data that does not change
Why might it be necessary to return and change connection settings in Power BI Desktop after creating a data source connection and loading data?
Due to changes in the organization’s security policy, such as updating passwords
What are the possible actions that can be taken when returning to change connection settings in Power BI Desktop after loading data?
data source, edit permissions or clear permissions.
When working in Power BI, what is the recommended best practice for optimizing data retrieval and usage from a relational database?
Writing a query in a view, similar to a table in a relational database
NoSQL (non-relational) databases
flexible database systems that do not use tables to store data
Preview pane for cosmo db (no sql db) shows
The preview pane only shows Record items because all records in the document are represented as a Record type in Power B
Importing the data in to Power BI means
that the data is stored in the Power BI file and gets published along with the Power BI reports
Why might you NOT want to use import mode in PBI
Security Reasons
Why is direct query useful?
because it ensures you’re always viewing the most recent version of the data.
The three different types of storage modes you can choose from:
Import
DirectQuery
Dual (Composite)
How to switch storage mode
Model view -> select a table -> Properties pane -> storage mode drop down
Import mode (3 attributes)
- Create a local PBI Copy of you semantic models
- Can use all PBI Services including Q&A and Quick Insights
- Data refreshed can be scheduled or on demand
DirectQuery Mode Attributes
- Useful when you dont want to save local copies of data (Data wont be cached)
- Direct connection to the data source
- Viewing most up to date data
- Suited for when you have large semantic models (Improved performance)
Dual (Composite mode)
- Can ID some data to be directly imported and other data the must be querried
- Limited M transformations
- Restricted DAX functions in calculated columns
- Same constraints as directquery
Azure Analysis Services
fully managed platform as a service (PaaS) that provides enterprise-grade semantic models in the cloud
differences between Azure Analysis Services and SQL Server are:
- Analysis Services models have calculations already created
- Can query the data directly using MDX or DAX
Fast Refresh Schedule Analysis Services
When data is refreshed in the service Power BI Reports will immediately be updated
The performance of Power Query depends on
The performance at the data source level
Power Query takes advantage of good performance at the data source through
Query Folding
Query folding
the process by which the transformations and edits that you make in Power Query Editor are simultaneously tracked as native queries, or simple Select SQL statements, while you’re actively making transformations
The benefits to query folding include:
- More efficiency in data refreshes and incremental refreshes.
- Automatic compatibility with DirectQuery and Dual storage modes.
If the View Native Query option isn’t available
then query folding isn’t possible for this step, and you’ll have to work backward in the Applied Steps area until you reach the step in which View Native Query is available (displays in bold type).
Native queries aren’t possible for the following transformations:
1.Adding an index column
2. Merging and appending columns of different tables with two different sources
3. Changing the data type of a column
Other ways to optimize query performance in Power BI include:
1.Process as much data as possible in the original data source.
2.Use native SQL queries.
3. Separate date and time, if bound together.
What is a query timeout in relational source systems?
Relational source systems often implement query timeouts to prevent a single user from monopolizing hardware resources. These timeouts limit the duration a query can run and can be configured for various timespans, from seconds to minutes or more. The purpose is to manage concurrent usage of data in the same database.
Power BI Query Error: Timeout expired
indicates that you’ve pulled too much data according to your organization’s policies.
Ways to resolve “Power BI Query Error: Timeout expired”
- Pull in fewer columns or rows
- Reduce complexity of SQL
- Bring data in small chunks and combine using Power Query
We couldn’t find any data formatted as a table error
When importing data from excel power bi may not be able to find a table
A calculated table formula must return
a table object
Calculated tables have a cost
- Increase the model storage size
- Prolong data refresh time
A calculate column must return
a scalar or single value
calculated columns in import models have a cost
- Increase the model storage size
- prolong the data refresh time
Row Context
The current row
Referencing a column in another table when the two tables are not related
Use the LOOKUPVALUE dax function
Relatedtable
Retrieves values on the many-side of a relationship and returns a table object
Iterator functions
provide you with flexibility to create sophisticated summarizations
Three ways to add columns
- Add columns to a view or table as a persisted column
- Add custom columns (using M) to Power Query queries
- add calculated columns (using DAX) to model tables
Preference to add custom columns
Is in Power Query whenever possible because they load to the model in a more compact and optimal way
Unpivot columns
- Creates Attribute and Value columns
Pivot columns
convert your flat data into a table that contains an aggregate value for each unique value in a column
Best practices when naming tables, columns and measures
- Remove “_”
- Remove prefixes and suffixes
Report objects include
- Visuals
- Elements
Visuals - Report Objects
Visualizations of the semantic model data
Elements - Report Objects
Provide visual interest but dont use semantic model data (Buttons, shapes, and images)
Categorical Visual Types
- Bar charts
- Column Charts
Categorical Visuals are Good For
When you need to show data across multiple categories
You should sort categorical visuals by
The value rather than the in alphabetical category order
Time series visuals should be used when
You want to show values over time
Time Series Visual Types
- Line Chart
- Column Chart
- Stacked column chart
- Area chart
- Line and stacked column chart
- Ribbon chart
When should you use a column chart vs a line chart
Us a columns chart when there are missing values
Ribbon Chart added benefits
Shows rank changes over time
Proportional Visuals are good for
Showing data as part of a whole; How a value is distributed across a dimension
Can you plot negative and positive numbers on a proportional visual?
All number must be either All Positive or All Negative
Proportional Visuals Types
1.100% Stacked Column chart
- Funnel chart
- Treemap
- Pie chart
- Doughnut chart
Numerical Visuals show
show high level callouts that demand immediate attention
Communicate important data quickly
Grid Visuals can Convey
A lot of detailed information
Best visual for hierarchical navigation
Matrices
Performance Visuals
Communicate performance by describing a value and its comparison with a target
Core Performance Visuals
- Gauge
- KPI
- Table, with conditional formatting
- Matrix, with conditional formatting
Instead of using a map visual which takes up a lot of space you can use a
Categorical visual
Filtering can occur at five different levels of a PBI Report
- Semantic Model (RLS)
- Report
- Page
- Visual
- Measure
DAX
Data Analysis Expression
When can you not create a measure
When the model is a live connection to SQL Server Analysis Services multidimensional model
Unlike report and page-level filters, visual-level filters can filter by
using a measure
In order to reduce the load on a semantic model when applying multiple filters
add an apply button which applies the filters all at once
Advantages of using filterpane
- Consistent Location
- Allows you to configure advanced filter types
- Search box and sort functions
- Faster
- Lock and Hide filters
- Apply button
- Filter using a measure
Disadvantages of using filterpane
- Less design flexibility
- Confusion when filters are hidden and locked
- Easy to lose track of applied filters
Advantages of using a slicer
- Place them anywhere
- Highly configurable
- You can configure hierarchical slicers
- They can display their selection
- They can display images
- can selectively filter visuals
- You can filter slicers
- you can sort slicer items
- synced slicers can filter other pages in a report
Disadvantages of using a slicer
- Slicers can have an impact of report render performance
- occupy space on the page
- filtering options are less complex
Benefits of creating a workspaces
- House reports in one location
- Make them shareable
- collaborate with other teams
- Update reports
What access should you give people?
The minimum access necessary to collaborate
4 roles in a workspace
- Admin
- Member
- Contributor
- Viewer
Admin Access
- Update and delete the workspace
- add of remove people including other admins
Member access
- Add members or others with low permissions
- publish, unpublish and change permissions for an app
Contributor access
- create, edit and delete content such as reports in the workspace
- publish reports to the workspace
Viewer Access
- View and interact with an item
- Read data that’s stored in workspace dataflows
Who can view usage metrics reports
- Admin, Member, and contributor
- Must have a pro license
Advantages of using a deployment pipeline
- Increased productivity
- Faster delivery of content
- Lower human intervention required
Sensitivity Labels
Specify which data can be exported
Two types of on-premise data gateways
- Oranization mode
- Personal mode
Organization mode
Allows multiple users to connect to multiple on-premise data sources and is suitable for complex scenarios
Personal Mode
Allows one user to connect to data sources
Two ways to endorse semantic models
- Promotion
- Certification
Promotion
Promote your semantic models when they are ready for broad usage
Certification
Request certification for a promoted semantic model
Query Caching
Local caching feature that maintains results on a user and report basis
Reserved parameters for incremental refresh
- RangeStart
- RangeEnd
Permission needed for promotion and certification
- Promotion does not need specific permission
- Certification requires permission from the semantic model owner to access to the semantic model
Data Alerts
Notify you or a user that a specific data point is above, below, or at a specific threshold
Data alerts are available for
- KPI Visuals
- gauges
- cards
Q&A Feature
A tool within PBI Desktop that allows you to ask natural language questions about the data
Q&A Visual consists of three main elements
- Question box
- Pre-populated suggestion tiles
- pin visual
Quick Insights
Uses machine learning algorithms to go over your entire semantic model and produce insights for you quickly
4 actions you can perform with quick insights
- Filter the visualization
- pin the insight card to a dashboard
- Run insights on the card
- Return to the original insights canvas
2 ways of implementing RLS
- Statis
- Dynamic
Static RLS
Uses a fixed value in the DAX Filter
Dynamic RLS
Uses a dax function
Where do you go to create RLS roles in PBI Desktop?
Modeling Tab -> Manage Roles
Column distribution
Show you the distribution of the data within the column and the counts of distinct and unique values
Column Profile
- Row count
- Empty rows and strings
- Min and max
Value distribution
Graph that tells you the counts for each distinct value in that specific column
Column Statistics
On a numeric column will include how many zeroes and null values exist along with the average value, standard deviation and how many even and odd values are in the column
What is a risk of having null values in a numeric column?
That function AVERAGE of data will be incorrect.
If you have two queries that have different data but the same column headers, and you want to combine both tables into one query with all the combined rows, which operation should you perform?
Append
Which of the following selections aren’t best practices for naming conventions in Power BI?
Abbreviate column names.
Start Schema
Each table within your semantic model is defined as a dimension or a fact table