Prepare the Data (Get Data) Flashcards
What is a Flat file?
A flat file is a type of file that has only one data table and every row of data is in the same structure.
What are the most common type of flat files?
Comma-separated values (.csv) files, delimited text (.txt) files, and fixed width files.
Another type of file would be the output files from different applications, like Microsoft Excel workbooks (.xlsx).
What locations might you file your flat or Excel files?
Local - You can import data from a local file into Power BI. The file isn’t moved into Power BI, and a link doesn’t remain to it. Instead, a new dataset is created in Power BI, and data from the Excel file is loaded into it. Accordingly, changes to the original Excel file are not reflected in your Power BI dataset. You can use local data import for data that doesn’t change.
OneDrive for Business - You can pull data from OneDrive for Business into Power BI. This method is effective in keeping an Excel file and your dataset, reports, and dashboards in Power BI synchronized. Power BI connects regularly to your file on OneDrive. If any changes are found, your dataset, reports, and dashboards are automatically updated in Power BI.
OneDrive - Personal - You can use data from files on a personal OneDrive account, and get many of the same benefits that you would with OneDrive for Business. However, you’ll need to sign in with your personal OneDrive account, and select the Keep me signed in option. Check with your system administrator to determine whether this type of connection is allowed in your organization.
SharePoint - Team Sites - Saving your Power BI Desktop files to SharePoint Team Sites is similar to saving to OneDrive for Business. The main difference is how you connect to the file from Power BI. You can specify a URL or connect to the root folder.
How do you connect Power BI to data in a file?
In Power BI, on the Home tab, select Get data. In the list that displays, select the option that you require, such as Text/CSV or XML.
What options are available in Navigator after connecting data?
Transform and Load
What does the Load option in the navigator window do?
Load - Automatically load your data into a Power BI model in its current state.
Transform Data - Open your data in Microsoft Power Query, where you can perform actions such as deleting unnecessary rows or columns, grouping your data, removing errors, and many other data quality tasks.
What does the Transform option in the navigator window do?
Allows you to review and clean your data before loading it into the Power BI Model
Where can you change the data source settings in Power BI?
Data source settings on the home tab
Query Settings
Advanced Editor
How do you connect to data in a relational database
Use the Get data feature in Power BI Desktop and select the applicable option for your relational database.
For relational databases, after you’ve added your server and database names, you will be prompted to sign in with a username and password. What 3 sign-in options will you have?
Windows - Use your Windows account (Azure Active Directory credentials).
Database - Use your database credentials. For instance, SQL Server has its own sign-in and authentication system that is sometimes used. If the database administrator gave you a unique sign-in to the database, you might need to enter those credentials on the Database tab.
Microsoft account - Use your Microsoft account credentials. This option is often used for Azure services.
Select a sign-in option, enter your username and password, and then select Connect.
How do you Import data by writing an SQL query?
On the SQL Server database window, enter your server and database names, and then select the arrow next to Advanced options to expand this section and view your options. In the SQL statement box, write your query statement, and then select OK.
How do you change data source settings?
On the Home tab, select Transform data, and then select the Data source settings option. From the list of data sources that display, select the data source that you want to update. Then, you can right-click that data source to view the available update options or you can use the update option buttons on the lower left of the window. Select the update option that you need, change the settings as required, and then apply your changes.
Power Query - Select the table, and then select the Data source settings option on the Home ribbon. Alternatively, you can go to the Query Settings panel on the right side of the screen and select the settings icon next to Source (or double Select Source). In the window that displays, update the server and database details, and then select OK.
What does an SQL query start with?
a Select statement
What does the select statement do?
Allows you to choose the specific fields that you want to pull from your database
In the select statement, how do you specify where to get the fields from?
“FROM” specifies the name of the table that you want to pull the data from.
What is a wildcard character (*) in SLQ?
When used it imports all columns that you don’t need from the specified table.
Why is it recommended not to use the wildcard character?
This method is not recommended because it will lead to redundant data in your data model, which will cause performance issues and require additional steps to normalize your data for reporting.
What SLQ clause filters the rows to pick only filtered records that you want
The “WHERE” clause
What is a NoSQL database?
A NoSQL database (also referred to as non-SQL, not only SQL or non-relational) is a flexible type of database that does not use tables to store data.
How do you connect a NoSQL database to Power BI?
Use the Get data feature in Power BI Desktop & select the More option to locate and connect to the type of database that you use.
How do you import a JSON file?
JSON type records must be extracted and normalized before you can report on them, so you need to transform the data before loading it into Power BI Desktop.
How do you connect to data in an application?
By selecting the Get data feature in Power BI Desktop. Then, select the option that you need from the Online Services category. (Example: SharePoint Online List).
After you select the Online Services Category using the get data feature and are connected, what is required to log in?
You will enter the URL used to log in (the site URL)
After you have provided the site URL to connect and get data from an Online Service, how do you sign in?
Power BI needs to authorize the connection to the online service account (SharePoint), so sign in with your Microsoft account and then select Connect.
What is the most popular way to get data into Power BI?
The most popular way to use data in Power BI is to import it into a Power BI dataset.
What is Import mode?
The Import mode allows you to create a local Power BI copy of your datasets from your data source. You can use all Power BI service features with this storage mode, including Q&A and Quick Insights. Data refreshes can be scheduled or on-demand. Import mode is the default for creating new Power BI reports.
What is DirectQuery mode?
DirectQuery storage mode allows you to query the data in the data source directly and not import a copy into Power BI. DirectQuery is useful because it ensures you are always viewing the most recent version of the data.
What is Dual (Composite mode)?
Allows you to identify some data to be directly imported and other data that must be queried. Any table that is brought in to your report is a product of both Import and DirectQuery modes. Using the Dual mode allows Power BI to choose the most efficient form of data retrieval.
What modeling capabilities aren’t available with DirectQuery?
No built-in date hierarchy: When importing data, every date/datetime column will also have a built-in date hierarchy available by default. For example, if importing a table of sales orders including a column OrderDate, then upon using OrderDate in a visual, it will be possible to choose the appropriate level (year, month, day) to use. This built-in date hierarchy isn’t available when using DirectQuery. If there’s a Date table available in the underlying source, as is common in many data warehouses, then the DAX Time Intelligence functions can be used as normal.
Date/time support only to second accuracy: When using time columns in your dataset, Power BI only issues queries to the underlying source to a level of detail of seconds. Queries aren’t sent to the DirectQuery source for milliseconds. Remove this part of the times from your source columns.
Limitations in calculated columns: Calculated columns are limited to being intra-row, as in, they can only refer to values of other columns of the same table, without the use of any aggregate functions. Additionally, the DAX scalar functions, such as LEFT(), that are allowed, are limited to those functions that can be pushed to the underlying source. The functions vary depending upon the exact capabilities of the source. Functions that aren’t supported aren’t listed in autocomplete when authoring the DAX for a calculated column, and would result in an error if used.
No support for parent-child DAX functions: When in DirectQuery mode, it’s not possible to use the family of DAX PATH() functions that generally handle Parent-Child structures, such as chart of accounts, or employee hierarchies.
Calculated tables: You can use calculated tables in DirectQuery when using composite models.
Relationship filtering: For information about bi-directional filtering, see Bidirectional cross-filtering. This whitepaper presents examples in the context of SQL Server Analysis Services. The fundamental points apply equally to Power BI.
No Clustering: When using DirectQuery, it’s not possible to use the Clustering capability, to automatically find groups.
What REPORTING limitations are there when using DirectQuery?
Quick Insights isn’t supported: Power BI Quick Insights searches different subsets of your dataset while applying a set of sophisticated algorithms to discover potentially interesting insights. Given the need for very high-performance queries, this capability isn’t available on datasets using DirectQuery.
Using Explore in Excel will likely result in poorer performance: You can explore your data by using the Explore in Excel capability on a dataset. This approach allows Pivot Tables and Pivot Charts to be created in Excel. While this capability is supported on datasets using DirectQuery, the performance is generally slower than creating visuals in Power BI, and therefore if the use of Excel is important for your scenarios, this fact should be accounted for in your decision to use DirectQuery.
Hierarchies are not shown in Excel: When connecting using DirectQuery from Excel to an Azure Analysis Services model or Power BI dataset, for example using Analyze in Excel, any hierarchies defined in the model or dataset are not shown.
Maximum length for text columns: The maximum length of the data in a text column for datasets using DirectQuery is 32,764 characters. Reporting on longer texts than that will result in an error.
Dashboards connected with DirectQuery can be scheduled to refresh how often?
By default, datasets refresh every hour but can be configured as part of dataset settings to be between weekly and every 15 minutes.
What is the time out that is applied to individual queries in the Power BI service?
A time-out of four minutes is applied. Queries taking longer than that will fail
What is Azure Analysis Services?
Azure Analysis Services is an Azure product that allows you to ingest data from multiple data sources, build relationships between the data, and creates calculations on the data.
How are the calculations built-in Azure Analysis Services?
The calculations are built using data analysis expressions (DAX). Azure Analysis Services is similar to the data modeling and storage technology in Power BI.
What are the Notable differences between Azure Analysis Services cubes and SQL Server?
Analysis Services cubes have calculations already in the cube, which will be discussed in more detail later.
If you don’t need an entire table, you can query the data directly. Instead of using Transact-SQL (T-SQL) to query the data, like you would in SQL Server, you can use multi-dimensional expressions (MDX) or data analysis expressions (DAX).
How is Azure Analysis Services cubes is similar to getting data from SQL Server?
You can:
Authenticate to the server.
Pick the cube you want to use.
Select which tables you need.
How do you connect to data in Azure Analysis Services ?
You use the Get data feature in Power BI Desktop.
What is Connect live?
Connect Live is a new option in Azure Analysis Services that helps you keep the data and DAX calculations in their original location, without having to import them all into Power BI.
How do refresh schedules work with Azure Analysis Services?
Azure Analysis Services can have a fast refresh schedule, which means that when data is refreshed in the service, Power BI reports will immediately be updated, without the need to initiate a Power BI refresh schedule.
This process can improve the timeliness of the data in your report.