DP-203 Dumps Flashcards
1) You execute the following query in an Azure Synapse Analytics Spark pool in a workspace:
SELECT StudentID
FROM abc.dbo.myTable
WHERE name = 'Amit'
The table contains the following row:
StudentName: Amit
StudentID: 69
StudentStartDate: 26/05/22
What will be the output of the query?
a) Amit
b) Error
c) 69
d) Null
Answer: b
Explanation: The 'name' column does not exist; the column is named StudentName, so the query returns an error.
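For reference, a sketch of the corrected query (using the column name from the table in the question):
SELECT StudentID
FROM abc.dbo.myTable
WHERE StudentName = 'Amit'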
2) As a Data Engineer, you need to design an Azure Synapse Analytics dedicated SQL Pool which can meet the following goal:
- Return student records from a given point in time,
- Maintain current student information
How should you model the student data?
a) View
b) Temporal table
c) Slowly Changing Dimension (SCD) Type 2
d) SCD Type 7
Answer: c
Explanation: A Type 2 SCD adds a new row for every change, so it can return records as of a given point in time while also maintaining the current student information.
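As a hedged illustration (table and column names are hypothetical), a Type 2 student dimension in a dedicated SQL pool might look like:
CREATE TABLE dbo.DimStudent
(
    StudentKey    INT IDENTITY(1,1) NOT NULL,  -- surrogate key
    StudentID     INT NOT NULL,                -- business key
    StudentName   NVARCHAR(100) NOT NULL,
    RowStartDate  DATETIME2 NOT NULL,          -- when this version became valid
    RowEndDate    DATETIME2 NOT NULL,          -- when this version was superseded
    IsCurrent     BIT NOT NULL                 -- flags the current row
);
A point-in-time lookup filters on RowStartDate/RowEndDate; the current record is the row where IsCurrent = 1.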
3) An Azure Data Factory pipeline has the following activities:
- Copy,
- Wrangling data flow,
- Jar,
- Notebooks
Which TWO Azure services should you use to debug the activities?
a) Computer Vision
b) Data Factory
c) Azure Sentinel
d) Azure Databricks
Answer: b,d
Explanation: Computer Vision is an AI service and Azure Sentinel is a security (SIEM) service, so neither can debug pipeline activities. Copy and wrangling data flow activities are debugged in Data Factory, while Jar and Notebook activities run in (and are debugged from) Azure Databricks.
4) A company needs to design an Azure Data Lake Storage solution which will include geo-zone-redundant storage (GZRS) for high availability.
What should you include in the monitoring solution for replication delays which can affect the recovery point objective (RPO)?
a) 4xx: Server error
b) Last sync time
c) Principle of least privilege
d) ARM template
Answer: b
Explanation: Last Sync Time shows how far geo-replication lags behind the primary, which directly affects the RPO. Options a, c, and d are unrelated to replication delays.
5) An automobile company uses an Azure IoT Hub for communication with the IoT devices. What solution should you recommend if you want to monitor the devices in real-time?
a) Azure Data Factory using Visual Studio
b) Azure Stream Analytics job
c) Storage Account using Azure Powershell
d) Azure virtual machine using Azure Portal
Answer: b
Explanation: None of the other options have to do with IoT devices and/or monitoring in real-time.
6) A table will track the values of dimension attributes over the course of time and retain the history of the data by adding new rows as the data changes. Which Slowly Changing Dimension (SCD) type should you use?
a) Type -1
b) Type 1
c) Type 2
d) Type 3
Answer: c
7) A company needs to perform batch processing in Azure Databricks once per day. Which type of databricks cluster should you use?
a) Standard
b) Interactive
c) Automated
d) Manual
Answer: c
Explanation: Standard and Interactive clusters are intended for interactive analysis rather than scheduled batch jobs, and there is no 'Manual' Databricks cluster type. An automated (job) cluster is created for scheduled workloads such as a daily batch run.
8) A company is building streaming solutions in Azure Databricks. The solution needs to count events in 5 minute intervals and only report on events which arrive during the interval which will be sent to a Delta Lake table as an output. Which output mode should you use?
a) Complete
b) Partial
c) Append
d) Update
Answer: c
Explanation: Partial is not an output mode. Complete rewrites the entire result table on every trigger and Update emits only rows changed since the last trigger; Append outputs only the events finalized in each interval, which matches the requirement.
9) A company has an Azure Data Lake Storage Gen2 account called CGAmit which is protected by virtual networks. You need to design an SQL pool in Azure Synapse which will use CGAmit as the source. What should you use to authenticate to CGAmit?
a) Azure Lock
b) Shared Access Signature (SAS)
c) Active Directory Federation Services (ADFS)
d) Managed Identity
Answer: d
Explanation:
Azure Lock deals with accidental deletion of resources.
SAS deals with providing secure delegated access to resources in the storage account. ADFS deals with SSO between internet-facing applications.
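A minimal sketch, assuming hypothetical credential, data source, and container names, of authenticating a dedicated SQL pool to ADLS Gen2 with a managed identity:
-- requires an existing database master key
CREATE DATABASE SCOPED CREDENTIAL msi_cred
WITH IDENTITY = 'Managed Service Identity';

CREATE EXTERNAL DATA SOURCE CGAmitSource
WITH (
    TYPE = HADOOP,
    LOCATION = 'abfss://<container>@cgamit.dfs.core.windows.net',
    CREDENTIAL = msi_cred
);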
10) You are designing a database in an Azure Synapse Analytics dedicated SQL pool for transaction fraud. The solution must meet the following requirements:
- Users should not be able to access the actual food card numbers
- Users should be able to use food cards as a feature in the models
What should you suggest?
a) Row-level-security (RLS)
b) Azure Active-Directory Pass-Through authentication
c) Transparent Data Encryption (TDE)
d) Column-level security
Answer: d
Explanation:
RLS restricts access to rows, not to a specific column.
Azure AD pass-through authentication deals with sign-in, not data protection, and is not relevant here.
TDE encrypts the data at rest, but it is decrypted when queried, so users could still see the card numbers.
11) You need to suggest which format to store the data in Azure Data Lake Storage Gen2 to support the reports. The solution should minimize read times.
- Read two columns from a file which contains 69 columns:
a) Parquet
b) TSV
c) AVRO
- Query one record based on timestamp:
a) Parquet
b) TSV
c) AVRO
Answer: a, c
Explanation: Parquet is columnar, so reading 2 of 69 columns scans only those columns; Avro is row-based, which makes retrieving a single record by timestamp efficient.
12) As a data engineer, you need to aggregate data which originates in Kafka and is output to Azure Data Lake Storage Gen2. The testing team needs to implement the stream processing solution using Java.
Which service should you suggest to process the streaming data?
a) Azure Databricks
b) Azure Stream Analytics
c) Azure Sentinel
d) Azure Event Hub
Answer: a
Explanation:
Azure Sentinel is a security service and Azure Event Hubs only ingests streams; neither processes streaming data. Azure Stream Analytics does not support Java (it uses SQL with JavaScript UDFs), so Azure Databricks is the correct choice.
13) A production team needs a solution which can stream data to Azure Stream Analytics. The solution will have reference data as well as streaming data. Which TWO input types should you use for reference data?
a) Azure DocumentDB
b) Azure Blob Storage
c) Azure Event Hub
d) Azure SQL Database
Answer: b, d
Explanation:
Stream Analytics supports Azure Blob Storage and Azure SQL Database as reference data inputs.
Azure Event Hubs is a streaming input, not a reference data input, and DocumentDB is not supported for reference data.
14) You need to ensure that data in the Azure Synapse Analytics dedicated SQL pool is encrypted at rest. The solution should NOT modify applications which query the data. What should you implement?
a) Enable Transparent Data Encryption (TDE)
b) Upgrade to Premium P2 license
c) Create Azure functions
d) Use customer managed keys
Answer: a
Explanation:
Nothing in the question mentions licensing.
Azure Functions has nothing to do with encryption.
Customer-managed keys are configured at the workspace level and relate to double encryption; enabling TDE with the default service-managed key encrypts the data at rest without modifying applications.
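As a sketch, TDE can be enabled for a dedicated SQL pool with T-SQL run against the master database (the pool name is a placeholder):
ALTER DATABASE [MyDedicatedSqlPool] SET ENCRYPTION ON;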
15) As a data engineer, you need to suggest an Azure Databricks cluster configuration which can meet the following requirements:
- Minimize cost,
- Reduce query latency,
- Maximize the number of users that can execute queries on the cluster simultaneously
Which cluster type should you suggest?
a) High concurrency cluster with auto termination
b) High concurrency cluster with autoscaling
c) Standard cluster with auto termination
d) Standard cluster with autoscaling
Answer: b
Explanation:
A Standard cluster is intended for a single user and does not share resources efficiently across many concurrent users.
A high concurrency cluster is designed for many simultaneous users, and autoscaling keeps query latency low while minimizing cost; auto termination alone does not address concurrency or latency.
16) A company needs to trigger an Azure Data Factory pipeline as soon as a file arrives in an Azure Data Lake Storage Gen2 container. Which resource should you use?
a) Microsoft.EventGrid
b) Microsoft.EventHub
c) Microsoft.IoT
d) Microsoft.CosmosDB
Answer: a
Explanation:
The trigger is based on a file-arrival (storage) event, so IoT and Cosmos DB resources are irrelevant, and Event Hubs is meant for telemetry ingestion.
Storage event triggers in Data Factory are built on Event Grid, so the Microsoft.EventGrid resource provider is required.
17) As a data engineer, you need to make sure that you can audit access to Personally Identifiable Information (PII) while designing an Azure Synapse Analytics dedicated SQL pool. What should you include?
a) RLS
b) Column-level security
c) Security baseline
d) Sensitivity classifications
Answer: d
Explanation:
RLS restricts access to rows.
Column-level security restricts access to columns but does not record who accessed PII.
The security baseline only provides general security guidance. Sensitivity classifications label PII columns so that access to them can be audited.
18) You need to design a date dimension table in an Azure Synapse Analytics dedicated SQL pool. As per the business requirement, the date dimension table will be used by all fact tables. Which distribution type should you recommend to minimize data movement?
a) Hash
b) Asterisk
c) Replicate
d) Round robin
Answer: c
Explanation:
For FACT tables, hash distribution is typically used.
For small DIMENSION tables, REPLICATE avoids data movement because a full copy of the table is cached on every compute node.
For STAGING tables, ROUND ROBIN is used.
There is no ASTERISK distribution type in Azure.
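A minimal sketch (hypothetical columns) of a replicated date dimension that avoids data movement when joined to hash-distributed fact tables:
CREATE TABLE dbo.DimDate
(
    DateKey      INT  NOT NULL,
    CalendarDate DATE NOT NULL,
    MonthNumber  TINYINT NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED INDEX (DateKey)
);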
19) As a data engineer, you need to create a new notebook in Azure Databricks which will support Python as the primary language and should also support R and Scala. Which magic-command prefix should you use to switch between the different languages?
a) %
b) #
c) @{}
d) @[]
Answer: a
Explanation: Magic commands such as %scala, %r, and %sql at the top of a cell switch that cell's language.
20) A company has an Azure Synapse Analytics dedicated SQL pool which contains a huge fact table. The table contains 47 columns and 4.7 BN rows and is a heap. On average, queries against the table aggregate values from approximately 69 million rows and return only two columns. You notice that queries against the fact table are extremely slow. Which type of index should you add to provide the fastest query times?
a) Non-clustered column store
b) Clustered index
c) Semi-clustered index
d) Clustered column store
Answer: d
Explanation:
Non-clustered columnstore indexes are not supported in a dedicated SQL pool.
A clustered (rowstore) index generally performs best only on tables with fewer than about 60 million rows.
A semi-clustered index does not exist.
A clustered columnstore index is usually the best choice for large fact tables and typically gives the fastest query times for aggregations.
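As a sketch, a clustered columnstore index can be added to the existing heap (table and index names are placeholders):
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales
ON dbo.FactSales;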
21) An e-commerce company needs to make sure that an Azure Data Lake Storage Gen2 container is available for read workloads in a secondary region if an outage happens in the primary region. Which type of redundancy should you recommend so that your solution minimizes costs?
a) Geo-Zone-Redundant-Storage (G-ZRS)
b) Geo-Redundant-Storage (GRS)
c) Locally-Redundant-Storage (LRS)
d) Read-Access-Geo-Redundant-Storage (RA-GRS)
Answer: d
Explanation:
GZRS and GRS replicate data to a secondary region, but the secondary is not readable unless a failover occurs.
LRS provides redundancy within a single region only. RA-GRS provides read access to the secondary region at a lower cost than RA-GZRS.
22) As a data engineer, you need to configure an Azure Databricks workspace which is currently in the Standard pricing tier to support autoscaling all-purpose clusters. The solution should meet the following requirements:
- Reduce time taken to scale the number of workers while minimizing costs
- Automatically scale down workers when the cluster is underutilized for five minutes
What should be your first step?
a) Upgrade Azure Databricks workspace to Premium pricing tier
b) Create logic apps for the workspace
c) Enable a log analytics workspace
d) Create a storage account
Answer: a
23) A company uses Azure Stream Analytics to accept data from Azure Event Hubs and to output the data to an Azure Blob Storage account. As a data engineer, you need to output the count of records received from the last 7 minutes, every minute. Which window function should you use?
a) Sliding
b) Tumbling
c) Hopping
d) Snapshot
Answer: c
Explanation: A hopping window (size 7 minutes, hop 1 minute) produces overlapping windows, so the job reports the count of the last 7 minutes every minute. Tumbling windows do not overlap, and sliding and snapshot windows do not provide a fixed one-minute reporting cadence.
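A minimal Stream Analytics query sketch (input, output, and timestamp column names are assumptions):
SELECT COUNT(*) AS EventCount
INTO BlobOutput
FROM EventHubInput TIMESTAMP BY EventEnqueuedUtcTime
GROUP BY HoppingWindow(minute, 7, 1)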
24) An Azure Data Factory pipeline needs to meet the following requirements:
- Support backfilling existing data in the source table
- Automatically retry execution if the pipeline fails due to throttling limits or concurrency
Which type of trigger should you recommend?
a) Schedule
b) Tumbling window
c) Hopping
d) Snapshot
Answer: b
Explanation:
Hopping and snapshot are window types, not trigger types, so they do not apply.
A schedule trigger could work, but only the tumbling window trigger supports backfill scenarios and retry/concurrency policies.
25) As a data engineer, you need to design an analytical solution which will use Python functions for near real-time data from Azure Event Hubs. Which solution should you recommend to perform statistical analysis to minimize latency?
a) Azure Databricks
b) Azure Stream Analytics
c) Azure Sentinel
d) Azure Event Hub
Answer: a
Explanation:
Sentinel is a security (SIEM) service, and Event Hubs only ingests streams; neither runs analytical code.
Azure Stream Analytics does not support Python, so Azure Databricks is the right choice for low-latency statistical analysis.
26) You need to analyze Azure Data Factory pipeline failures from the last 69 days. What should you use?
a) Activity log blade
b) Resource health blade
c) Azure Storage Account
d) Azure Monitor
Answer: d
Explanation: Data Factory keeps pipeline run data for only 45 days, so to analyze failures across 69 days the diagnostic data must be sent to Azure Monitor.
27) You need to make sure that the data in an Azure Data Lake Storage Gen2 storage account will remain available if a data center fails in the primary Azure region. Which replication type should you use for the storage account to minimize costs?
a) Locally-Redundant-Storage (LRS)
b) Zone-Redundant-Storage (ZRS)
c) Geo-Redundant-Storage (GRS)
d) Geo-Zone-Redundant-Storage (GZRS)
Answer: b
Explanation:
LRS would not survive a data-center failure. GRS and GZRS would also work but do not minimize costs, so ZRS is the best option.
28) A company needs to design an Azure Data Factory Pipeline which will include mapping data flow. As per the business requirement, you need to transform JSON-formatted data into a tabular dataset.
Which transformation method should you use in the mapping flow so that the dataset only has one row for each item in the array?
a) Flatten
b) Broaden
c) Modify row
d) Pivot
Answer: a
Explanation:
Broaden and Modify row are not transformation types in ADF mapping data flows.
Pivot does not unroll arrays; Flatten takes array values and turns each item into its own row.
29) You need to use a streaming data solution which uses Azure Databricks. The solution should meet the following requirements with respect to output data which contains e-book sales details:
- E-book sales transactions won’t be updated. Only new rows will be added to adjust a sale.
- You are required to suggest an output mode for the dataset which will be processed by using Structured Streaming which reduces duplicate data.
What should you suggest?
a) Append
b) Complete
c) Change
d) Update
Answer: d
Explanation:
Append would emit every new row, including the adjustment rows, so it does not reduce duplicate data.
Complete replaces the entire result table on every trigger.
Change is not an output mode.
30) While monitoring an Azure Stream Analytics job, you notice that the backlogged input events count has been 17 for the last hour. What should you do to reduce the backlogged input events count?
a) Decrease streaming units for the job
b) Delete the job
c) Associate a storage account for the job
d) Increase streaming units for the job
Answer: d
Explanation:
Decreasing streaming units will make the backlog worse, and deleting the job is not a solution.
A storage account is not relevant; a persistent backlog means the job cannot keep up with the input rate, so it needs more streaming units.
31) As a data engineer, you need to design the folder structure for Azure Data Lake Storage Gen2. The data should be secured by ‘FocusArea’. Frequent queries will include data from the current year or current month.
Which folder structure should you suggest for minimal delay in queries and simplified folder security?
a) /FocusArea/{DataSource}/{DD}/{MM}/{YYYY}/{FileData}{YYYY}{MM}{DD}.xls
b) {DD}/{MM}/{YYYY}/FocusArea/{DataSource}/{FileData}/{YYYY}{MM}{DD}.xls
c) {YYYY}/{MM}/{DD}/FocusArea/{DataSource}/{FileData}/{YYYY}{MM}{DD}.xls
d) /FocusArea/{DataSource}/{YYYY}/{MM}/{DD}/{FileData}{YYYY}{MM}{DD}.xls
Answer: d
Explanation:
Because the data must be secured by FocusArea, options b and c are incorrect: their structures do not start with the FocusArea folder.
Option a's folder structure nests the date as day/month/year, which makes queries on the current year or month inefficient; option d's year/month/day order supports those queries and keeps folder security simple.
32) A company has a data lake which is accessible only via an Azure virtual network. You are building an SQL pool in Azure Synapse which will use data from the data lake and is planned to load data to the SQL pool every hour. You need to make sure that the SQL pool can load the data from the data lake. Which TWO actions should you perform?
a) Create a service principal
b) Create a managed identity
c) Add an Azure Active Directory Federation Services (ADFS) account
d) Configure managed identity as credentials for the data loading process
Answer: b, d
Explanation:
Whenever virtual networks are mentioned, managed identity is the best option.
33) You need to suggest a Stream Analytics data output format so that queries from Databricks and PolyBase against the files encounter fewer errors. The solution should make sure that the files can be queried quickly and that the data type information is kept intact. What should you suggest?
a) Parquet
b) TSV
c) JSON
d) AVRO
Answer: a
Explanation: Parquet stores the schema (data type information) in the file itself and is optimized for fast analytical queries; TSV and JSON do not preserve type metadata, and Avro keeps types but is row-based and slower to query.
34) You need to configure an Azure Databricks cluster to automatically connect to Azure Data Lake Storage Gen2 with the help of Azure AD Integration. How should you configure the cluster?
Advanced option to be enabled:
a) Premium
b) Standard
c) Azure Data Lake Storage Credential Pass-through
Tier:
a) Premium
b) Standard
c) Azure Data Lake Storage Credential Pass-through
Answer: c, a
Explanation: Azure Data Lake Storage credential passthrough is enabled under the cluster's advanced options, and Azure AD credential passthrough requires the Premium tier.
35) You are required to copy blob data from an Azure Storage account to a data warehouse with the help of Azure Data Factory. The solution should meet the following requirement:
- Make sure that the data remains in the US Central region at all times
Which type of integration runtime should you use?
a) Data sovereignty runtime
b) Azure-SSIS
c) Self-hosted
d) Azure Integration runtime
Answer: d
Explanation:
Option a is not an IR type.
Options b and c cannot guarantee where the data is processed; an Azure integration runtime can be pinned to a specific region (such as US Central), which meets the requirement.
36) You need to design an Azure Synapse solution which can provide a query interface for the data stored in an Azure Storage account which is only accessible from a virtual network. Which authentication mechanism should you recommend to ensure that the solution can access the source data?
a) Managed Identity
b) Bastion Host
c) Shared Access Signatures (SAS)
d) Azure Active Directory Authentication
Answer: a
Explanation: Managed Identity is required when your storage is attached to a virtual network.
37) A company has 7 Azure Data Factory pipelines. You need to label each pipeline with the primary purpose of either extract, transform, or load. The labels should be available for grouping and filtering when using monitoring experience in Data Factory. What should be added to each pipeline?
a) Caption
b) Subtitles
c) Annotation
d) Tags
Answer: c
Explanation: Captions and subtitles are not Data Factory concepts, and tags are key-value pairs applied to resources. Annotations are the labels that can be used for grouping and filtering in the Data Factory monitoring experience.
38) An e-commerce company has an Azure Data Factory component named CGA which contains a linked service. There is an Azure Key Vault which contains an encryption key named ‘TestKey’. What should be your first step to encrypt CGA using the encryption key ‘TestKey’?
a) Build a self-hosted integration runtime
b) Create a new key vault
c) Create a managed identity
d) Remove linked service from CGA
Answer: d
Explanation: A customer-managed key can only be configured on an empty data factory, so the linked service must be removed from CGA first.
39) You need to copy files and folders from storage accounts Storage7 to Storage8 using Data Factory copy activity. The solution should meet the following requirements:
- The original folder structure should be maintained
- No transformations should be performed
How should you set up the copy activity?
Dataset source type:
a) Binary
b) Avro
c) Preserve hierarchy
Copy activity copy behavior:
a) Binary
b) Avro
c) Preserve hierarchy
Answer: a,c
40) You want to prevent the development team users from seeing the full email addresses in the email column of a table in an Azure Synapse SQL pool. The users should instead see the values in the format ZZ@ZZZZ.com. Which TWO options can meet this requirement?
a) Set a mask on the email column from Azure Portal
b) Select mask row from Azure Portal
c) Set an email mask on the email column using SQL Server Management Studio
d) Create a key vault for the email column
Answer: a,c
Explanation: Option b is not a function available in the Azure portal. Option d relates to encryption keys and cannot be used for column-level masking.
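A sketch of setting the built-in email mask with T-SQL (table and column names are assumptions); the same mask can be configured from the Azure portal:
ALTER TABLE dbo.Customers
ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');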
41) An Azure Data Lake Storage Gen2 container contains TSV files. The file size ranges from 7 KB to 3 GB. What should you do to ensure that the files stored in the container are optimized for batch processing?
a) Delete the files
b) Merge the files
c) Compress the files
d) Convert files to Parquet
Answer: b
Explanation: For better performance in batch processing it is recommended to merge files into larger files (256 MB - 100 GB range)
42) A company has an Azure Synapse Analytics Apache Spark pool called TestPool. You need to load JSON files from an Azure Data Lake Storage Gen2 container into tables in TestPool. The structure and data types vary by file. What should you do so that the solution maintains the source data types?
a) Load data using PySpark
b) Load data using an OPENROWSET T-SQL command in the Synapse Analytics serverless SQL pool
c) Load data using Python
d) Load data using Sentinel
Answer: a
Explanation: Sentinel is a security (SIEM) service and not applicable here. Python alone is a programming language, not a compute engine; PySpark can infer the schema of each file and preserve the source data types even when the structure varies.
43) An e-commerce company has an Azure Databricks resource which needs to log actions that relate to changes in compute for the Databricks resource. Which Databricks service should you log?
a) RDP
b) CosmosDB
c) Clusters
d) Workspace
Answer: c
Explanation: The clusters log category records compute-related actions such as creating, resizing, starting, and terminating clusters.
44) You need to configure a batch dataset in Parquet format where data files will be generated using Azure Data Factory and stored in Azure Data Lake Storage Gen2. You are required to reduce storage costs for the files which will be consumed by an Azure Synapse analytics serverless SQL pool. What should be your first step?
a) Configure snappy compression for files
b) Store data as AVRO files
c) Create an external table
d) Use archive tier
Answer: a
Explanation: Snappy compression reduces the size of the Parquet files (and therefore storage cost) while they remain directly queryable by the serverless SQL pool; data in the archive tier cannot be queried.
45) A company has a partitioned table in an Azure Synapse Analytics dedicated SQL pool. You need to create queries to maximize the advantages of partition elimination. What should you include in your T-SQL queries?
a) WHERE
b) ORDER BY
c) SUM
d) AVG
Answer: a
Explanation: A WHERE clause that filters on the partition column allows the engine to skip entire partitions; ORDER BY, SUM, and AVG do not enable partition elimination.
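A sketch assuming the fact table is partitioned on a date column (all names are hypothetical):
SELECT SUM(SalesAmount)
FROM dbo.FactSales
WHERE TransactionDate >= '2024-01-01'
  AND TransactionDate <  '2024-02-01';  -- filtering on the partition column lets whole partitions be skipped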
46) A company is planning to migrate data from a SQL Server database to a star schema in a Synapse Analytics dedicated SQL pool. The database currently uses a third normal form (3NF) schema. You need to design dimension tables while optimizing read operations. What should be included in the solution?
Data transformation for dimension tables by:
a) Denormalize to 2NF
b) New Identity columns
c) Normalizing to fifth normal form
Primary key column in the dimension tables:
a) Denormalize to 2NF
b) New Identity columns
c) Normalizing to fifth normal form
Answer: a, b
47) A company uses Azure Event Hub to ingest data and Azure Stream Analytics cloud job to analyze the data for a real-time data analysis solution. Currently, the cloud job is configured to use 127 Streaming Units. Which TWO actions should you perform to optimize performance for Azure Stream Analytics jobs?
a) Decrease stream units
b) Partition data input using query parallelization
c) Implement computer vision
d) Partition data output using query parallelization
Answer: b, d
Explanation: Best in this scenario is to partition both input and output streams to the same number of partitions.
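A sketch of an embarrassingly parallel query where input and output share the same partition key (stream, column, and output names are assumptions):
SELECT PartitionId, COUNT(*) AS EventCount
INTO PartitionedOutput
FROM EventHubInput PARTITION BY PartitionId
GROUP BY PartitionId, TumblingWindow(minute, 1)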
48) An automobile company uses Azure IoT Hub to communicate with various IoT devices. What solution should you design so that the company is able to monitor the devices in real-time?
a) Data Factory virtual machine using Azure Portal
b) Data Factory virtual machine using CLI
c) Stream Analytics job using Azure Portal
d) Data Factory virtual machine using Powershell
Answer: c
Explanation: A Data Factory 'virtual machine' is not a real monitoring solution, regardless of whether it is managed through the portal, CLI, or PowerShell. For real-time monitoring of IoT Hub data, a Stream Analytics job is the better option.
49) You have created an external table named ExtTable in Azure Data Explorer. Now, a database user needs to run a KQL (Kusto Query Language) query on this external table. Which of the following functions should be used to refer to this table?
a) external_table()
b) access_table()
c) ext_table()
d) None of the above
Answer: a
Explanation: In KQL, external tables are referenced with the external_table() function, for example external_table("ExtTable").
50) Your company wants you to ingest data onto cloud data platforms in Azure. Which data processing framework will you use?
a) OLTP
b) ETL
c) ELT
Answer: c
Explanation: ELT is a typical process for ingesting data from an on-premises database into Azure cloud.
51) You have an Azure Synapse workspace named MyWorkspace that contains an Apache Spark database named mytestdb. You run the following command in an Azure Synapse Analytics Spark pool in MyWorkspace:
CREATE TABLE mytestdb.myParquetTable (
EmployeeId int,
EmployeeName string,
EmployeeStartDate date
) USING Parquet
You then use Spark to insert a row into mytestdb.myParquetTable. The row contains the following data:
- EmployeeName: Peter
- EmployeeId: 1001
- EmployeeStartDate: 28-July-2022
One minute later, you execute the following query from a serverless SQL pool in MyWorkspace:
SELECT EmployeeId FROM mytestdb.dbo.myParquetTable WHERE name = 'Peter';
What will be returned by the query?
a) 24
b) An error
c) Null
Answer: b
Explanation: The query references 'name' instead of 'EmployeeName', so an error is produced.
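A sketch of the corrected filter, referencing the column that actually exists:
SELECT EmployeeId
FROM mytestdb.dbo.myParquetTable
WHERE EmployeeName = 'Peter';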
52) In structured data, you define the data type at query time.
a) True
b) False
Answer: b
Explanation: Structured data uses schema-on-write, so types are defined when the schema is designed. Defining the data type at query time (schema-on-read) applies to unstructured or semi-structured data.
53) When you create a temporal table in Azure SQL Database, it automatically creates a history table in the same database to capture historical records. Which of the following statements is true about temporal tables and history tables (select all options that apply):
a) A temporal table must have 1 primary key
b) To create a temporal table, system versioning must be set to On
c) To create a temporal table, system versioning must be set to Off
d) It is mandatory to mention the name of the history table when you create the temporal table
e) If you don’t specify the name for the history table, the default naming convention is used for the history table
f) You can specify the table constraints for the history table
Answer: a, b, e
Explanation: A temporal table must have a primary key and SYSTEM_VERSIONING must be set to ON. Naming the history table is optional (a default name is generated if none is specified), and constraints cannot be defined on the history table.
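A minimal temporal table sketch for Azure SQL Database (table, column, and history table names are hypothetical):
CREATE TABLE dbo.Student
(
    StudentID   INT PRIMARY KEY,
    StudentName NVARCHAR(100) NOT NULL,
    ValidFrom   DATETIME2 GENERATED ALWAYS AS ROW START,
    ValidTo     DATETIME2 GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.StudentHistory));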
54) To create Data Factory instances, the user account that you use to sign into Azure must be a member of (select all options that apply):
a) Contributor
b) Owner Role
c) Administrator of the Azure subscription
d) Write
Answer: a,b,c
55) You need to design an application that can accept market information as an input. Using a machine-learning classification model, the application will classify the input data into two categories:
- Car models that sell more with buyers between 18 - 40 years
- Car models that sell more with buyers above 40
What would you recommend to train the model?
a) Power BI Models
b) Text Analytics API
c) Computer Vision API
d) Apache Spark MLlib
Answer: d
Explanation: MLlib is Apache Spark's machine-learning library and can be used to train the classification model; the other options are not model-training frameworks.
56) You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.
Solution: You use a session window that uses a timeout size of 10 seconds.
Does this meet the goal?
a) Yes
b) No
Answer: b
57) You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.
Solution: You use a sliding window, and you set the window size to 10 seconds.
Does this meet the goal?
a) Yes
b) No
Answer: b
58) You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.
Solution: You use a tumbling window, and you set the window size to 10 seconds.
Does this meet the goal?
a) Yes
b) No
Answer: a
Explanation: Tumbling windows are fixed-size, contiguous, and non-overlapping, so each tweet falls into exactly one window and is counted only once.
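A sketch of the tumbling-window count (stream and timestamp column names are assumptions):
SELECT COUNT(*) AS TweetCount
FROM TwitterStream TIMESTAMP BY CreatedAt
GROUP BY TumblingWindow(second, 10)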
59) What are the key components of Azure Data Factory? Select all that apply:
a) Database
b) Connection String
c) Pipelines
d) Activities
e) Datasets
f) Linked Services
g) Data Flows
h) Integration Runtimes
Answer: c, d, e, f, g, h
60) Which of the following are valid trigger types of Azure Data Factory? Select all that apply:
a) Monthly Trigger,
b) Scheduled Trigger,
c) Overlap Trigger,
d) Tumbling window trigger,
e) Event-based trigger
Answer: b, d, e
61) Duplicating customer content for redundancy and meeting service-level-agreements (SLAs) is Azure Maintainability.
a) Yes
b) No
Answer: b
Explanation: This describes Azure high availability, not maintainability.
62) You have an Azure Synapse Analytics dedicated SQL pool that contains a table named contacts. Contacts contains a column named Phone. You need to ensure that users in a specific role only see the last four digits of a phone number when querying the Phone column. What should you include in the solution?
a) Column encryption
b) Dynamic data masking
c) A default value
d) Table partitions
e) Row-level-security (RLS)
Answer: b
Explanation: Dynamic data masking is frequently used to mask credit card numbers, email addresses, and phone numbers; a partial mask can expose only the last four digits without changing the stored data.
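A hedged T-SQL sketch using a partial mask so that only the last four digits remain visible (table name and mask pattern are illustrative):
ALTER TABLE dbo.Contacts
ALTER COLUMN Phone ADD MASKED WITH (FUNCTION = 'partial(0,"XXX-XXX-",4)');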
63) A company has a data lake which is accessible only via an Azure virtual network. You are building an SQL pool in Azure Synapse which will use data from the data lake and is planned to load data into the SQL pool every hour. You need to make sure that the SQL pool can load the data from the data lake. Which TWO actions should you perform?
a) Create a service principal
b) Create a managed identity
c) Add an Azure Active Directory Federation Services (ADFS) account
d) Configure managed identity as credentials for the data loading process
Answer: b, d
64) Which role works with Azure Cognitive Services, Cognitive Search, and the Bot Framework?
a) A data engineer,
b) A data scientist
c) An AI engineer
Answer: c