DP-203 Practice Test | Azure Data Engineer Certification

Welcome to your DP-203 Practice Test. Please provide your Full Name and Email to get started.

1. To query two different Azure Cosmos DB for NoSQL containers by using an Azure Synapse serverless SQL pool, you must use the OPENROWSET function twice or more.
- True
- False

2. You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements. Which three Transact-SQL DDL commands should you run in sequence?
- CREATE DATABASE SCOPED CREDENTIAL
- CREATE EXTERNAL TABLE AS SELECT
- CREATE EXTERNAL DATA SOURCE
- CREATE EXTERNAL TABLE
- CREATE EXTERNAL FILE FORMAT

3. You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB. You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics. You need to prepare the files to ensure that the data copies quickly. Solution: You convert the files to compressed delimited text files.
- No
- Yes

4. In an Azure Synapse Studio notebook, cells can only contain code.
- False
- True

5. You are monitoring an Azure Stream Analytics job. The Backlogged Input Events count has been 20 for the last hour. You need to reduce the Backlogged Input Events count. What should you do?
- Drop late arriving events from the job.
- Increase the streaming units for the job.
- Stop the job.
- Add an Azure Storage account to the job.

6. Azure Advisor can identify when an Azure Synapse Analytics dedicated SQL pool is missing a table statistic.
- False
- True

7. You have an Azure subscription that contains a logical Microsoft SQL server named Server1.
Server1 hosts an Azure Synapse Analytics dedicated SQL pool named Pool1. You need to recommend a Transparent Data Encryption (TDE) solution for Server1 to maintain the access of client apps to Pool1 in the event of an Azure datacenter outage that affects the availability of the encryption keys. What should you include in the recommendation?
- Implement a client app by using the Microsoft .NET Framework data provider
- Enable Advanced Data Security on Server1
- Create and configure Azure key vaults in two Azure regions

8. You plan to create a retail store table that will contain the address of each retail store. The table will be approximately 2 MB. Queries for retail store sales will include the retail store addresses. You need to implement the surrogate key for the retail store table. What should you create?
- A user-defined SEQUENCE object
- A system-versioned temporal table
- A table that has a FOREIGN KEY constraint
- A table that has an IDENTITY property

9. Apache Spark pools in Azure Synapse Analytics can be configured to pause automatically after a predefined idle time.
- False
- True

10. You have an Azure Synapse Analytics dedicated SQL pool. You need to ensure that data in the pool is encrypted at rest. The solution must NOT require modifying applications that query the data. What should you do?
- Use a customer-managed key to enable double encryption for the Azure Synapse workspace.
- Enable encryption at rest for the Azure Data Lake Storage Gen2 account.
- Create an Azure key vault in the Azure subscription and grant access to the pool.
- Enable Transparent Data Encryption (TDE) for the pool.

11. Azure Advisor recommendations for Azure Synapse Analytics are updated once a week.
- False
- True

12. You have an Azure SQL database that has masked columns. You need to identify when a user attempts to infer data from the masked columns. What should you use?
- Custom masking rules
- Auditing
- Transparent Data Encryption (TDE)
- Azure Advanced Threat Protection (ATP)

13.
An Azure Synapse Studio notebook can load data from an Azure Cosmos DB for NoSQL database.
- False
- True

14. A company has a real-time data analysis solution that is hosted on Microsoft Azure. The solution uses Azure Event Hubs to ingest data and an Azure Stream Analytics cloud job to analyze the data. The cloud job is configured to use 120 Streaming Units (SU). You need to optimize performance for the Azure Stream Analytics job. Which two actions should you perform?
- Implement query parallelization by partitioning the data input
- Implement query parallelization by partitioning the data output
- Implement Azure Stream Analytics user-defined functions (UDF)
- Scale the SU count for the job up
- Implement event ordering
- Scale the SU count for the job down

15. You have a table in an Azure Synapse Analytics dedicated SQL pool. The table was created by using the following Transact-SQL statement. You need to alter the table to meet the following requirements: ensure that users can identify the current manager of employees; provide fast lookup of the manager's attributes, such as name and job title; and support creating an employee reporting hierarchy for your entire company. Which column should you add to the table?
- [ManagerEmployeeKey] [smallint] NULL
- [ManagerEmployeeKey] [int] NULL
- [ManagerName] [varchar](250) NULL
- [ManagerEmployeeID] [smallint] NULL

16. A company purchases IoT devices to monitor manufacturing machinery. The company uses an IoT appliance to communicate with the IoT devices. The company must be able to monitor the devices in real time. You need to design the solution. What should you recommend?
- Azure Data Factory instance using Azure PowerShell
- Azure Stream Analytics cloud job using the Azure portal
- Azure Data Factory instance using Microsoft Visual Studio
- Azure Analysis Services using Microsoft Visual Studio

17. You have an Azure Data Lake Storage Gen2 account named adls2 that is protected by a virtual network.
You are designing a SQL pool in Azure Synapse that will use adls2 as a source. What should you use to authenticate to adls2?
- Shared access signature
- Shared key
- Azure Active Directory (Azure AD) user
- Managed identity

18. You are designing an Azure Data Factory solution that will download up to 5 TB of data from several REST APIs. The solution must meet the following analysis requirements: 1. Ensure that the data can be loaded in parallel. 2. Ensure that users and applications can query the data without requiring an additional compute engine. What should you include in the solution to meet the analysis requirements?
- Azure Event Hubs
- Azure SQL Analytics
- Azure Synapse Analytics
- Azure Blob Storage

19. In T-SQL, a CROSS APPLY operator can be used with the OPENJSON function to expand JSON arrays stored in individual fields and join the fields to the parent rows.
- False
- True

20. In an Azure Synapse Analytics dedicated SQL pool, the service administrator account always uses a larger dynamic resource class.
- True
- False

21. You have a data model that you plan to implement in a data warehouse in Azure Synapse Analytics as shown in the following exhibit. All the dimension tables will be less than 2 GB after compression, and the fact table will be approximately 6 TB. Which type of table should you use for Fact_DailyBooking?
- Heap
- Round-robin
- Hash-distributed
- Replicated

22. Cells in an Azure Synapse Studio notebook can be reordered without copying and pasting the contents.
- False
- True

23. A company runs Microsoft SQL Server in an on-premises virtual machine (VM). You must migrate the database to Azure SQL Database. You synchronize users from Active Directory to Azure Active Directory (Azure AD). You need to configure Azure SQL Database to use an Azure AD user as administrator. What should you configure?
- For each Azure SQL Database server, set the Access Control to administrator
- For each Azure SQL Database server, set the Active Directory to administrator
- For each Azure SQL Database, set the Access Control to administrator
- For each Azure SQL Database server, set the Access Control to administrator

24. You are planning the deployment of Azure Data Lake Storage Gen2. You have a report that will access the data lake and query a single record based on a timestamp. Which format should you use to store the data in the data lake to support the report and minimize read times?
- CSV
- Avro
- TXT
- Parquet

25. You are designing an Azure Stream Analytics job to process incoming events from sensors in retail environments. You need to process the events to produce a running average of shopper counts during the previous 15 minutes, calculated at five-minute intervals. Which type of window should you use?
- Session window
- Sliding window
- Tumbling window
- Hopping window

26. You need to output files from Azure Data Factory. Which file format should you use for JSON with a timestamp?
- GZIP
- CSV
- Avro
- Parquet

27. You are designing a monitoring solution for a fleet of 500 vehicles. Each vehicle has a GPS tracking device that sends data to an Azure event hub once per minute. You have a CSV file in an Azure Data Lake Storage Gen2 container. The file maintains the expected geographical area in which each vehicle should be. You need to ensure that when a GPS position is outside the expected area, a message is added to another event hub for processing within 30 seconds. The solution must minimize cost. What should you include in the solution for Analysis Type?
- Polygon with overlap
- Lagged record comparison
- Event pattern matching
- Point within polygon

28. You plan to build a pipeline in Azure Data Factory that will contain two activities named Activity A and Activity B. You need to ensure that Activity A executes first, and Activity B executes second.
Activity B must always execute, even if Activity A fails. Which dependency condition should you choose for the dependency between Activity A and Activity B?
- Succeeded
- Failed
- Completed

29. You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns. FactPurchase will have 1 million rows of data added daily and will contain three years of data. Transact-SQL queries similar to the following query will be executed daily.

SELECT SupplierKey, StockItemKey, COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101 AND DateKey <= 20210131
GROUP BY SupplierKey, StockItemKey, IsOrderFinalized

Which table distribution will minimize query times?
- Replicated
- Hash-distributed on IsOrderFinalized
- Hash-distributed on PurchaseKey
- Round-robin
- Hash-distributed on DateKey

30. The following code segment is used to create an Azure Databricks cluster. Does the Databricks cluster support multiple concurrent users?
- Yes
- No

31. You need to design an Azure Synapse Analytics dedicated SQL pool that can return an employee record from a given point in time, maintains the latest employee information, and minimizes query complexity. How should you model the employee data?
- SQL graph table
- Temporal table
- Type 2 slowly changing dimension (SCD) table
- Type 1 slowly changing dimension (SCD) table

32. You have an Azure Storage account that generates 200,000 new files daily. The file names have a format of {YYYY}/{MM}/{DD}/{HH}/{CustomerID}.csv. You need to design an Azure Data Factory solution that will load new data from the storage account to an Azure Data Lake once hourly. The solution must minimize load times and costs. How should you configure the solution for Load Methodology?
- Load files as they arrive
- Full load
- Incremental load

33. Azure Synapse Studio notebooks support Jupyter magic commands for line magics and cell magics.
- True
- False

34. You are building an Azure Stream Analytics job to identify how much time a user spends interacting with a feature on a webpage. The job receives events based on user actions on the webpage. Each row of data represents an event. Each event has a type of either 'start' or 'end'. You need to calculate the duration between start and end events. How should you complete the query?
- SELECT [user], feature, DATEDIFF(second, TOPONE(Time) OVER (PARTITION BY [user], feature LIMIT DURATION(hour, 1) WHEN Event = 'start'), Time) AS duration FROM input TIMESTAMP BY Time WHERE Event = 'end'
- SELECT [user], feature, DATEDIFF(second, LAST(Time) OVER (PARTITION BY [user], feature LIMIT DURATION(hour, 1) WHEN Event = 'start'), Time) AS duration FROM input TIMESTAMP BY Time WHERE Event = 'end'
- SELECT [user], feature, DATEADD(second, LAST(Time) OVER (PARTITION BY [user], feature LIMIT DURATION(hour, 1) WHEN Event = 'start'), Time) AS duration FROM input TIMESTAMP BY Time WHERE Event = 'end'
- SELECT [user], feature, DATEPART(second, ISFIRST(Time) OVER (PARTITION BY [user], feature LIMIT DURATION(hour, 1) WHEN Event = 'start'), Time) AS duration FROM input TIMESTAMP BY Time WHERE Event = 'end'

35. A dedicated data load account can be assigned to a workload group by using a workload classification.
- False
- True

36. You have a data model that you plan to implement in a data warehouse in Azure Synapse Analytics as shown in the following exhibit. All the dimension tables will be less than 2 GB after compression, and the fact table will be approximately 6 TB. Which type of table should you use for Dim_Customer?
- Replicated
- Hash-distributed
- Heap
- Round-robin

37. Firewall rules in Azure Synapse allow you to create rules that specify IP ranges that are denied access to an Azure Synapse workspace.
- False
- True

38. Beneath a cell, you can see the step-by-step execution status of a cell for which execution is in progress.
- True
- False

39.
When using an Azure Synapse Analytics serverless SQL pool to query an Azure Cosmos DB for NoSQL container, all results are returned as strings.
- False
- True

40. An Azure Storage account provides two access keys.
- False
- True

41. The following code segment is used to create an Azure Databricks cluster. Does the Databricks cluster support the creation of a Delta Lake table?
- No
- Yes

42. You have an Azure Stream Analytics query. The query returns a result set that contains 10,000 distinct values for a column named clusterID. You monitor the Stream Analytics job and discover high latency. You need to reduce the latency. Which two actions should you perform?
- Convert the query to a reference query.
- Scale out the query by using PARTITION BY.
- Add a temporal analytic function.
- Increase the number of streaming units.

43. You are designing an app that will provide a data cleaning and supplementing service for customers. The app will use Azure Data Factory to run a daily process to read and write data from Azure Storage blob containers. You need to recommend an access mechanism for the customers to grant the app access to their data. The solution must provide access for a period of three months, restrict the app's access to specific containers, minimize administrative effort, and minimize changes to the existing access controls of the customers' Azure Storage accounts. What should you recommend?
- Anonymous public read access
- Shared access signature
- Managed identity
- Shared key

44. You are designing a real-time dashboard solution that will visualize streaming data from remote sensors that connect to the internet. The streaming data must be aggregated to show the average value of each 10-second interval. The data will be discarded after being displayed in the dashboard. The solution will use Azure Stream Analytics and must minimize latency from an Azure event hub to the dashboard, minimize the required storage, and minimize development effort.
What should you include in your stream solution as Input Type?
- Azure SQL
- Azure Cosmos DB
- Azure Event Hubs
- Azure Stream Analytics

45. You have a table named SalesFact in an enterprise data warehouse in Azure Synapse Analytics. SalesFact contains sales data from the past 36 months and has the following characteristics: it is partitioned by month, contains one billion rows, and has clustered columnstore indexes. At the beginning of each month, you need to remove data from SalesFact that is older than 36 months as quickly as possible. Which three actions should you perform in sequence in a stored procedure?
- Truncate the partition containing the stale data
- Copy the data to a new table by using CREATE TABLE AS SELECT
- Switch the partition containing the stale data from SalesFact to SalesFact_Work
- Create an empty table named SalesFact_Work that has the same schema as SalesFact
- Drop the SalesFact_Work table

46. The following code segment is used to create an Azure Databricks cluster. Does the Databricks cluster minimize costs when running scheduled jobs that execute notebooks?
- No
- Yes

47. In Azure Synapse Studio, you can use a T-SQL script to read data from an Azure Blob Storage container and insert the data into a table within a SQL pool.
- True
- False

48. You have an Azure Stream Analytics job that receives clickstream data from an Azure event hub. You need to define a query in the Stream Analytics job. The query must count the number of clicks within each 10-second window based on the country of a visitor, and ensure that each click is NOT counted more than once. How should you define the query?
- SELECT Country, Count(*) AS Count FROM ClickStream TIMESTAMP BY CreatedAt GROUP BY Country, SessionWindow(second, 5, 10)
- SELECT Country, Count(*) AS Count FROM ClickStream TIMESTAMP BY CreatedAt GROUP BY Country, TumblingWindow(second, 10)
- SELECT Country, Avg(*) AS Average FROM ClickStream TIMESTAMP BY CreatedAt GROUP BY Country, SlidingWindow(second, 10)
- SELECT Country, Avg(*) AS Average FROM ClickStream TIMESTAMP BY CreatedAt GROUP BY Country, HoppingWindow(second, 10, 2)

49. You are creating dimensions for a data warehouse in an Azure Synapse Analytics dedicated SQL pool. You create a table by using the Transact-SQL statement shown in the following exhibit. What is the slowly changing dimension type of the [DimProduct] table?
- Type 2
- Type 1
- Type 3
- Type 6

50. A company has a storage account named XYZstore2020. The company wants to ensure that a blob object can be recovered if it was deleted in the last 10 days. Which of the following should the company implement for this requirement?
- Soft delete
- Firewalls and virtual networks
- Access keys
- CORS

51. You have an Azure Storage account that generates 200,000 new files daily. The file names have a format of {YYYY}/{MM}/{DD}/{HH}/{CustomerID}.csv. You need to design an Azure Data Factory solution that will load new data from the storage account to an Azure Data Lake once hourly. The solution must minimize load times and costs. How should you configure the solution for Trigger?
- Tumbling window
- Fixed schedule
- New file
- Hopping window

52. You are designing a real-time dashboard solution that will visualize streaming data from remote sensors that connect to the internet. The streaming data must be aggregated to show the average value of each 10-second interval. The data will be discarded after being displayed in the dashboard. The solution will use Azure Stream Analytics and must minimize latency from an Azure event hub to the dashboard, minimize the required storage, and minimize development effort.
What should you include in your stream solution as Output Type?
- Azure SQL
- Azure Event Hubs
- Azure Stream Analytics
- Power BI

53. You have an Azure subscription that contains the following resources: 1. An Azure Active Directory (Azure AD) tenant that contains a security group named Group1. 2. An Azure Synapse Analytics SQL pool named Pool1. You need to control the access of Group1 to specific columns and rows in a table in Pool1. Which Transact-SQL command should you use to control access for rows?
- CREATE SECURITY POLICY
- CREATE PARTITION FUNCTION
- GRANT

54. You have an Azure Data Lake Storage Gen2 account named account1 that stores logs as shown in the following table. You do not expect that the logs will be accessed during the retention periods. You need to recommend a solution for account1 that meets the following requirements: automatically deletes the logs at the end of each retention period, and minimizes storage costs. What should you include in the recommendation to delete the logs automatically?
- Immutable Azure Blob storage time-based retention policies
- An Azure Data Factory pipeline
- Azure Blob storage lifecycle management rules

55. You have an Azure SQL database named Database1 and an Azure event hub named Hub1. Database1 is used to store Driver Name and License, and Hub1 is used to store Ride Route, Ride Distance, and Ride Duration. You need to implement Azure Stream Analytics to calculate the average fare per mile by driver. How should you configure the Stream Analytics input for Hub1?
- Stream
- Reference

56. You are designing a solution that will copy Parquet files stored in an Azure Blob storage account to an Azure Data Lake Storage Gen2 account. The data will be loaded daily to the data lake and will use a folder structure of {Year}/{Month}/{Day}/. You need to design a daily Azure Data Factory data load to minimize the data transfer between the two accounts. Which two configurations should you include in the design?
- Delete the source files after they are copied.
- Delete the files in the destination before loading new data.
- Filter by the last modified date of the source files.
- Specify a file naming pattern for the destination.

57. You are planning a solution to aggregate streaming data that originates in Apache Kafka and is output to Azure Data Lake Storage Gen2. The developers who will implement the stream processing solution use Java. Which service should you recommend using to process the streaming data?
- Azure Databricks
- Azure Event Hubs
- Azure Stream Analytics
- Azure Data Factory

58. You have a self-hosted integration runtime in Azure Data Factory. The current status of the integration runtime has the following configurations: Status: Running; Type: Self-Hosted; Version: 4.4.7292.1; Running / Registered Node(s): 1/1; High Availability Enabled: False; Linked Count: 0; Queue Length: 0; Average Queue Duration: 0.00s. The integration runtime has the following node details: Name: X-M; Status: Running; Version: 4.4.7292.1; Available Memory: 7697MB; CPU Utilization: 6%; Network (In/Out): 1.21KBps/0.83KBps; Concurrent Jobs (Running/Limit): 2/14; Role: Dispatcher/Worker; Credential Status: In Sync. If the X-M node becomes unavailable, all executed pipelines will:
- Switch to another integration runtime
- Fail until the node comes back online
- Exceed the CPU limit

59. You are designing a monitoring solution for a fleet of 500 vehicles. Each vehicle has a GPS tracking device that sends data to an Azure event hub once per minute. You have a CSV file in an Azure Data Lake Storage Gen2 container. The file maintains the expected geographical area in which each vehicle should be. You need to ensure that when a GPS position is outside the expected area, a message is added to another event hub for processing within 30 seconds. The solution must minimize cost. What should you include in the solution as Service?
- Azure Stream Analytics
- Azure Synapse Analytics Apache Spark pool
- Azure Data Factory
- Azure Synapse Analytics serverless SQL pool

60. You have an Azure SQL database named Database1 and an Azure event hub named Hub1. Database1 is used to store Driver Name and License, and Hub1 is used to store Ride Route, Ride Distance, and Ride Duration. You need to implement Azure Stream Analytics to calculate the average fare per mile by driver. How should you configure the Stream Analytics input for Database1?
- Stream
- Reference

61. Dynamic data masking can be used to obfuscate all but the last four digits of a credit card number when data is queried from a column in a table in an Azure Synapse Analytics dedicated SQL pool.
- True
- False

62. You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22). You need to implement masking for the Customer_ID field so that the first two prefix characters are exposed, the last four suffix characters are exposed, and all other characters are masked. Which solution should you use?
- Implement data masking and use the Custom Text function mask.
- Implement data masking and use the Credit card function mask.
- Implement data masking and use the Email function mask.
- Implement data masking and use the Password function mask.

63. You plan to ingest streaming social media data by using Azure Stream Analytics. The data will be stored in files in Azure Data Lake Storage, and then consumed by using Azure Databricks and PolyBase in Azure Synapse Analytics. You need to recommend a Stream Analytics data output format to ensure that the queries from Databricks and PolyBase against the files encounter the fewest possible errors. The solution must ensure that the files can be queried quickly and that the data type information is retained. What should you recommend?
- JSON
- Avro
- Parquet
- CSV

64. You are designing a monitoring solution for a fleet of 500 vehicles.
Each vehicle has a GPS tracking device that sends data to an Azure event hub once per minute. You have a CSV file in an Azure Data Lake Storage Gen2 container. The file maintains the expected geographical area in which each vehicle should be. You need to ensure that when a GPS position is outside the expected area, a message is added to another event hub for processing within 30 seconds. The solution must minimize cost. What should you include in the solution for Window Type?
- Tumbling
- No window
- Session
- Hopping

65. You have an Azure Data Lake Storage Gen2 account named account1 that stores logs as shown in the following table. You do not expect that the logs will be accessed during the retention periods. You need to recommend a solution for account1 that meets the following requirements: automatically deletes the logs at the end of each retention period, and minimizes storage costs. What should you include in the recommendation to minimize storage costs?
- Store the infrastructure logs and the application logs in the Cool access tier
- Store the infrastructure logs in the Cool access tier and the application logs in the Archive access tier
- Store the infrastructure logs and the application logs in the Archive access tier

66. You need to recommend a file format for a large, curated dataset that will be loaded to Azure Synapse Analytics. The dataset will be loaded on a scheduled basis. The recommendation must minimize the time required to import the data into a dedicated SQL pool. Which file format should you recommend?
- JSON
- XML
- Parquet

67. You are designing the folder structure for an Azure Data Lake Storage Gen2 container. Users will query data by using a variety of services including Azure Databricks and Azure Synapse Analytics serverless SQL pools. The data will be secured by subject area. Most queries will include data from the current year or current month. Which folder structure should you recommend to support fast queries and simplified folder security?
- /{DataSource}/{SubjectArea}/{YYYY}/{MM}/{DD}/{FileData}_{YYYY}_{MM}_{DD}.csv
- /{YYYY}/{MM}/{DD}/{SubjectArea}/{DataSource}/{FileData}_{YYYY}_{MM}_{DD}.csv
- /{SubjectArea}/{DataSource}/{DD}/{MM}/{YYYY}/{FileData}_{YYYY}_{MM}_{DD}.csv
- /{SubjectArea}/{DataSource}/{YYYY}/{MM}/{DD}/{FileData}_{YYYY}_{MM}_{DD}.csv

68. You have an Azure subscription that contains the following resources: 1. An Azure Active Directory (Azure AD) tenant that contains a security group named Group1. 2. An Azure Synapse Analytics SQL pool named Pool1. You need to control the access of Group1 to specific columns and rows in a table in Pool1. Which Transact-SQL command should you use to control access for columns?
- GRANT
- CREATE PARTITION FUNCTION
- CREATE SECURITY POLICY

69. The Until activity in Azure Data Factory first checks a condition, and then performs a task if the condition is true.
- False
- True

70. You are the data engineer for your company. An application uses a NoSQL database to store data. The database uses the key-value and wide-column NoSQL database types. Developers need to access data in the database by using an API. Which API should you use for the database model and type?
- Gremlin API
- MongoDB API
- Cassandra API
- Table API

71. You need to output files from Azure Data Factory. Which file format should you use for a columnar format?
- TXT
- CSV
- Parquet
- Avro

72. You are designing an enterprise data warehouse in Azure Synapse Analytics that will store website traffic analytics in a star schema. You plan to have a fact table for website visits. The table will be approximately 5 GB. Which distribution type should you recommend for the table?
- Heap
- Round-robin distribution
- Replicated
- Hash distribution

73. You have an Azure subscription that contains a logical Microsoft SQL server named Server1. Server1 hosts an Azure Synapse Analytics dedicated SQL pool named Pool1.
You need to recommend a Transparent Data Encryption (TDE) solution for Server1 to track the usage of encryption keys. What should you include in the recommendation?
- Always Encrypted
- Customer-managed keys
- Platform-managed keys

74. You are designing a real-time dashboard solution that will visualize streaming data from remote sensors that connect to the internet. The streaming data must be aggregated to show the average value of each 10-second interval. The data will be discarded after being displayed in the dashboard. The solution will use Azure Stream Analytics and must minimize latency from an Azure event hub to the dashboard, minimize the required storage, and minimize development effort. What should you include in your stream solution to run the aggregate query?
- Azure Event Hubs
- Power BI
- Azure SQL
- Azure Stream Analytics

75. You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day. You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times. What should you include in the solution?
- Include a watermark column.
- Partition by DateTime fields.
- Sink to Azure Queue storage.
- Use a JSON format for physical data storage.

76. You are developing a solution that will stream to Azure Stream Analytics. The solution will have both streaming data and reference data. Which input type should you use for the reference data?
- Azure Cosmos DB
- Azure IoT Hub
- Azure Blob storage
- Azure Event Hubs

77. You manage an enterprise data warehouse in Azure Synapse Analytics. Users report slow performance when they run commonly used queries. Users do not report performance changes for infrequently used queries. You need to monitor resource utilization to determine the source of the performance issues. Which two metrics should you monitor?
- Local tempdb percentage
- Cache hit percentage
- DWU percentage
- Cache used percentage
- Data IO percentage
- CPU percentage

78. You are planning the deployment of Azure Data Lake Storage Gen2. You have a report that will access the data lake and read three columns from a file that contains 50 columns. Which format should you use to store the data in the data lake to support the report and minimize read times?
- CSV
- Avro
- Parquet
- TXT

79. You have a self-hosted integration runtime in Azure Data Factory. The current status of the integration runtime has the following configurations: Status: Running; Type: Self-Hosted; Version: 4.4.7292.1; Running / Registered Node(s): 1/1; High Availability Enabled: False; Linked Count: 0; Queue Length: 0; Average Queue Duration: 0.00s. The integration runtime has the following node details: Name: X-M; Status: Running; Version: 4.4.7292.1; Available Memory: 7697MB; CPU Utilization: 6%; Network (In/Out): 1.21KBps/0.83KBps; Concurrent Jobs (Running/Limit): 2/14; Role: Dispatcher/Worker; Credential Status: In Sync. If the X-M node becomes unavailable, the number of concurrent jobs that can run will be:
- Lowered
- Raised
- Left as is

80. You are designing an Azure Data Factory solution that will download up to 5 TB of data from several REST APIs. The solution must meet the following staging requirements: 1. Ensure that the data can be landed quickly and in parallel in a staging area. 2. Minimize the need to return to the API sources to retrieve the data again should a later activity in the pipeline fail. What should you include in the solution to meet the staging requirements?
- Azure Synapse Analytics
- Azure Blob Storage
- Azure SQL Analytics
- Azure Event Hubs

81. You plan to query an Azure Cosmos DB for NoSQL container from an Azure Synapse Analytics serverless SQL pool by using T-SQL. The query uses the WITH clause to specify the path to property values that use the full fidelity schema. What should you include in the WITH clause?
- An explicit type conversion
- A column name and a data type
- An array

82. In T-SQL, if you use the OPENJSON function without explicitly defining the output schema, the result will be a table that has the following three columns: key, value, and type.
- True
- False

83. An Azure Data Factory pipeline can execute an Azure Databricks notebook.
- False
- True

84. A company has a SaaS solution that uses Azure SQL Database with elastic pools. The solution contains a dedicated database for each customer organization. Customer organizations have peak usage at different periods during the year. You need to implement the Azure SQL Database elastic pool to minimize cost. Which option or options should you configure?
- eDTUs and max data size
- Number of databases only
- eDTUs per database only
- CPU usage only

85. In dynamic data masking in Azure Synapse Analytics, the masking function defines the rules that specify which designated fields are masked and how they are masked.
- False
- True

86. You are designing a real-time stream processing solution in Azure Stream Analytics. The solution must read data from a blob container in an Azure Storage account via a service endpoint. You need to recommend an authentication mechanism for the solution. What should you recommend?
- Shared access signature
- A managed identity
- An account key
- A user-assigned managed identity

87. You have an Azure Active Directory (Azure AD) tenant that contains a security group named Group1. You have an Azure Synapse Analytics dedicated SQL pool named dw1 that contains a schema named schema1. You need to grant Group1 read-only permissions to all the tables and views in schema1. The solution must use the principle of least privilege. Which three actions should you perform in sequence?
- Create a database role named Role1 and grant Role1 SELECT permissions to schema1
- Create a database role named Role1 and grant Role1 SELECT permissions to dw1
- Assign the Azure role-based access control (Azure RBAC) Reader role for dw1 to Group1
- Create a database user in dw1 that represents Group1 and uses the FROM EXTERNAL PROVIDER clause
- Assign Role1 to the Group1 database user

88. When using the Data Flow Debug feature with mapping data flows, each debug session that is started from the Azure Data Factory user interface is considered a new session with its own Spark cluster.
- False
- True
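As a refresher for the OPENJSON questions above (questions 19 and 82), the behavior can be checked with a short T-SQL sketch. The table and column names (dbo.Orders, ItemsJson) are illustrative only, not taken from any question:

```sql
-- With no WITH clause, OPENJSON returns the default schema of three
-- columns: key, value, and type (question 82).
DECLARE @json nvarchar(max) = N'{"driver":"Ann","rides":[{"miles":3},{"miles":7}]}';

SELECT [key], [value], [type]
FROM OPENJSON(@json);

-- CROSS APPLY expands a JSON array stored in a column and joins each
-- array element back to its parent row (question 19).
SELECT o.OrderId, i.[value] AS Item
FROM dbo.Orders AS o
CROSS APPLY OPENJSON(o.ItemsJson) AS i;
```

The first query returns one row per top-level property; the second returns one row per array element per order, which is the expand-and-join pattern the questions describe.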