Monitoring Azure Synapse Analytics Workloads Using DMVs

Introduction

In this article we will look at Dynamic Management Views and how can we leverage them to monitor the workloads in an azure synapse analytics workload. We will learn this today with a practical use case and few examples focussing on synapse workload monitoring.

Dynamic Management Views

Dynamic Management View or simply called DMVs are nothing but inbuilt functions that can help you query and return the server state or database specific metadata information that you can use to monitor your workloads and the health of your synapse workloads. You can use these details for tuning the performance, finding out for any errors, diagnose the warnings and other issues in the server instance or a particular database.

Pre-requisites

You will need view database server state level of privilege for running the database scoped DMVs that we are going to use for this session.

GRANT VIEW DATABASE STATE TO <username>

You must have a dedicated SQL pool to execute the queries as you cannot do this in a built-in serverless pool.

Monitoring Connections

The login sessions to your synapse workload are logged to sys.dm_pdw_exec_sessions. The DMV contains the recent 10k login details, and it assigns the session_id sequentially for each new login session.

SELECT * FROM sys.dm_pdw_exec_sessions where status <> 'Closed' and session_id <> session_id();

Monitoring Query Executions

The recent 10k queries that are executed are logged into this DMV sys.dm_pdw_exec_requests. Querying this DMV for a selected session_id shows all queries that have been run for a login. The request_id column that is present in this DMV helps uniquely identify each query and acts as a primary key for this DMV. It is assigned sequentially for each new query and is prefixed with QID which stands for query ID.

Use the following step by step approach to investigate the query execution plan for a particular query.

Step 1

-- Monitor active queries
SELECT *
FROM sys.dm_pdw_exec_requests
WHERE status not in ('Completed','Failed','Cancelled')
  AND session_id <> session_id()
ORDER BY submit_time DESC;

-- Find top 10 queries longest running queries
SELECT TOP 10 *
FROM sys.dm_pdw_exec_requests
ORDER BY total_elapsed_time DESC;

From the above query take down the request ID of the query which you want to investigate. You can simply assign a comment called LABEL to your query which can be used to look up in the sys.dm_pdw_exec_requests DMV.

-- Query with Label
SELECT *
FROM sys.tables
OPTION (LABEL = 'newQuery')
;

-- Find a query with your label 'My Query'
-- Use brackets when querying the label column, as it's a key word
SELECT  *
FROM    sys.dm_pdw_exec_requests
WHERE   [label] = 'newQuery';

Step 2

Use the request ID you have taken from the above step and query it in request_steps DVMs as below to find the distributed SQL plan. It also helps to get information like row_count, estimated_rows, command, operation_type and distribution_type amount many other objects available for querying.

-- Find the distributed query plan steps for a specific query. Replace the request_id with value 
from Step 1.

SELECT * FROM sys.dm_pdw_request_steps
WHERE request_id = 'QID####'
ORDER BY step_index;

Monitoring Waiting Queries

Sometimes the query you trying to execute will not make a progress or will make very little waiting for resources, which happens most of the time in a prod environment. The below query shows the exact resources a query is waiting for and lets you decide the next course of action. Note that if your query is actively waiting on resources from other queries then it will ‘AcquireResources’ state and if it has all the required resources, then it will be in ‘Granted’ state.

-- Find queries
-- Replace request_id with value from Step 1.
SELECT waits.session_id,
      waits.request_id,
      requests.command,
      requests.status,
      requests.start_time,
      waits.type,
      waits.state,
      waits.object_type,
      waits.object_name
FROM   sys.dm_pdw_waits waits
   JOIN  sys.dm_pdw_exec_requests requests
   ON waits.request_id=requests.request_id
WHERE waits.request_id = 'QID####'
ORDER BY waits.object_name, waits.object_type, waits.state;

Summary

These are some of the DMVs that are helpful in monitoring the synapse workloads in live environment, this will be continued by part two in coming days which will have another important set of DMVs.

References

Concepts are referred from Microsoft official documentation.

All the queries used in this demo are sourced from official microsoft documentation.