Adaptive Joins In SQL Server

Monica Rathbun

SQL Server 2017 (compatibility level 140) brought us several Intelligent Query Processing (IQP) features, formerly known as Adaptive Query Processing, that improve performance on workloads straight out of the box with no code changes. One of those features is Adaptive Joins. With this feature the choice of join operator is deferred until runtime: the optimizer defines a threshold number of rows, and the plan then uses either a Nested Loop or a Hash Match join depending on how many rows actually arrive. This operator switch can produce a better join and improve the performance of your queries without you having to lift a finger.

Not all queries will qualify for this new feature. It only applies to SELECT statements that would normally have used a Nested Loop or Hash Match join; no other join types apply. In addition, the query must run in Batch mode, either by using a Columnstore Index in the query or by using the SQL Server 2019 Batch Mode on Rowstore feature. To find out more about the latter, see my recent blog post on Batch Mode on Rowstore.
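Since eligibility depends on the database compatibility level, it is worth checking that first. The following is a minimal sketch that assumes you are connected to the database in question and are allowed to change its settings; test a compatibility level change outside production before applying it.

```sql
-- Check the compatibility level of the current database.
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME();

-- Adaptive Joins require level 140 or higher.
-- Batch Mode on Rowstore (SQL Server 2019) requires level 150.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140;
```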

Now let us look at the difference between the two join operators the optimizer chooses from in this feature.

Hash Match

It builds an in-memory hash table on the join columns of one input (the build input), then hashes each row of the second input (the probe input) and looks for matches in that table. It can be very expensive and requires a lot of memory, but it copes well with large inputs.

Nested Loop

For each row of the outer input, it performs a search on the inner input. It works best when the outer input is small and the inner input is indexed on the join column. Less expensive than a Hash Match and ideal for small row inputs, it is usually the fastest join operator in that case, requiring the least I/O and the fewest row comparisons.
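To get a feel for how the two operators behave on your own data, you can force each one with a query hint and compare the I/O and timing output. This is only a sketch: dbo.Orders, dbo.OrderLines, and their columns are hypothetical names, and the hints are meant for experimentation rather than production code.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Force a Hash Match join for every join in the query.
SELECT o.OrderID, ol.Quantity
FROM dbo.Orders AS o          -- hypothetical table
JOIN dbo.OrderLines AS ol     -- hypothetical table
    ON ol.OrderID = o.OrderID
OPTION (HASH JOIN);

-- Force a Nested Loop join for the same query.
SELECT o.OrderID, ol.Quantity
FROM dbo.Orders AS o
JOIN dbo.OrderLines AS ol
    ON ol.OrderID = o.OrderID
OPTION (LOOP JOIN);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```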

For this feature a new operator was introduced to show us that an Adaptive Join was used, and the properties give us details on how it determined which join to use.
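If you want to find cached plans that already contain the new operator, you can search the plan cache XML. Treat the query below as a rough sketch: it assumes the showplan XML marks the operator with PhysicalOp="Adaptive Join" and exposes AdaptiveThresholdRows and EstimatedJoinType as attributes of the same RelOp element, so check those names against a plan from your own server. Scanning the plan cache can be slow on a busy instance.

```sql
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
    st.text AS query_text,
    -- Attribute names are assumptions based on the showplan XML format.
    x.n.value('@AdaptiveThresholdRows', 'float')     AS adaptive_threshold_rows,
    x.n.value('@EstimatedJoinType', 'nvarchar(60)')  AS estimated_join_type,
    qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY qp.query_plan.nodes('//RelOp[@PhysicalOp="Adaptive Join"]') AS x(n);
```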

At runtime, if the row count is smaller than the Adaptive Threshold Rows value, a Nested Loop will be chosen. If it is larger than the threshold, a Hash Match will be chosen. It is that simple. This can be great for workloads that fluctuate between small and large row inputs for the same query. In my example plan, the estimates called for a Hash Match, but during actual execution the join dynamically changed to a Nested Loop.
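One way to see the switch for yourself is a parameterized query over a table with a columnstore index. The sketch below is illustrative only: dbo.Orders, dbo.OrderLines, their columns, and the index and procedure names are all hypothetical, and whether the optimizer actually produces an Adaptive Join depends on your data, indexes, and compatibility level.

```sql
-- Hypothetical schema: dbo.Orders(OrderID, CustomerID), dbo.OrderLines(OrderID, Quantity).
-- A columnstore index makes the query eligible for batch mode on SQL Server 2017.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
    ON dbo.Orders (OrderID, CustomerID);

-- A rowstore index on the join column gives the Nested Loop branch something to seek into.
CREATE NONCLUSTERED INDEX IX_OrderLines_OrderID
    ON dbo.OrderLines (OrderID) INCLUDE (Quantity);
GO

CREATE OR ALTER PROCEDURE dbo.GetOrderQuantities
    @CustomerID int
AS
BEGIN
    SELECT o.OrderID, ol.Quantity
    FROM dbo.Orders AS o
    JOIN dbo.OrderLines AS ol
        ON ol.OrderID = o.OrderID
    WHERE o.CustomerID = @CustomerID;
END;
GO

-- Run both calls with the actual execution plan enabled and compare
-- the Adaptive Join properties between the two executions.
EXEC dbo.GetOrderQuantities @CustomerID = 1;   -- assumed to match few rows
EXEC dbo.GetOrderQuantities @CustomerID = 42;  -- assumed to match many rows
```

If an Adaptive Join appears, its properties should show the estimated join type, the actual join type, and the threshold, which makes it easy to see which side of the tipping point each execution landed on.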

The Adaptive Threshold Rows value is determined by operator cost. The optimizer evaluates the cost of each join operator as a function of the number of input rows. The point where those two cost curves intersect (the row count tipping point) is what it uses as the threshold. Microsoft Docs gives us a good image of this.

As with most SQL Server features, you can turn this one off if you find it is not providing any performance gains or is causing query regressions in your environments.

-- SQL Server 2017
ALTER DATABASE SCOPED CONFIGURATION SET DISABLE_BATCH_MODE_ADAPTIVE_JOINS = ON;
-- Azure SQL Database, SQL Server 2019 and higher
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ADAPTIVE_JOINS = OFF;
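
If only one query regresses, a narrower option is to disable the feature for that statement alone with a USE HINT query hint, and you can confirm the database-level setting through sys.database_scoped_configurations. In this sketch the table and column names are hypothetical, and the exact configuration name returned by the first query may differ by version, as the two ALTER statements above suggest.

```sql
-- Check the current database scoped setting.
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name LIKE '%ADAPTIVE_JOINS%';

-- Disable Adaptive Joins for a single query only.
SELECT o.OrderID, ol.Quantity
FROM dbo.Orders AS o          -- hypothetical table
JOIN dbo.OrderLines AS ol     -- hypothetical table
    ON ol.OrderID = o.OrderID
OPTION (USE HINT ('DISABLE_BATCH_MODE_ADAPTIVE_JOINS'));
```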

Intelligent Query Processing gives us many new "auto" fixes for our queries. Adaptive Joins is one that has piqued my interest as a database administrator who loves performance tuning. Improper join choices made by the optimizer can really hinder performance, and fixing them would likely require me to implement query hints or plan guides. I really like that SQL Server is now making intelligent decisions automatically and correcting these choices on the fly without my intervention.