SQL Server 2019 Intelligent Query Processing: Approximate QP with Approx_Count_Distinct

In the previous post, we learned about the Scalar UDF Inlining feature. In this post, let’s focus on Approximate QP with Approx_Count_Distinct, introduced as part of the SQL Server 2019 Intelligent Query Processing features. Like the COUNT(DISTINCT Column) function, it returns the number of distinct values in a column, but the count it returns is approximate and it takes less CPU and memory to process the query.

First, let’s see how the regular COUNT(DISTINCT Column) works and check the count it returns. For the example below, we are using the AdventureWorksDW2016 database.

SET STATISTICS TIME ON
select count(distinct LastName) from [dbo].[DimEmployee]

Now, run the same query using Approx_Count_Distinct and check the distinct count.

SET STATISTICS TIME ON
select APPROX_COUNT_DISTINCT(lastname) from [dbo].[DimEmployee]

We see the distinct count is lower by one compared with the query using COUNT(DISTINCT Column). Compare the execution plans for both queries by checking the SELECT operator properties. When you compare the compile CPU, memory, and time across both plans, the query using Approx_Count_Distinct has lower values. Its performance is better than that of the query using COUNT(DISTINCT Column).

Let’s check another example, this time using the AdventureWorks2016 database.

SET STATISTICS TIME ON
select count(distinct SalesOrderID) from [Sales].[SalesOrderDetailEnlarged]

This time, run the equivalent query using Approx_Count_Distinct and check the resulting count.
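
Mirroring the first example, the query would look like this:

SET STATISTICS TIME ON
select APPROX_COUNT_DISTINCT(SalesOrderID) from [Sales].[SalesOrderDetailEnlarged]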

When you compare the results, there is a little variation: the query using Approx_Count_Distinct produced 27997 more distinct values than the query using COUNT(DISTINCT Column). The feature produces an approximate distinct count while consuming fewer resources. Let’s check the resource utilization by comparing the actual execution plans of both queries side by side. The bottom execution plan, generated by COUNT(DISTINCT Column), took 1.084s to perform the index scan, while the top plan, generated by Approx_Count_Distinct, took only 0.374s. Also check the SELECT operator properties to see the difference in compile CPU, memory, and compile time between the two plans; this time the Approx_Count_Distinct query shows higher compile values than the query using COUNT(DISTINCT Column).

The approximate value is not always higher, either. There may be cases where the query using Approx_Count_Distinct returns fewer distinct values than the query using COUNT(DISTINCT Column). Let’s see this with a different example, running COUNT(DISTINCT Column) first.

SET STATISTICS TIME ON
select count(distinct CreditCardID) from [Sales].[SalesOrderHeaderEnlarged]

Run the same query using Approx_Count_Distinct:

SET STATISTICS TIME ON
select APPROX_COUNT_DISTINCT(CreditCardID) from [Sales].[SalesOrderHeaderEnlarged]

This time we got a lower distinct count using Approx_Count_Distinct than with the previously run query using COUNT(DISTINCT Column). The CreditCardID column is nullable and contains NULL values; both functions count only non-null values. When you compare the SELECT operator properties in the actual execution plans, this time the compile CPU, memory, and time are lower for the query using Approx_Count_Distinct than for the plan using COUNT(DISTINCT Column).
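
To see the NULL handling side by side, a single query can return the exact count, the approximate count, and the number of NULL rows in one pass (a sketch; both functions ignore NULLs):

select count(distinct CreditCardID) as exact_distinct,
       APPROX_COUNT_DISTINCT(CreditCardID) as approx_distinct,
       sum(case when CreditCardID is null then 1 else 0 end) as null_rows
from [Sales].[SalesOrderHeaderEnlarged]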

Summary:

By using the Approximate QP feature with Approx_Count_Distinct, we can get the approximate count of distinct non-null values in SQL Server with better performance than COUNT(DISTINCT Column). In this blog post, we have seen scenarios where the new feature returns both fewer and more distinct values than the exact count. Of the three examples in this post, two sample queries got a performance boost when run with Approx_Count_Distinct, showing lower compile CPU, memory, and time.

Thanks for Reading!

SQL Server 2019 Intelligent Query Processing: Scalar UDF Inlining

In the previous post, we learned about the Batch Mode on Row Store feature. In this post, let’s focus on the Scalar UDF Inlining feature introduced as part of the SQL Server 2019 Intelligent Query Processing features. This feature is really useful for improving the performance of queries that invoke scalar user-defined functions (UDFs), without having to modify any code.


A scalar user-defined function returns a single value, taking one or more parameters. For example, if a calculation is used in many of your queries, instead of repeating the code in each query, you can place it inside a function and call that function from every query that needs the same calculation. These functions simplify the code. Scalar functions work well for a small set of rows but can hurt performance with large amounts of data.
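
As a minimal sketch (dbo.ufnSquare is a hypothetical function used only for illustration):

CREATE OR ALTER FUNCTION dbo.ufnSquare (@Number INT)
RETURNS INT
AS
BEGIN
    -- Takes one parameter and returns a single computed value
    RETURN @Number * @Number;
END;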

First, let’s run a sample query on the AdventureWorks2016 database, first without and then with a scalar UDF in it, and look at the I/O information.

SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT
Product.Name AS PName,
Product.ProductNumber AS PNumber,
Product.StandardCost AS StandardPrice,
History.EndDate as Enddate
FROM  Production.Product Product
INNER JOIN Production.ProductListPriceHistory History
ON Product.ProductID = History.ProductID

Look at the number of logical reads for both tables, ProductListPriceHistory and Product, and the total elapsed time.

Let’s run the same code, this time adding a scalar UDF call to it, and look at the information again.

SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT
Product.Name AS PName,
Product.ProductNumber AS PNumber,
Product.StandardCost AS StandardPrice,
History.EndDate as Enddate,
dbo.ufnGetProductListPrice(Product.ProductID,History.StartDate) as Listprice
FROM  Production.Product Product
INNER JOIN Production.ProductListPriceHistory History
ON Product.ProductID = History.ProductID

When you observe the logical reads for the ProductListPriceHistory and Product tables and the total elapsed time, the elapsed time is higher this time (76ms).

The logical reads are the same; the difference here is the elapsed time. It’s a small difference for my 395 rows, but it can grow significantly when you are pulling a large number of rows. We cannot get accurate I/O statistics for scalar UDFs, and the optimizer cannot estimate their execution cost accurately either. When you check the execution plan of the query we executed with the UDF in it, the cost shows as 0% for the Compute Scalar operator, but when you check the SELECT operator properties, the time spent in the UDF (UdfElapsedTime) is almost 70%. Because the optimizer does not estimate this cost properly, queries using scalar UDFs run into performance issues.

Scalar UDFs also force the query to run serially, never in parallel. Even if the scalar UDF in your query doesn’t actually touch any of your data (for example, a UDF that just calls GETDATE()), it still hurts performance because the query runs serially. In the execution plan above, the Compute Scalar operator invokes the UDF 395 times, because a UDF runs once per row. The UDF’s own query plan is not shown in the actual plan, as it could generate a different plan for each execution, which could be 395 plans in this case.

Scalar UDF Inlining

The Scalar UDF Inlining feature converts the UDF body into a simple subquery that is placed inline into the calling query with an APPLY operator. All of this happens automatically during query compilation, so we do not have to change any code. To enable the feature on SQL Server 2019, set the database compatibility level to 150, run the same query with the UDF, and check the elapsed time again. It is now 60ms.

ALTER DATABASE AdventureWorks2016 SET COMPATIBILITY_LEVEL = 150;

SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT
Product.Name AS PName,
Product.ProductNumber AS PNumber,
Product.StandardCost AS StandardPrice,
History.EndDate as Enddate,
dbo.ufnGetProductListPrice(Product.ProductID,History.StartDate) as Listprice
FROM  Production.Product Product
INNER JOIN Production.ProductListPriceHistory History
ON Product.ProductID = History.ProductID
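
Under the hood, the optimizer replaces the UDF call with an inline relational expression. Assuming ufnGetProductListPrice simply looks up the list price for a given product and start date (a simplification; the actual AdventureWorks function body differs), a hand-written equivalent of the inlined query might look like this:

SELECT
Product.Name AS PName,
Product.ProductNumber AS PNumber,
Product.StandardCost AS StandardPrice,
History.EndDate as Enddate,
lp.ListPrice as Listprice
FROM  Production.Product Product
INNER JOIN Production.ProductListPriceHistory History
ON Product.ProductID = History.ProductID
OUTER APPLY (
    -- Correlated lookup standing in for the scalar UDF body
    SELECT plph.ListPrice
    FROM Production.ProductListPriceHistory plph
    WHERE plph.ProductID = Product.ProductID
      AND plph.StartDate = History.StartDate
) lp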

Now when you check the actual execution plan, you can no longer see the UDF being invoked, thanks to the inlining feature. You will also not find UdfCpuTime or UdfElapsedTime in the SELECT operator properties, since the UDF operator is never invoked.

We can enable or disable scalar UDF inlining at the database level, the function level, and the query level.

Database level:

ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF; -- or ON

Query Level using a HINT:

OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));

Function Level:

CREATE OR ALTER FUNCTION UDF(@Parameter1 [int])
RETURNS [int]
WITH INLINE = OFF -- or ON
AS
BEGIN
    -- Code here; a scalar UDF body must end with a RETURN
    RETURN @Parameter1;
END

For this new Scalar UDF Inlining feature to kick in, there are a few requirements the scalar UDF needs to meet. You can check whether a function is inlineable by querying sys.sql_modules and looking at the is_inlineable column.
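
A quick sketch of that check, listing each scalar function and whether it qualifies:

SELECT OBJECT_NAME(m.object_id) AS function_name,
       m.is_inlineable
FROM sys.sql_modules m
JOIN sys.objects o ON o.object_id = m.object_id
WHERE o.type = 'FN'; -- scalar user-defined functions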

Conclusion:

The Scalar UDF Inlining feature was introduced to improve the performance of queries where scalar UDF execution is the bottleneck. The feature transforms the UDF into scalar expressions or scalar subqueries that are placed inline in the calling query, replacing the UDF operator. These expressions and subqueries are then optimized by the query optimizer. As a result, the UDF operator no longer appears in the execution plan, which is what provides the performance boost.

Thank you for Reading!

SQL Server 2019 Intelligent Query Processing: Batch Mode on Row Store

In the previous post, we learned about Table variable deferred compilation. In this blog, let’s focus on the Batch Mode on Row Store feature introduced in SQL Server 2019. It improves the performance of analytical queries using batch mode query processing, optimizing CPU use so these queries run faster. We do not have to enable anything explicitly if the database compatibility level is 150.

This feature targets CPU-bound analytical workloads without requiring columnstore indexes. We can also use query hints to explicitly request or suppress batch mode.

Columnstore indexes are used for data warehousing and analytical purposes, improving I/O and optimizing CPU through batch mode execution. Columnstore indexes are not well suited to OLTP systems, as they add overhead to delete and update operations. The Batch Mode on Row Store feature helps solve this problem for analytical workloads on OLTP systems.

Row mode execution processes rows one at a time, whereas batch mode processes rows in batches of roughly 900 rows. This is a huge benefit, as batch mode uses the CPU much more efficiently.

First, let’s see with an example how queries behave when the batch mode on rowstore option is disabled. Turn the feature OFF:

USE AdventureWorks2016Big
GO
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;

Then run the query below and check the actual execution plan:

SELECT ModifiedDate, CarrierTrackingNumber,
       SUM(OrderQty * UnitPrice)
FROM Sales.SalesOrderDetailEnlarged
GROUP BY ModifiedDate, CarrierTrackingNumber

When you hover over the Clustered Index Scan in the query plan, the Actual Execution Mode shows ‘Row’, and it processed 9341409 rows.

Now, enable batch mode using a hint in the same query and run it again. You can also turn the database-level option back ON (we turned it OFF for the previous example); the statement for that is shown after the hinted query below.

SELECT ModifiedDate, CarrierTrackingNumber,
       SUM(OrderQty * UnitPrice)
FROM Sales.SalesOrderDetailEnlarged
GROUP BY ModifiedDate, CarrierTrackingNumber
OPTION(USE HINT('ALLOW_BATCH_MODE'))
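
To re-enable the feature database-wide instead of hinting each query, flip the earlier setting back:

ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = ON;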

There are some limitations that prevent batch mode processing: in-memory OLTP tables, indexes without B-tree structures, heaps, LOB columns, and XML and sparse columns. The Batch Mode on Row Store feature can significantly improve analytical workloads whose queries contain GROUP BY, aggregation, or sort operations. Queries that return only a few rows, or that do no aggregation, will not see much performance gain from this feature.

Summary:

In this post, we learned about the Batch Mode on Row Store feature of SQL Server 2019. This feature helps analytical workloads and uses the CPU effectively. If your databases are at compatibility level 150, you are already taking advantage of it.

In the next blog post, we will learn about the next Intelligent Query Processing feature: T-SQL Scalar UDF Inlining.

Thanks for Reading!

SQL Server 2019 Intelligent Query Processing: Table variable deferred compilation

In our last post, we covered one of the Intelligent Query Processing features, Memory Grant Feedback. In this post, let’s focus on Table variable deferred compilation.

In SQL Server 2017 and earlier versions, SQL Server always estimates one row for a table variable, because the table variable data is inserted at run time and the optimizer doesn’t know how many rows to expect from it. Due to this bad estimate, query performance suffers.

Let’s see this behavior with an example:

Place the database in 140 compatibility mode (SQL Server 2017) and run the query below, which uses a table variable, with the actual execution plan enabled.
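
Switching the compatibility level is a one-liner (assuming the AdventureWorks2016 sample database used earlier; substitute your own database name):

ALTER DATABASE AdventureWorks2016 SET COMPATIBILITY_LEVEL = 140;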

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
DECLARE @IDinfo TABLE
([ID] INT, 
 [Person_id] [int]
);
INSERT INTO @IDinfo
       SELECT [ID], 
            [Person_id]  
       FROM test;
SELECT top 100 *
FROM @IDinfo F1
     JOIN test F2 ON F1.[ID] = F2.[ID];

Check the execution plan: the estimated number of rows shows 1, but the actual number of rows is 100.

Starting with SQL Server 2012 SP2, trace flag 2453 is available; with it on, a statement that uses a table variable is recompiled when the number of rows changes. Let’s run the same query with the trace flag turned on. It can be enabled for the current session as shown below, per query (see the sketch after the results), or server-wide with DBCC TRACEON(2453,-1).

DBCC TRACEON(2453);
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
DECLARE @IDinfo TABLE
([ID] INT, 
 [Person_id] [int]
);
INSERT INTO @IDinfo
       SELECT [ID], 
            [Person_id]  
       FROM test;
SELECT top 100 *
FROM @IDinfo F1
     JOIN test F2 ON F1.[ID] = F2.[ID];

Check the execution plan: the estimated number of rows now shows 100 and the actual number of rows is 100, a perfectly accurate estimate.
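
For the per-query option mentioned above, the usual mechanism is the QUERYTRACEON hint on the earlier SELECT (a sketch; this hint requires sysadmin-level permissions):

SELECT top 100 *
FROM @IDinfo F1
     JOIN test F2 ON F1.[ID] = F2.[ID]
OPTION (QUERYTRACEON 2453);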

We can also force a recompile by adding OPTION(RECOMPILE) to the statement, which recompiles it at every execution.
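
Applied to the same SELECT, that looks like this:

SELECT top 100 *
FROM @IDinfo F1
     JOIN test F2 ON F1.[ID] = F2.[ID]
OPTION (RECOMPILE);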

SQL Server 2019 Feature: Table variable deferred compilation

As we saw in the previous examples, SQL Server estimates one row for table variables in SQL Server 2017 and earlier versions. In SQL Server 2019, a new feature called Table variable deferred compilation defers compilation of a statement that references a table variable until the first actual execution of the statement. This lets the optimizer estimate an accurate number of rows, which helps improve query performance. The feature is available for SQL Server 2019 databases with compatibility level 150.

Change the database compatibility level to 150, run the same query one more time, and check the execution plan.
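
Again assuming the AdventureWorks2016 sample database:

ALTER DATABASE AdventureWorks2016 SET COMPATIBILITY_LEVEL = 150;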

SQL Server will be able to estimate the number of rows perfectly.

Summary: With the new Table variable deferred compilation feature in SQL Server 2019, we no longer need trace flag 2453 or OPTION(RECOMPILE) to get statements that use table variables recompiled. If you have queries with table variables that are regularly populated with roughly the same number of rows, this feature gives better estimates without us changing our code. However, if the number of rows that populate the table variable varies greatly, and the queries don’t suffer from high compile times, then combining this new feature with OPTION(RECOMPILE) is helpful.

Thanks for reading!

SQL Server 2019 Intelligent Query Processing: Memory Grant Feedback

In our previous posts on Memory Grant Internals, we learned all about memory grant internals: where grants come from and how SQL Server manages memory for them.

In this series, we will learn all about SQL Server 2019 Intelligent Query processing.

Intelligent Query Processing is a set of features that improve query performance with very little implementation effort. The flow chart below shows all the Intelligent Query Processing features released in SQL Server 2017 and SQL Server 2019. In this series of posts, let’s learn about each Intelligent Query Processing feature released in SQL Server 2019.

Source: Microsoft

In this part, let us focus on the Memory Grant Feedback feature released in SQL Server 2017 for Batch mode execution and in SQL Server 2019 for Row mode execution.

SQL Server uses memory grants for hash and sort operations. The optimizer uses statistics to allocate the memory a query needs before the query executes. During execution, SQL Server uses the allocated memory to process the hashes and sorts. If the grant is not enough to process the query, the query spills to disk in tempdb. If too much memory is allocated based on the estimates, concurrency can suffer, since every other query needs memory grants to run as well. Bad estimates skew the memory grants allocated to queries, and either too much or too little memory is bad.
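
You can watch grants in flight through the sys.dm_exec_query_memory_grants DMV; a minimal sketch:

SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb
FROM sys.dm_exec_query_memory_grants;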

Now, how do we solve this issue?

By making sure we keep statistics up to date. Even so, we may see queries spill to disk because the memory grant is lower than needed, and on SQL Server 2016 and earlier there is really no way to stop those spills. If you are on SQL Server 2019, Memory Grant Feedback is enabled on databases by default.

The Memory Grant Feedback feature records how much memory actually spilled to disk and adds that memory, plus a possible buffer, to the next executions of the query. In reverse, if a query is granted more memory than it needs, Memory Grant Feedback stores in the query plan how much memory went unused and decreases the grant for the next executions, again with a possible buffer. The feature keeps adjusting these grants as the query needs.

Memory Grant Feedback is really helpful because it homes in on the amount of memory the query actually needs to run without performance degradation.

Let’s see how queries behave without Memory Grant Feedback. As my database is in SQL Server 2019 compatibility mode, let’s turn the feature off first, create a stored procedure, and execute it with a date range of about a year. Check the actual execution plan.

ALTER DATABASE SCOPED CONFIGURATION 
	SET ROW_MODE_MEMORY_GRANT_FEEDBACK = OFF;
GO
DROP PROCEDURE IF EXISTS [Sales].[customerdata_OrderDate];
GO

CREATE OR ALTER PROCEDURE [Sales].[customerdata_OrderDate]
	@StartDate DATETIME,
	@EndDate DATETIME
AS
SELECT
	oh.[CustomerID],
	oh.[OrderDate],
	oh.[ShipMethodID],
	od.[OrderQty],
	od.[ProductID]
FROM [Sales].[SalesOrderHeaderEnlarged] oh
JOIN 
[Sales].[SalesOrderDetailEnlarged] od
ON oh.SalesOrderID = od.salesOrderID
WHERE [OrderDate] >= @StartDate 
	AND [OrderDate] <= @EndDate
ORDER BY [OrderDate];
GO
DECLARE @StartDate DATETIME = '2012-01-01'
DECLARE @EndDate DATETIME = '2013-03-31'

EXEC [Sales].[customerdata_OrderDate] @StartDate, @EndDate;
GO

When you observe the actual execution plan, you see a warning that the Sort operator spilled data to tempdb, writing 2234 pages to disk.

The memory grant SQL Server allocated for this query was not sufficient, so it had to spill to disk. Because memory is granted to a query before execution, SQL Server cannot grow the grant on the fly during execution. The result is sorting data in tempdb, which is I/O intensive and makes queries run very slowly.

Now let’s turn the Memory Grant Feedback feature back ON and run the query again. Check the actual execution plan.

ALTER DATABASE SCOPED CONFIGURATION SET ROW_MODE_MEMORY_GRANT_FEEDBACK = ON;
GO

The sort did not spill to disk this time. Look at the Memory Grant Feedback info in the SELECT operator properties: IsMemoryGrantFeedbackAdjusted shows ‘NoFirstExecution’. This is because we executed the stored procedure for the first time after enabling memory grant feedback on the database, which is also why LastRequestedMemory shows 0. The requested memory shows as 117704KB.

Let’s run the same query one more time and check the IsMemoryGrantFeedbackAdjusted status. Now it shows ‘YesAdjusting’. LastRequestedMemory is 117704KB, exactly the RequestedMemory we saw for the previous execution; this time the RequestedMemory is 75016KB.

Let’s run the query once again and check the status. Now it shows ‘YesStable’. LastRequestedMemory is 75016KB, matching the previous execution’s RequestedMemory, and this time the RequestedMemory stays at the same 75016KB, since the grant is now stable.

Isn’t it wonderful? OH YES!! But we need to remember a few things here. What if there is a lot of skew in the data and the results differ drastically with each parameter value? For example, suppose your stored procedure returns 1,000 rows for one parameter value and 1 million rows for another, and you execute it frequently. Memory grant feedback tries to adjust the grant to what the query needs, but because the row counts differ so much, it cannot settle; it goes back and forth changing the memory grants and eventually turns itself off.

To demonstrate this scenario, let’s take the same query we executed before, but this time run two different executions: the first execution of the stored procedure, with a one-year date range, returns about 600,000 rows, and the second returns about 6,000 rows. Let’s run each of them 20 times, 40 executions in total.

EXEC [Sales].[customerdata_OrderDate] '2012-01-01', '2013-03-31'
go 20
EXEC [Sales].[customerdata_OrderDate] '2012-01-01', '2012-01-08'
go 20

On the 22nd execution, the Memory Grant Feedback info status shows ‘YesAdjusting’, but on the 23rd execution it shows ‘NoFeedbackDisabled’. Once it shows disabled, memory grant feedback has been turned off for this execution plan.

Points to Remember: Feedback changes only the cached plan. On the first execution, the memory grant feedback information is stored inside the cached plan, and subsequent executions of the same query benefit from the feedback adjustments. Feedback information is not captured in the Query Store. If there is memory pressure, or the cached plan is evicted from the cache for any reason, the feedback for that plan is lost. Likewise, feedback is lost after a failover.

Note: Memory Grant Feedback is not the replacement for the out of date statistics.

Summary: The Memory Grant Feedback feature is really useful in scenarios like pulling reports every couple of months with different parameters. If you have a lot of skew in your data, the results differ greatly with each parameter value, and you run the query very frequently, Memory Grant Feedback may not help, as the feature will turn itself off at some point while trying to adjust. We still need to make sure our statistics are up to date and maintained regularly.

Thank you for Reading!