SQL Server 2019 Intelligent Query Processing: Table variable deferred compilation

In our last post, we have seen one of the feature of Intelligent Query processing(Memory Grant Feedback feature). In this post, lets focus on the Table variable deferred compilation.

With SQL Server 2017 and below versions, SQL Server always estimates one row for table variable as the table variable data gets inserted during the run time, so optimizer doesn’t know how many values it can expect coming out of the table variable. Due to this bad estimation, performance of the queries is effected.

Let’s see this behavior with an example:

Place the database in 140 compatibility mode (SQL Server 2017) and run below query using a table variable enabling the actual execution plan

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
DECLARE @IDinfo TABLE
([ID] INT, 
 [Person_id] [int]
);
INSERT INTO @IDinfo
       SELECT [ID], 
            [Person_id]  
       FROM test;
SELECT top 100 *
FROM @IDinfo F1
     JOIN test F2 ON F1.[ID] = F2.[ID];

Check the execution plan and see the estimated number of rows shows as 1 but actual number of rows were 100.

In versions later than SQL Server 2012 SP2, we have a trace flag 2453 which we can turn on so the table variable gets recompiled when the number of rows changes. Let’s run the same query by turning on the trace flag. This can be done at the query level or you can turn on the trace flag at the server level by using the command DBCC TRACEON(2453,-1).


DBCC TRACEON(2453);
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
DECLARE @IDinfo TABLE
([ID] INT, 
 [Person_id] [int]
);
INSERT INTO @IDinfo
       SELECT [ID], 
            [Person_id]  
       FROM test;
SELECT top 100 *
FROM @IDinfo F1
     JOIN test F2 ON F1.[ID] = F2.[ID];

Check the execution plan and see the estimated number of rows shows as 100 and actual number of rows were 100 which is 100% accurate estimate.

We can also recompile the query by using the OPTION(RECOMPILE) in the statement which will recompile the statement at every execution.

SQL Server 2019 Feature: Table variable deferred compilation

As we have seen in the previous examples, SQL Server estimates one row for table variables SQL Server 2017 and earlier versions of SQL Server. In SQL Server 2019, a new feature called Table variable deferred compilation compilation of the table variable statement is deferred until the first execution. This will help the optimizer to estimate the accurate number of rows which will help improve the query performance. This feature is available in SQL Server 2019 databases with compatibility level 150.

Change the database compatibility mode to 150 and run the same query one more time and check the execution plan

SQL Server will be able to estimate the number of rows perfectly.

Summary: With this new feature Table variable deferred compilation in SQL Server 2019, we don’t have a need to use trace flag 2453 or use OPTION(RECOMPILE) to specifically recompile the statements using the table variable. If you have queries with table variables that regularly gets populated with relatively same number of rows, this feature will get the better estimates with out we changing our code. However, if the number of rows that populate the table variable changes greatly and if the queries doesn’t cause the high compilation times, then considering this new feature along with OPTION(RECOMPILE) is helpful.

Thanks for reading!

SQL Server 2019 Intelligent Query Processing: Memory Grant Feedback

In our previous posts of Memory Grant Internals, we have learned all about the memory grant internals. We have learned from where do they come from and how SQL Server will manage the memory for the memory grants.

In this series, we will learn all about SQL Server 2019 Intelligent Query processing.

Intelligent query processing is a set of features which helps improve the performance of the queries with very less effort in implementation. Below flow chart shows all the features that were released in SQL Server 2017 and SQL Server 2019 related to Intelligent query processing. In this series of posts, lets learn about each Intelligent query processing feature released in SQL Server 2019.

Source: Microsoft

In this part, let us focus on the Memory Grant Feedback feature released in SQL Server 2017 for Batch mode execution and in SQL Server 2019 for Row mode execution.

Memory Grants are used by the SQL Server for the Hashes and Sort operations. SQL Server optimizer uses the statistics information and allocate the memory needed by the query before the query executes. When the query is executed, SQL Server uses the allocated memory to process the query for the hashes and sorts. If this memory grant is not enough to process the query, data will use tempdb spilling to disk. When too much memory is allocated based up on the estimates, we can effect the concurrency as all other queries requires memory grants to process the queries as well. Bad estimates can effect the memory grants allocated to the queries. Too much or too little memory grants is bad.

Now, how do we solve this issue?

By making sure we maintain the statistics up to date. Sometimes, we may see the queries spilling to disk due to low memory grant allocation than needed, in this case its really not possible to stop queries spilling to disk if we are in SQL Server 2016 or below versions. If you are on SQL Server 2019, Memory Grant Feedback is enabled on databases by default.

Memory Grant Feedback feature will collect the information of the memory that has actually spilled to disk and add that memory to the next executions of the query plus a possible buffer. In reverse if more memory grants are allocated to the queries than needed, Memory Grant Feedback feature will save the information in the query plan of how much memory has not being used and decrease that memory to the next executions plus a possible buffer. This feature adjust these grants as needed by the query.

Memory Grant Feedback is really helpful as this feature is especially to estimate the necessary amount of memory grants needed for the query to run without any performance degradation.

Let’s see how the queries work without Memory Grant feedback. As my database is in SQL Server 2019 compatibility mode, lets turn off the feature first, create a stored procedure and execute the stored procedure with a date range values for a year. Check for the actual execution plan.

ALTER DATABASE SCOPED CONFIGURATION 
	SET ROW_MODE_MEMORY_GRANT_FEEDBACK = OFF;
GO
DROP PROCEDURE IF EXISTS [Sales].[customerdata_OrderDate];
GO

CREATE OR ALTER PROCEDURE [Sales].[customerdata_OrderDate]
	@StartDate DATETIME,
	@EndDate DATETIME
AS
SELECT
	oh.[CustomerID],
	oh.[OrderDate],
	oh.[ShipMethodID],
	od.[OrderQty],
	od.[ProductID]
FROM [Sales].[SalesOrderHeaderEnlarged] oh
JOIN 
[Sales].[SalesOrderDetailEnlarged] od
ON oh.SalesOrderID = od.salesOrderID
WHERE [OrderDate] >= @StartDate 
	AND [OrderDate] <= @EndDate
ORDER BY [OrderDate];
GO
DECLARE @StartDate DATETIME = '2012-01-01'
DECLARE @EndDate DATETIME = '2013-03-31'

EXEC [Sales].[customerdata_OrderDate] @StartDate, @EndDate;
GO

When you observe the actual execution plan, you see the warning that sort operator used tempdb to spill the data and wrote 2234 pages to disk.

SQL Server allocated memory grant for this query is not sufficient so it had to spill to disk. As the memory is granted to any query before execution, SQL Server cannot grant the memory on the fly during the query execution. This results in sorting data in tempdb which is I/O intensive. This causes queries to run very slow impacting query performance.

Now let’s enable the Memory grant feature to ON and run the query again. Check actual execution plan.

ALTER DATABASE SCOPED CONFIGURATION SET ROW_MODE_MEMORY_GRANT_FEEDBACK = ON;
GO

Sort did not spill to disk this time. Look at the Memory Grant Feedback info from the properties. IsMemoryGrantFeedbackAdjusted shows as ‘NoFirstexecustion’. This is because we executed the stored procedure for the first time after we enabled the memory grant feedback feature on the database and that is the reason we also see LastrequestedMemory as 0. The requested memory shows as 117704KB.

Let’s run the same query one more time and see the IsMemoryGrantFeedbackAdjusted status. Now the status shows as ‘YesAdjusting’. The last requested memory is 117704KB which is exact memory we have seen for our last execution RequestedMemory. This time the RequestedMemory is 75016KB.

Let’s run the query once again and see the status. Now the status shows as ‘YesStable’. The last requested memory is 75016KB which is exact memory we have seen for our last execution RequestedMemory. This time the RequestedMemory is the same 75016KB as now the status shows as stable.

Isn’t it wonderful? OH YES!! but we need to remember few things here. What if we have lot of skew in our data and if the results drastically differ with each variable? Like for example, if your stored procedure gives 1000 rows with a variable and 1 million rows for different variable. If you have to execute the store procedure frequently, then the memory grant feedback tries to adjust itself to allocate the memory required by the query but as these results differ in many rows, memory grant feedback cannot adjust itself, it goes back and forth changing the memory grants and eventually turns itself OFF.

To describe this scenario, lets take an example of the same query we executed before but this time we run two different executions. First execution of the stored procedure with one year date range results shows as 600,000 rows and second query results in 6,000 rows. Let us the run these queries 20 times. Total 40 executions.

EXEC [Sales].[customerdata_OrderDate] '2012-01-01', '2013-03-31'
go 20
EXEC [Sales].[customerdata_OrderDate] '2012-01-01', '2012-01-08'
go 20

On the 22nd execution, I see the Memory Grant feedback info status shows as ‘YesAdjusting’ but when you check the 23rd execution, status shows as ‘NoFeedbackDisabled’. Once it shows as disabled, that means the memory grant feedback is turned off for this execution plan.

Points to Remember: Feedback will be change only the Cached plan. For the first execution, the memory grant feedback information will be stored inside the cached plan. In the next executions of the same query, the benefits of the feedback adjustments will be used. Feedback information is not captured in the query store. If there is memory pressure or if in any case the cached plan is evicted out of the cache, feedback will be lost for that evicted plan. If there is a failover, in that case feedback will be lost.

Note: Memory Grant Feedback is not the replacement for the out of date statistics.

Summary: Memory Grant Feedback feature is really useful in scenarios like pulling reports every couple of months with different variables. If you have lot of skew in your data, results differ a lot with each variable and you need to run the query very frequently, in that case Memory Grant feedback may not be helpful as the feature will turn itself OFF at certain point of time while trying to adjust itself. We need to make sure our statistics are up to date and we maintain them regularly.

Thank you for Reading!

T-SQL Tuesday #135: Tools of the Trade

With the Invitation of Mikey Bronowski for this month T-SQL Tuesday Tools of the Trade, I would like share some useful tools that make my job easier:

  1. OBS Studio: This is a free and open source software for video recording and live streaming. I mostly prerecord my sessions using OBS. I personally love this tool as we have pretty much good content on YouTube that teach us how to use this tool.
  2. SentryOne Plan Explorer: Plan explorer is an amazing tool to analyze your execution plan and tune your queries very quickly. Its completely free.
  3. SQL Search: It is a free search for databases. If you need to find any specific object/column or any string on any database on the server, this tool is very useful. I use SQL search regularly.
  4. SQL Compare: SQL compare tool is very useful to compare the database schema and data between environments. I use this tool regularly.
  5. SQLQueryStress: It is a lightweight performance testing tool, designed to load test individual queries. I personally use this tool for my demos and testing any queries for performance.
  6. Zoomit: ZoomIt is a free screen zoom and annotation tool for presentations. I use it regulary and I am sure most of the speakers already know about the tool.
  7. Diffchecker: A very useful too to compare code/documents.
  8. Adobe Lightroom: This is a photo editing tool which I personally liked. It is professional and used by many photographers across the world.
  9. Free Download Manager: I use this tool for my downloads. It is much faster to download files. I use this tool especially for YouTube downloads. If you need to download the playlists at a time in a single shot, this tool is really useful. It is completely free.
  10. Nimbus Screenshot: It is a chrome extension. This tool is really useful when you wanted to take the screenshot of entire page. You can select the area you wanted to take the screenshot by scrolling the page. Very useful.

These are some of the useful tools I use regularly. I would also like to see the list of the tools others feel useful. I would love to start using those tools as well.

Memory Grant Internals Part 5

This part is the continuation of the series to the post Memory Grant Internals part 4

In this final part of the series, lets talk about how we balance these memory grants. If lot of your queries are requesting for more memory than they required, then that causes concurrency issues. Memory grants are allocated to every query even before they get executed based on the cardinality estimates and memory consuming iterators. Once the query memory is allocated to a query, that memory is only released once the query execution completes even if the query actually uses a part of the allocated query. If more memory is requested by the queries than they need, then we are wasting the memory. What happens if the queries only receive less memory grants than they need, then we there is a performance impact on the running queries.

These issues occurs due to some problems. If your estimates are inaccurate, if the plans that got generated are non-optimal in scenarios where you have skew in your data and creating a plan compiled based on your initial parameter value. For example, if you execute the stored procedure for the first time using a parameter value giving you 100 rows, a plan gets generated and with a next parameter gives you 2 million rows, we still use the same execution plan which will be non-optimal for these parameters having large number of rows. This is also known as parameter sniffing. Sometimes, we have “out of model” situation meaning anything we put into a query plan that the query optimizer isn’t handle to set up properly. We need to make sure we are out of “out of model scenarios” to get good estimates. Sometimes, rewriting of the queries will be helpful as well.

There are few ways where we can fix these problems.

  1. Updating the statistics. Even you have the default auto update statistics set to ON, maintaining the statistics regularly using the maintenance plan
  2. Recompiling the query. Recompiling will help keep away the parameter sniffing issues.
  3. Try to avoid the multi statement TDF’s. From SQL Server 2017, Adaptive query processing is introduced where the estimates for these TDF’s are lined up with the actual number of rows but that is not the case for the older versions of SQL Server.
  4. You can use the OPTION(OPTIMIZE FOR) to influence the optimizer.
  5. Check for the variable length Data types. For example, we have a column with varchar(4000) data type but all your values are just using a single character. The estimate is going to be for the average row size is going to be half. Varchar and other variable length data type columns estimate only 50% of the defined size.
  6. Use the resource Governor. If you have many high grant and low cost queries waiting in the queue, you can categorize them into a workload group by creating a pool for just for these types of queries. If you need concurrency, limiting the grant percentage and the setting the timeout options for resource pool helps.

Lets see with the examples.

  1. Low memory grant even the stats were updated. Run the below query and see the actual execution plan along with the extended event live data to see if it spills to disk.
select top(1000) * from
(
select top (600000)
* from DBO.bigTransactionHistory
)as s
order by ActualCost desc
OPTION (MAXDOP 1)
GO

The memory grant was 55MB which is very less than the maximum grant. There is a spill warning in the event information as well. Why did the spill happen when we have enough memory? Statistics were up to date but may be the estimated row size might differ with the actual and that is the reason it had to spill to disk. In this case, the optimizer was not able to get the good plan for this query.

2. Multi statement table valued function. In the second example, lets see how many rows multi statement variables get estimated. First lets see for the sql server 2012 database compatibility.

SELECT tst.CustomerID, COUNT(*) 
FROM  Sales.Customer c 
INNER JOIN dbo.TEST_MTVF('1/1/2014') tst on c.CustomerID = tst.CustomerID
INNER JOIN Production.Product prod on tst.ProductID = prod.ProductID
GROUP BY tst.CustomerID

When you observe the actual execution plan and hover the mouse over to the rows passing from table to nested loop iterator. The estimated number of rows is 1 but the actual number of rows are 37339 which is a lot. For SQL Server 2012 and below, the estimates to the MTLF is always 1 no matter how many actual number of rows were there.

Now, lets place the database in the sql server 2016 model and see the estimates. Run the same query again and check the actual execution plan to see the estimates. When you observe below, the estimates were showing 100 rows. For SQL Server database compatibility of both 130 (SQL Server 2016) and 120 (SQL Server 2014) estimates for MTVF always show 100 rows no matter how many actual rows might be.

Lets see for the latest versions 2019. Run the same query and check the estimates for MTVF. Now the estimates are exactly lined up with the actual number of rows. This is the same for SQL Server 2017 database compatibility mode (140) That’s because in 2017, Adaptive query processing is introduced which will optimize the execution plan in a better way and get the correct estimates for MTVF’s.

3. Variable length Data types. Lets run the below query which have a varchar datatype and see how the estimates work here:

select *, CONVERT(VARCHAR(8000), 'a') as new
into #B
from bigProduct
GO

select * from #B 
CROSS JOIN
( SELECT 1
UNION ALL 
SELECT 2
UNION ALL
SELECT 3
) AS n (l)
ORDER BY name, l
GO

While running the above query, if you query select * from sys.dm_exec_query_memory_grants, you will see the granted memory is 183MB but used memory is only 19MB. Rest all the memory which is granted is wasted. Based on the varchar(8000), memory grant has been allocated but when in actual only some memory is being used to process the query. Due to this, all the other queries running concurrently have to wait for this query memory to be released. Once the memory has been granted, memory doesn’t be released whether the query uses that memory or not.

4. Using TOP with (optimize for) option. In this example, lets run the below query and see the event information. we can use a variable and use the optimize for option like below. There is no sort spilling warning.

declare @C int=100000
select top(1000) * from
(select top (@C)
*
from sales.SalesOrderHeaderEnlarged
) as a
order by
TotalDue desc
option (MAXDOP 1, OPTIMIZE FOR (@C=150000))
GO

We can use Optimize for option in many ways where we can limit the grant by mentioning less number of rows in optimize for option to get lower amount of grant where as if the memory grant is needed more than optimize for bigger value than we mentioned in the variable, like the see in the above example.

5. We can also change the resource semaphore options request_max_memory_grant_percent to the needed percentage. The default is 25%. We can place the limit on how much percentage the workload group can take. In below example, We can set the limit by altering the workload group with 15% like below:

Alter Workload Group [default] with
(
request_max_memory_grant_percent=15
)
GO
Alter Resource Governor RECONFIGURE
GO

Summary:

In this final part of the series, we have learned how to balance these memory grants with examples. This is our final part for the Memory Grant Internals Series. Too much or Too little memory grants is bad. Balancing the memory is important. During these parts of the series, we started the learning path with memory consuming iterators, how these iterators work and share the memory between them, what is a resource governor, parts of the resource governor, how the spills to disk happens with these iterators, how to identify these spills and finally how to avoid them.

Hope you liked the series and learned from it. If you have any ideas about the topics we can cover in coming blog posts, please do share with me. I would love to learn and share the knowledge while I learn along with you.

Thank you so much for all your support!

Thank you for reading!

Memory Grant Internals Part 4

This part is the continuation of the series to the post Memory Grant Internals part 3

In our previous part of the series, we have learned about what are these query memory/work space memory and how it works, what happens when the memory allocated to the queries is not sufficient and how the queries have to wait before they get the memory grants. In this part4 of the series, lets learn about how queries will spill to disk when the grants allocated are not enough and how to make sure we address and solve these issues.

If your tables are huge and you are sorting them in your queries, then it will take enormous amount of memory to do the sorting task and it is very hard to allocate memory for such sorts unless you have a huge amount of memory. In such cases, data have to spill to disk. Spilling to disk meaning using the “SQL server scratch pad” which is tempdb to do these sorts and as these spills directly to tempdb on disk, our queries become very slow to run.

As we learned in the first part of the series, memory is needed by these memory consuming iterators: Hash match, Sorts and Exchange iterators. There are two phases for each of these iterators. A build phase and a probe phase. Lets go and learn how these three iterators work in detail:

Hash Match: At the build phase, it will take the outer input and build the hash table. It builds a set of hash buckets. Now what are these hash buckets? Each hash bucket is like a memory location which is a pointer to the linked list of rows. Any rows that will hash to the bucket go into the linked list and be placed there. So, the input side is scanned, buckets are created, rows are hashed and then these rows are placed into the linked list in the respective buckets using a Hash function and during this phase, we have a hash table created. Next, the match operation will start where the inner set will be scanned and the rows on inner set are hashed and go to the respective buckets as well. Then they iterate over the linked list to find the match rows.

How do the hash match, spills to disk?

During the build time, the memory is allocated based on the cardinality estimates and the estimates are based on the input size and the probe. The buckets are created in the memory and we place the rows in those respective buckets. If the grant is exceeded then some of the buckets will be send to the disk. As some of the buckets are already in memory and some in disk, the initial probe of the data using the inner set. The probe rows will be scanned if the row hash to the in memory bucket the match is done for those rows. If the rows match to the on disk bucket, the probe row will be return to the disk along with the outer side bucket. So, because of this we have more disk writes to tempdb at the probe time. This is a build side spill.

There is also probe side spills that can occur at the second phase of the hash match iterator. The buckets that has spilled to disk from the first phase should be now moved to the memory back again and then rebuild the hash table, then the spilled inner probe rows will be read back to the memory to match. In this situation the memory is exceeded again so the buckets have to be rewritten to disk. This is called recursion. So this will be the cycle that the plan will not complete making the plan to bailout. Bailout meaning the plan will use the unoptimized nested loop which cannot complete as it uses inner table multiple times. This plan will use little memory but it can run forever.

Sort: Sort will begin sorting in the memory space that was allocated using the quicksort algorithm (The memory requirement is of at least 200% of the input size). If the memory grant is exceeded even a little bit then the entire intermediate set will spill directly to disk and the sort will continue on the disk completely using the merge sort algorithm. . For the Hash match, if the memory grant is exceeded then few of the buckets will go and spill to tempdb.

Exchange: Exchange iterators will deal with the parallel set of rows from multiple threads. There can be three types of exchanges. Distribute(one stream comes in and multiple streams comes out), repartition(depending on the DOP, n streams can come in and n streams can come out and distributed through a different algorithm) and Gather(DOP n streams comes in and 1 stream comes out). For the distribute and the gather, we have DOP set of buffers but for repartition, there is DOP on each side (DOPx2 sets of buffers). There are two sides for the exchange iterator. Producer and consumer side. These sides of the iterator will communicate with each other using buffers/packets of rows. Each side has DOPx2 per stream sets of buffers.

Gather and repartition iterators can come as merging or order preserving meaning if all the streams that are coming in are ordered then the iterators can output the data in the same order. If there is a data skew and if the huge amount of data is flowing in and flowing in very fast then the merging iterators can not keep these as the same order. In such cases, all the data will be spilled to disk and resorted. This situation is called as intra query parallel deadlock which will extremely slow down the queries.

How to check for these spills?

sys.dm_os_waiting_tasks where all the spills will be waiting on the wait IO_completion which means the slow disks. We can also use the SQL Trace to monitor these spills by capturing the Exchange events Hash warning, sort warning and exchange spill. We can also use the sys.dm_db_[task|session]_space_usage which will have the columns for user object allocations like temp tables and table variables and have internal object allocations column which gives information about anything that spills out to tempdb from the workspace memory.

Lets see how these spills can be monitored with the examples (Credits: Adam Machanic)

Create an Extended event filtering the database and choosing the events for Hash and Sort warnings by running below query:

-- Create Event Session SortandHashWarning
CREATE EVENT SESSION SortandHashWarning ON SERVER
ADD EVENT sqlserver.sort_warning(
ACTION(sqlserver.sql_text,sqlserver.tsql_frame)
WHERE (sqlserver.database_id=(8))), -- replace database_id
ADD EVENT sqlserver.hash_spill_details(
ACTION(sqlserver.sql_text)),
ADD EVENT sqlserver.hash_warning(
ACTION(sqlserver.sql_text))
ADD TARGET package0.event_file(SET filename=N'C:\Spills.xel',max_file_size=(50),max_rollover_files=(2))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO

Enable the event and watch live data by right clicking on the extended event we created and choose watch live data.

Sort warnings:

Run the below query to see if the data spills to disk:

select top(1000) * from 
(
select top (668935)
*
from bigTransactionHistory
) as x
order by 
x.ActualCost desc
go

When you observe, no data has actually spilled to disk. What if we go ahead and change the rows and see if data spills to disk:

select top(1000) * from 
(
select top (668936)
*
from bigTransactionHistory
) as x
order by 
x.ActualCost desc
go

When you observe, we have 8 sort warnings as this query went parallel (DOP 8 plan) so sort warning was fired on each thread. If you check the duration, it is 0.3sec but previous query is 0.2seconds. There is only some difference in the duration because we only bought one row extra when compared with previous query. By the time the sort spilled to disk, most of the data is already sorted in memory, but if the getting lot and lot of data need to be spilled then the duration will be more as more data needs to be sorted in disk.

Hash spill warnings:

Lets look at the Hash spill with an example. Run the below query and watch live data in extended events.

select count(*) from 
(
select top (3800000) *
from bigTransactionHistory
) as bth
INNER HASH JOIN
(
Select top(10000)
*
from bigTransactionHistory
) as bth1 on
bth.productid=bth1.ProductID
go

This query did not spill to disk. Now lets increase the number of row we get and see the event information.

Now, you can see the hash warnings and the duration is 0.84sec. With out the spill, the duration was 0.84sec. There is no difference in the duration without the spill and with the spill. Due to the Grace hash algorithm used by the hash match iterator, we only spills some of the buckets to the disk. Hash match spills much less unless it hits the bailout stage. Did you observe, we have only 3 threads spilling to disk for the hash warning but not all 8 threads. That is because if one thread have more data then the other thread, that thread might not spill but the threads having more data can spill. In this scenario, it spilled three threads. This happens when you have skew in your data.

Lets go ahead and see that information in the execution plan for the same query by running it again.

When you observe the input number of rows passing through the Hash match iterator, by looking at the properties (F4), the number of rows for each thread is not even, the last three threads are having more rows compared to other threads. These three threads have spilled to disk and that is the reason we have seen only three hash match warnings in the event information.

Exchange spill:

Lets go ahead and add two more events to capture the exchange spill: Exchange_spill event and Lock_deadlock_chain event.

Now lets run a query that will have an exchange spill (Credits: Erik Darling)

Creating the table first

TRUNCATE TABLE DEADLOCK;
INSERT INTO DEADLOCK WITH (TABLOCK)
SELECT
  (RN - 1) / 12500
, REPLICATE('Z', 100)
FROM (
       SELECT TOP (50000) ROW_NUMBER()
        OVER (ORDER BY (SELECT NULL)) RN
       FROM master..spt_values t1
       CROSS JOIN master..spt_values t2
       CROSS JOIN master..spt_values t3
) t
OPTION (MAXDOP 1);
UPDATE STATISTICS DEADLOCK CI__DEADLOCK WITH FULLSCAN;

Now, run the below query which can give the expange spills:

SELECT t1.ID
FROM DEADLOCK t1
WHERE EXISTS (
       SELECT 1
       FROM DEADLOCK t2
       WHERE t1.ID = t2.ID
)
ORDER BY t1.ID
OPTION (QUERYTRACEON 8649, MERGE JOIN, MAXDOP 2);

Due to the data coming to the final parallelism iterator and the merging iterator as you can see the order by at the last parallelism iterator. If we have these inter query parallel deadlocks, we can go ahead and see if we have any of these merging iterators in our execution plan.

Summary:

In this part 4 of the series, we have learned about types of spills that can happen from these memory consuming iterators when the grants allocated are not enough and how to make sure we address these issues. In the next coming part, we will learn about how to solve these issues.

Thanks for reading!

Memory Grant Internals Part 3

This part is the continuation of the series to the post Memory Grant Internals part 2

We have 1gb of memory. 1GB is 128000 pages. Each page is 8kb. Multiplying 8×128000 gives 1GB. There are two default resource semaphore pools. Query memory objects (default) available and small query memory objects (default) available from dbcc memorystatus. when we add these both values 87823+4619=92442 which divided by 128000 gives 0.72 which is the percentage of the total available buffer pool.

Lets go ahead and run a query and check the execution plan for the memory grant

select top (1000) * from
(
select top (10000000)
* from 
[Sales].[SalesOrderHeaderEnlarged]) as a
order by
TotalDue desc
OPTION (MAXDOP 1)
GO

When you observe the memory grant, its 183MB. What if we go and get more rows this time

select top (1000) * from
(
select top (15000000)
* from 
[Sales].[SalesOrderHeaderEnlarged]) as a
order by
TotalDue desc
OPTION (MAXDOP 1)
GO

If you run the query select * from sys.dm_exec_query_memory_grants and see the ideal memory which sql server wanted to allocate for this query, we see 1022120kb (1GB). Depending on the 1GB of memory set as Max server memory, the requested_memory_kb is 183496kb (183MB) and used_memory_kb is 183496kb (183MB)

The memory grant for the above query is 183496/8=22937 pages of memory grant. The total available grant for the queries is 92442. We can run the above query 4 times to hit the total memory grant (22937*4=91748). We can use the SQLQueryStress tool to run the query and see the query memory grants. Let us run the query with 100 number of iterations and 10 number of threads.

While the query runs, check the memory grants by running select * from sys.dm_exec_query_memory_grants and see the queries running.

Three queries are running and others are waiting for the memory grants which is represented as the grant_time as NULL as they are waiting for their grants to be released so they can run. Only three queries are running as each query requires 150% of the requested memory grant. Once these queries completes, the other queries gets the memory grants.

As the workload groups has three importance queries. Low queries (0-4), medium queries (5-9) and high queries (10-14). By default, the default workload group has importance of medium queries.

Adam Machanic has explained with some awesome query examples of how the high cost query with low memory grants and high cost query that requires high memory grant works. Lets go ahead and look at these examples. (Credits: Adam Machanic)

Lets go ahead and run the below query which have high query cost but low memory. Run the sqlquerystress at the same time

use Adventureworks2016
go
select top (5000)
* from
(select top(1000)
p.*, b.Actualcost
from dbo.bigProduct as p
CROSS APPLY
( Select count(*) as actualcost
from dbo.bigTransactionHistory as b
where
b.ActualCost between p.productid-1000 and p.productid+1000
)as b
)as x
ORDER BY name
OPTION (MAXDOP 1)
GO

When you check the actual execution plan, the estimated cost is very high, 11479.3 sec to run this query.

But when you check the input to the sort, it only have estimated data size as 263kb which should be very less memory.

While you run the above query, run the sqlquerystress again to see the running queries and queues

When you do the select *from sys.dm_exec_query_memory_grants and see the spid number 56 is waiting and shows the grant_time as null which means the query is waiting. This query is waiting for the 1.1MB memory though the requested_memory is very less. This is due to the high query cost. Look at the queue id is shown as 9. So, the low cost queries which have queue id of 8 should all run through before spid 56 can run. The timeout in seconds is 86400 which is long time for the timeout to occur. SQL Server prioritize low cost queries to run first but as this query is high in cost and low in memory, the query still have to wait until all the low cost queries run.

Now if we have lot and lot of queries having high query cost but only need some memory to run, how do we resolve this issue? We can solve this issue by altering the default workload group and set the request_memory_grant_timeout_sec. Lets go ahead and alter the default workload group to set the timeout to 10 sec.

Alter workload group [default] with
(
--default=0
request_memory_grant_timeout_sec=10
)
go

Alter resource governor reconfigure
go

Now run the same query again with the other queries running with the SQLQueryStress and look the results by running select *from sys.dm_exec_query_memory_grants. When you observe, the timeout_sec on all queries has changed to 10sec. Now this query with spid 56 has to only wait 10sec to get the memory granted to run the query.

After 10 seconds, if you run select *from sys.dm_exec_query_memory_grants again, now the query should be running and the grant_time is no more null.

We can also view these waits for the memory grants by viewing the sys.dm_os_waiting_tasks. You can see all the spids waiting for their resource semaphore grants and waiting to run.

select * from sys.dm_os_waiting_tasks
where wait_type='resource_semaphore'

Lets run a high cost and high memory grant query and check the execution plan for the query cost (129778 sec) and memory grant (566MB). Remember, we have the total available memory only 1gb. So this query needs the entire available memory to run.

select top(5000)
* from
(select top(1000)
p.*, b.thecount
from
(select top(1000000)
productid from dbo.bigTransactionHistory
order by actualcost) as p
CROSS APPLY
(select count(*) as thecount
from dbo.bigTransactionHistory as b
where b.ActualCost between p.productid-1000 and p.productid +1000
)as b
order by b.thecount desc
)as x
order by ProductID
OPTION (MAXDOP 1)
GO

Lets execute this query along with the other queries running through the SQLQuerystress. When you check the sys.dm_os_waiting_tasks requested memory grant, its 185MB but the query need 150% of requested memory grant to get the memory grant and to be run.

Now, when I go and run the sys.dm_os_waiting_tasks again and check the granted memory, now the query has the granted memory because we hit the timeout of 10sec and as it completed this timeout, it received the granted memory of 640KB to start running the query.

So, once the query time out occurs (here in the query timeout meaning not timeout of the query but instead it waits for the query timeout seconds which is 10sec in this case and then go ahead and grant the memory as required memory to run the query. So, once the query needs to run after the timeout period, it will have to do one of these three things. If there is enough memory to run, it will run. If there is not enough memory to run even for the minimum grant we will get an error. If we have memory for the minimum grant but not enough memory for full grant, then it will scale all the way back and just grant that minimum memory to run. So this query waits until the 10sec timeout and then run with the required memory grant which is the minimum required grant to start the query to run (640kb).

Summary:

In this part 3 of the memory grant internal series, we have seen about how the query memory/work space memory works. We have also learned what will happen if we do not have enough memory. That’s when things can get complicated and now the sorts and hashes have to spill to disk in tempdb.

In the next part of the series, we will learn about how queries spill to disk when the query memory is not granted as required for the query to sort in memory. When the query spill to disk, queries runs very slower. Lets learn about these in the next part of the series.

Thank you for reading!

Memory Grant Internals Part 2

This post is the continuation of the series to the post Memory Grant Internals part 1

Once the query optimizer finish its job, the query processor will start its job. Query processor will first looks for the importance of the query based on the work load group. So, what is this workload group? To know what it is, we need to Dive deep into Resource Governor. As we are dedicating this series to dive deep into the memory Grant internals, lets focus this post on Resource governor which handles the memory grants for our queries.

Resource Governor

Resource Governor is introduced in SQL Server 2008. Resource governor job is to control and manage the consumption of the CPU, Memory and IOPS resources used by the incoming requests. Resource governor will ensure that one run away query is not consuming the entire memory.

Resource Governor

There are three different components in the resource governor.

  1. Resource Pool: Small percentage of CPU, Memory and IOPS resources are collectively called as the resource pool.
  2. Workload Group: You can define the connections based on the priority is known as Workload group.
  3. Classifier: Classifier is a User defined function created in the master database in which we mention the predefined criteria to classify the incoming requests to respective workload groups is known as classifier. Classifier function can classify the incoming connections by using the application name, database name, login name etc.;

If your SQL server handles many database applications connecting to single database server instance, requests made from all the applications will compete to get enough resources to process the requests which will cause some serious performance issues. With the help of resource governor, we can distribute the assigned resources to specific workloads based up on the low, medium and high priorities. SQL Server can create up to 64 user defined resource pools.

When we install SQL Server, two pools are automatically generated. The Internal pool and Default pool. Internal pool is used for the internal background processes like DAC connections. Internal pool cannot be altered. Internal pool will be the priority when compared with the user defined resource pool. Default pool will handle all user connections that are not sent to the respective user-defined resource pool.

So, when a request is sent to the SQL Server, that session will be diverted to the associated workload group. This workload group will use the resources provided in the respective resource pool.

Now, lets get to the query processor. First the query processor will categorize the query we are sending to the SQL Server based on the workload group, then next it looks at the estimated cost and the size of the memory grant and categorize the query will be placed in the set of queues and these queues are known as Resource semaphores. These queues force queries to wait to get into the semaphore so they can get some memory to run. Resource semaphore is a gateway where SQL queries have to wait until they get the memory to even start processing the query. If SQL Server ran out of memory and queries waiting for the memory to be released to be processed, during this time we see resource semaphore wait types.

Resource pool have two primary resource semaphores. A small and large semaphores. Both have individual memory grants. Small semaphores are for queries with cost less than 3 and grant size less than 5MB. These queries use smaller pool of memory. The large semaphores are for larger queries and use different set of queues for large memory grant.

Each of these semaphores three sets of queues. They have low, medium and high priority sets. The high priority set queries will get importance and first get the memory grants and low priority queries have to wait until they get the memory. The large semaphore have five queues per set. Here the priority is different. Low cost queries gets the higher priority. Low cost queries will run faster than the high cost queries and that is the reason SQL server will prioritize the low cost queries first because low cost queries run pretty quick when compared with high cost queries which need more memory grants.

Queries doesn’t start unless available memory is 150% of requested query memory because sql server will always allocate a little bit of additional buffer memory. Lets take the 100MB query which will get 150MB with additional buffer. What if this query only get 100MB and doesn’t get that additional buffer in the beginning to run the query? In that case, the query will wait until the 25x the query cost which is in seconds. When the timeout occurs and if the exact grant (100MB) is available without the buffer amount then the query starts to run this time. In other scenario, even after waiting for so long and after the timeout, the exact memory (100MB) without the memory grant is not available and less than 100MB is available, in that case the grant can be decreased to the minimum required grant. What if even the minimum required grant is not available, in that case we will receive an error 8645. We can configure the timeout by using the resource governor.

Configuration options for Resource Governor

sp_configure: You can set the options at the server level. You have the option Max server memory (MB). The query memory will be taken out of the buffer pool which is within the max server memory. We also have the Server level default query wait time which is in seconds. Zero is the default value (25x cost of the cost of the query). We also have server level option min_memory_per_query (KB) which is 1MB by default.

Resource pool: There are two main settings. Min_memory_percent and Max_memory_percent. These are the settings to determine how much buffer pool grant can be used by the resource pool.

Workload Group: Within each workload group, we can have the importance level. We have the option request_memory_grant_timeout_sec where we can set the timeout specifically for the workload group. We have the option request_max_memory_grant_percent where we can set the memory grant percent of the resource pool grant that this workload group can take. When you have several different types of workloads on the servers, we can classify and give the workloads a very specific memory utilization.

We can check the memory status by using these commands:

  1. DBCC MEMORYSTATUS (looking at the current and max values of Small Query memory objects, Query memory objects) with give you the information about the maximum workspace/query memory. These two are the default resource semaphore pools.
  2. DMV’s like sys.dm_exec_query_resource_semaphores (information about all the resource semaphores and queues), sys.dm_exec_query_memory_grants provides lot more information to check when you have memory issues (provides one row each for query that has the memory consuming iterators. Gives the information on the whether the query is waiting for memory or granted the memory, how much memory, required memory and ideal memory information). Other DMV we can use is sys.dm_os_waiting_tasks gives the rows for the requests that are waiting for memory.

Lets start doing a hands on lab:

Let me set my max server memory to 1GB so we it will be even for us to play with

sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'max server memory', 1000;
GO
RECONFIGURE;
GO

Run a simple query by enabling the actual execution plan to see how the memory grants are allocated for a query:

USE [Adventureworks2016]
GO
Select top (1000)
*
from
[Production].[TransactionHistoryArchive]
order by ActualCost
OPTION (MAXDOP 1)

We are choosing the option MAXDOP 1 because we want to run this in a single thread and if we use parallelism, then the memory grants are allocated for each thread. To be simple, we used the maxdop value as 1. Now when you observe the execution plan and see the memory grant on the select, you see the value 10776kb

Now see the estimated data size by choosing the arrow pointing to the sort operator. The estimated data size is 4707kb. The input size for the sort is 200% which will be double of the estimated data size. multiply, 4707kbx2 =9414kb plus additional buffer to make the memory grant 10776kb. That is how the memory grants are calculated based of the estimated data size.

Now, lets go ahead and see how memory is shared between the iterators. Lets run this below query and see the execution plan.

use adventureworks2016
go
select top(20)
* from
(
select top (50)
* from
(
select top (100)
* from
[Production].[TransactionHistoryArchive]
order by ActualCost desc
)a
order by ActualCost
)b
order by ActualCost desc
OPTION (MAXDOP 1)
go

Press F4 to see the properties of first sort and look into the memory fractions for this sort. The memory fractions input shows as 1 which means the input phase took all the granted memory available and once the build phase is completed, half the memory is available which is represented as 0.5 in the memory fraction output. This memory available will be send to the next iterator.

Now, press F4 on the next sort iterator and check the memory fraction values. The total available memory now is 0.5 as it passed from the previous sort. During the build phase, it used 0.5 and after the build it has same 0.5 available. Now this memory will be passed to the next sort iterator.

Lets go and see for the third sort iterator, press F4 to see the values. It will use the same memory from the previous iterator to build phase 0.5 and after the build phase it released the total memory granted to this query.

This is how we will analyze the memory utilization of a given plan is to look for the memory grant for the final iterator and look for the memory fractions down the stream to know how much each iterator is trying to use.

Summary:

In this series of memory grant Internals part 2, we have learned about resource governor, components we have in resource governor, configuration options for the resource governor, how to monitor the memory status and how we can see the memory being used by each iterator and share the memory between them before releasing the memory.

In the Part 3 of the series, we are going to learn how to monitor the memory grants by using the DMV’s with examples and much more…

Thank you for reading!

Memory Grant Internals Part 1

Memory!! Our servers need memory to store our data, execution plans, storing temporary data and to do some system operations as well. Memory is needed for Query compilation and to stored the plan in the cache once compiled and for processing our queries for sort and hash operations.

Now, you may have a question. Why should I care about knowing the Internals of memory Grants?

Memory is like a set of toys we give to few kids in the party. Everything is well and fine as they generously share the toys with each other. Meanwhile, many kids joined the party and now everyone wants to play with the same set of toys. You know what will happen, screaming and fighting everywhere. No kid is happy until they get a chance to play with set of toys. What do we say to kids fighting for the toys?? “We need to share…”

Same thing happens with memory. Every query needs a memory grant (Query memory Grant) that needs to be allocated before going to the execution phase. Memory grants allocated are based on the hash, sorts operations and based on the cardinality estimates.

Too much memory grant allocated to a query than it requires causes concurrency issues, too little memory on the other hand causes pages spilling to disk (tempdb). Too much is bad and too little is bad too. We need to make sure we are sharing the memory the right way!

By the end of this series Memory grant internals, you will be able to understand how you can effectively manage the memory grants by using the Resource governor. We will also learn about the memory grant feedback feature(SQL Server 2019) and how it is useful in allocating memory grants preventing hash and sort operations spilling to disk.

Introduction to Memory Grants

SQL Server have different types of components (Plan Cache, Buffer pool and other memory components) in the memory area and requires memory for many things. We also require memory to process our queries. Each operator in the execution plan requires memory to process the rows from one operator to the other operator. Some of the iterators need to be buffered as the rows come into them before sending them to the next iterator in the execution plan. These iterators needs more memory. These are called memory consuming iterators. The grants that we given to these memory consuming iterators are called Query memory grants.

There are three main memory consuming iterators:

  1. Hash Match Iterator: Takes the input rows and build the hash table before the hash table can be probed.
  2. Sort Iterator: Takes the rows and sort in a specific order required.
  3. Exchange Iterator: Takes the sets of rows coming in and has to exchange in the buffers to send to the next iterator.

Memory is not free! So, from where do these memory grants come from?

There is a component known as Fixed memory in the memory architecture of SQL server. This memory is “fixed” because it cannot be expanded when it is in use. This memory holds the memory grant connections we use for hashing and sorting for our queries. Like the other caches like plan cache and the buffer pool which can easily expand and shrink when there is a memory pressure, fixed memory cannot expand and shrink when in use. Once a memory is granted to a query for its execution, memory grant cannot be released until the query completes whether the granted memory is used by the query or not.

Fixed memory gets the memory grants from the buffer pool and the plan cache. When more and more queries are requesting for memory grants, sql server will go ahead and use the buffer pool memory. Maximum utilization is approximately 75% of total buffer pool memory.

Every query that sends to the SQL Server will be first go to the pre-optimization stage which is parsing and binding. Then the optimizer comes up with an ideal grant. Ideal grant is the total amount of memory the query needs to process all the operations (memory consuming iterators) to do in memory without having to spill to disk. If the ideal grant is very high then the memory is leveled down based up on the resource pool settings.

We are going to learn about the resource governor and resource pool in coming parts of the series.

How do compiler gets the Ideal grant?

  1. Statistics: Optimizer will trust the statistics information to estimate the number of rows coming out of the iterators inside the execution plan. Optimizer looks for the row count and row size estimates.
  2. Hash and Sort memory consuming Iterators: Based up on the hashes and sorts in the execution plan and depending up on the number of row count and row size of input sets. These iterations can actually share the memory grants between them which is known as memory fractions.

Sharing the memory:

Memory consuming iterators (hash and sort) have two phases. Build phase and Utilization phase. For the sort operator, build is to taking the rows and sorting them. Once they are sorted they are scanned. For the sort build phase, the memory is about 200% of the input data size, which is huge. After the build phase, the memory drops to 100%. For the hash iterator, build is to take the rows and build a hash table. Then the hash table will be probed. After the build phases, once the memory is released at the operator level, the next operator downstream can use the same memory.

Summary:

In this series, we have learned

  1. What are memory grants?
  2. What are the main memory grant iterators?
  3. From where do these memory grants come from?
  4. How the compiler gets the ideal grant for any query?
  5. How hash and sort operators share the memory?

In the part 2 of the series, we are going to learn about how resource governor take a part in managing these memory grants.

Thank you for reading!

Stay tuned for the second part of the series….

T-SQL Tuesday #134

T-SQL Tuesday #134

With the Invitation of James McGillivray for this month T-SQL Tuesday about taking a break when you’re stuck at home, I would like share some tips of how I manage stress by taking breaks.

It is a challenging time for each one of us during this pandemic. It is very important to be mentally strong during these hard times. Below are some of the tips I followed to take a break when we are stuck in home

  1. Reading moral stories to kids. Kids really enjoy me reading for them. As I explain the moral values to them, I learn along with them. This will instantly calm me down and make me think the wider perspective of life.
  2. I love arts and crafts. I love shopping for the material to make any new crafts from the ideas I have since long time. Putting something in action and seeing the craft coming out the way you saw in your mind gives so much satisfaction. Below picture is one of the crafts I made “3D wall Garden”.

3. I listen to David Goggins (Former navy seal) podcasts. He explains how a person can be mentally strong and how much a human being can accomplish just by changing the way we look at life. I started reading his book “Can’t Hurt Me”. This book has already changed me. I learned how to be mentally strong during the hardest times of life.

4. Watching thriller movies. I love watching thriller movies as they take me to a complete different world for few hours. Its a true refreshment.

5. Spending some time myself with a cup of coffee. No distractions with any social media, mobile phones or TV. Solely, its a self time. This is the best time to analyze how our mind wanders around. Controlling mind chattering is very important for mental health. Watching the thoughts pass through the mind freely and with mindfulness is like a meditation.

These are some of the tips I follow to take a break, I would love to see the tips from other participants for this month T-SQL Tuesday!

T-SQL Tuesday #133: What (Else) Have I Learned from Presenting?

Thank you Lisa Griffin for the invitation to December month of T-SQL Tuesday.

After I started presenting, I learned many technical concepts but here I would like to quickly mention couple of them:

  1. Memory grant connections are stored in a memory component in the SQL Server which is called as Fixed memory. As the fixed memory cannot be expanded or shrinked when there is a memory pressure, SQL server requests the other components of memory like for example, evict the plans out of the plan cache and dirty pages out of the buffer pool to release the memory as memory grants for executing the queries. These memory grants are allocated before the query goes to the execution phase. Memory grants for any query are allocated based on the hashes, sorts in the execution plan and Cardinality estimates. When there are many queries requesting more memory and when the Fixed memory is already full, these other memory components releases the memory as there is an internal pressure with in the SQL Server. Once the memory grant is allocated to the query, the memory will only be released once the query execution completes. Resource Semaphore is responsible for satisfying memory grant requests while keeping overall memory grant usages within the server limit.
  2. Redundant joins between the tables will be only removed by the SQL Server automatically on the background only if the referential integrity is maintained by the foreign key constraint defined between both the tables. If there is no Constraint defined, then as SQL Server doesn’t know about the relationship between two tables, SQL Server will perform a join scanning one table to find matching values from other table.
  3. SQL Server needs a NOEXPAND hint to create a query plan that uses an indexed view. When NOEXPAND is specified, automatic stats are created on index view same as it happens with the ordinary tables.

I would also like to hear from other community members of what else they have learned from presenting!