Decoding the Physical Join Operators: Part 1

I specifically thought of writing about this topic since long time. I always had trouble understanding clearly about Nested loop joins, Merge joins work. Once I learned about how they actually work, I planned to write a blog post about the same so you can know about how these joins work as well. By the end of this post, you will have a nice smile on your face as you feel this is so easy once we understand the concept behind these joins. This is a 2 part series.

Types of Physical Operators

Visualizing Nested Loops Joins And Understanding Their Implications

Nested Loop Joins:

For example, in the diagram you see the representation of Table 1 (outer table) and Table 2 (Inner table). For every value in the first table (outer table), SQL Server will query the second table (Inner table) looking for the matching values. We will read every value in the second table once for every value in the first table. In the below example, once the 1st value containing the orange color is compared with each value in the table 2. Once the complete table 2 values are compared, then SQL Server will loop into the next value on the table 1 to compare each value on the table 2 again and this process is looped over until each value from table 1 is compared to each and every value of table 2 to get the matched values as output of this operator. This is a lot of work. This takes so much CPU.

If the inner table is presorted by having an index, then this will save time because when we iterate values over the inner table, SQL Server can now seek to the rows that we need to compare against the outer table instead of comparing to every row in second table. This can reduce the number of comparisons that needs to be compared which can actually improve the performance.

Now, how do we know if these nested loops are appropriate in our plans or not? You can view the actual execution plan to see on how many rows does the nested loops iterator is getting out. If the number of rows that are coming out of that iterator are only few, than that is fine for the nested loop join. The other cause might be the estimates. Estimates might be off for that data and instead of using the better joins for that data, it might be using the nested loop join. We also might be missing the index which can pre sort the data that is going into the join. This may also be the reason for choosing the nested loop join.

Merge Join:

What if we sorted the data, now we have the data sorted in both the tables. Look at the diagram below for reference. As we have both the tables in the sorted order and we have to join them, SQL Server will first take the first value and compare with the first value in second table which also has the same value 1 as the first table first value 1. Then it goes to the second value in the second table which is 2 and compare with the first value 1 in first table which doesn’t match. Because the second table is sorted, we know that there is no other values that matches in the table 2 with the value 1 from the first table. It will check with the next value 2 from the table 2 which doesn’t match with the value 1 from table 1. value 2 is greater than value 1 so SQL Server now knows that there are no more matching rows in the table 2 and SQL Server will stop processing now and goes back to the next value in table 1 which is value 2 and check with the table 2 value 1. This time SQL Server will compare with the last value we had a join on which is value 1 in the table 2. Value 2 from the table 1 doesn’t match the value 1 from the table 2 so it goes to the second value 2 from the table 2 find the match and goes to the next value 3 from the table 2 and finds value 3 is greater than value 2 and stops the comparison as it knows there will be no matching values next in the table as the data is sorted in the order. It goes to the next value 3 from table 1 and compare from the last value we had join on which is value 2 from the table 2 and find the matching value 3 from the table 2, goes to the next value 4 from the table 2 and found the mismatch and so it knows there are no more matching rows and stops processing and then goes to the value 4 from the table 1 and compare with the last value we had join on and do the same processing until the comparison completes for each value in table 1. This is done as the SQL Server walks and compares both the tables bit by bit. This is the basis of a Merge Join.

In the Merge join, we sorted the table on the join columns and walked down the columns bit by bit comparing the rows. Because the rows are sorted, SQL Server can stop the processing once it founds the matching value in the second table which is greater than the value of the current row in the first table.

Merge join is the fastest physical join operator in the SQL Server. Merge join is so efficient because it can go through each input a single time and it doesn’t have to loop back and do more operations like the nested loop join. When we have many to many merge join, in that case SQL Server cannot reiterate the number of rows of inputs by doing many to many join. It instead creates a work table in tempdb and loops that over and over again to process those many to many joins. If the first table has duplicate values, in that case SQL server will create a work table and do the comparison just like the regular merge join but when it finds the duplicate values in the second table, it writes them into work table in tempdb and does the comparisons there. If it also finds the duplicate values on the first table, SQL Server will go ahead and compares the first table values to already stored values in the work table. Once the iterator moves past those duplicated values, the work table gets cleared out and the merge join continues until all the values gets compared. If you would like to know in depth about how the work tables work in detail, please refer to this wonderful article from sqlserverfast.com

Merge join is not a bad thing and it may be efficient already in your execution plan. What you have to observe when you see the merge joins and performance slow on that plan is to focus on the upstream operations that are going into the merge join. Whether the data is presorted as you already have an index or whether the data is presorted in SQL Server own way then in that case, you can simply check if you can just add that missing column in the index and place in the last key column in the index or use a different join algorithm will be better. The other scenario might be you have lots of duplicate values in your data. If that is the case SQL Server will be using the work tables to handle how the duplicate values can be joined on. So, if you see the duplicate values or using tempdb, then finding the better options will be good.

Combining the tables with the merge join is faster than the nested loop join but we need to sort the data first. You know that the sorting is slow. So, how can we do this better. That’s when the Hash join comes in.

Hash Match:

For every value in the table 1, we can apply the hash function to it and build a Hash table on it. Each and every value will be now in the hash table that has been built. Once that has been done, we can go to the second table and apply the same hash function to each value in table 2 and lookup probe for the hash table 1 we built for matching values and we have a join. We repeat the process for all the values in the second table. This is the Hash join.

There are two phases that the Hash join happens. The first phase is Build phase where SQL Server will build the in-memory hash table from one of the two inputs. Hashes are calculated based on the join keys on the input data and the whole rows are stored in the hash table under the hash buckets. Mostly we have one row per the hash bucket except when we have multiple rows with duplicate join keys or they is a hash collision (total different join keys might get the same hash which is a rare scenario). Once the hash table is build, SQL Server will start the probe phase where SQL Server will calculate the join key hash for each row in the second input and checks to see if it exists in the hash table builds in the first build phase. If it finds the match it will goes to the next to see if the join key themselves actually match because sometimes we might have the hash collision issues.

Hash joins is not the fastest type of the join because it needs to use the hash function which uses more CPU to calculate the hashes of the join keys but this join is really useful to join really large tables and it uses Tempdb to create these hash tables if memory is not sufficient to build them. If SQL Server cannot build all the hashes of the hash table in the memory because it is very large, then the hash can actually spill to disk and use the tempdb. In this scanario, SQL Server can put as many buckets as possible in the memory and the ones it cant keep in memory can be stored in tempdb. In the proble phase, SQL Server will join any rows that are in the memory that are in-mmeory hash table and any hashes that it cannot finds the match in the hash table, it writes to tempdb until all the rows from the probe input has been processed, it loads everything related to from the tempdb into the memory and continues to do the comparisons there. SQL Server can actually stage some of the data into the memory during the join phases and loading it back into the memory once some of the memory gets cleared up.

What can we do if we see the Hash match in our execution plans? We have to first see if this operator is useful in the plan because this is a blocking operator meaning until the build and the probe phases are completed all the downstream operators have to wait in the execution plan for processing. We know that Hash join operator can actually join any data sets but when it actually spills to disk, it can cause the performance issues. That’s when we need to peek in and look at the memory grants that has been allocated is not sufficient because of the incorrect estimates. The memory grant that has been actually allocated to the query might not be sufficient to build the hash table in the memory and so spilling to disk. In this scenario, you can check if the statistics are up to date and check to see if adding that one column to the index might help choose a different optimal physical operator to process the query effectively.

Summary:

In this part 1, we have learned about how the physical join operators work, when they can be efficient and when they are not. Where and what to look at when they are not efficient to make our queries faster. In the second part of the series, we are going to look how the indexes help these physical operators go faster.

2 thoughts on “Decoding the Physical Join Operators: Part 1

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s