SQL Server Index Myth 6: Indexes Always Output in the Same Order


One of the more common myths that pertain to indexes is that they guarantee the output order of results from queries. This is not correct. As previously described in this book, the purpose of indexes is to provide an efficient access path to the data. That purpose does not guarantee the order in which the data will be accessed. The trouble with this myth is that, oftentimes, SQL Server will appear to maintain order when queries are executed under certain conditions, but when those conditions change, the execution plans change, and the results are returned in the order that the data is processed versus the order that the end user might desire.


To explore this myth, you’ll first look at how conditions can change on a query that is using clustered index. In Listing 15, there is a single query, repeated twice, for the Sales.SalesOrderHeader and Sales.SalesOrderDetail tables that is performing a simple aggregation. This is something that might appear in many types of use cases for SQL Server.

Listing 15. Unordered Results with Clustered Index


USE AdventureWorks2014
GO

SELECT soh.SalesOrderID, COUNT(*) AS DetailRows
FROM Sales.SalesOrderHeader soh
 INNER JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
 GROUP BY soh.SalesOrderID;
GO

DBCC FREEPROCCACHE
DBCC SETCPUWEIGHT(1000)
GO

SELECT soh.SalesOrderID, COUNT(*) AS DetailRows
FROM Sales.SalesOrderHeader soh
 INNER JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
 GROUP BY soh.SalesOrderID;
GO

DBCC FREEPROCCACHE
DBCC SETCPUWEIGHT(1)
GO

The conditions in which the two queries execute vary a bit. The first query is running under the standard SQL Server cost model and generates an execution that performs a couple index scans and a stream aggregation to return the results, shown in Figure 9. The results from the query, provided in Figure 10, provide support that SQL Server will return data in the desired output, provided that the SaleOrderID column is the column that the users wants sorted.

execplan
Figure 9. Default aggregation execution plan.

 

execplanres
Fig 10 . results from default aggregation execution plan

But what happens if the conditions on the SQL Server change but the business rules do not? The second query executed in Listing 15 is the same query, but with a change in conditions. For this example, the DBCC command SETCPUWEIGHT is leveraged to change the cost of the execution plan. The change in cost results in a parallel execution plan being created and executed, shown in Figure 11. The effect of the new plan is a change in the results of the query, provided in Figure 12. While the results appear to still be ordered, the logic of the query hasn’t change, but the first number in both results is different. At some point in the second result set, those rows not appearing at the start of results appear. The danger in this is that the results look sorted when a validation of them proves that they are not.

parallelism.GIF
Fig 11. Aggregation execution plan with parallelism
defaggplan
Figure 12. Aggregation execution plan with parallelism

Warning  Do not use DBCC SETCPUWEIGHT in production code to control parallelism or for any other reason. This DBCC command is strictly available to control environmental variables within SQL Server to test and validate execution plans.


 The other condition to consider is when business rules change for a query. For instance, maybe a set of results wasn’t originally filtered, but after a change to the application, the query may change to using a different set of indexes. This can result in a change in the order of the results, such as when a query changes from using a clustered index to a nonclustered index.
To demonstrate this change in behavior, execute the code in Listing 16. This code runs two queries. Both of the queries return SalesOrderID, CustomerID, and Status. For the purposes of the example, the business rule dictates that the results must be sorted by SalesOrderID. In this case, the results from the first query are sorted as the business rule state, shown at the top of Figure 13. But in the second query, when the logic changes to request fewer rows by adding a filter, the results are no longer ordered, shown at the bottom of Figure 13. The cause of the change comes from a change in the indexes that SQL Server is using to execute the query. The change in indexes drives the results to be processed, and ordered, in the manner in which those indexes sort the data.

Listing 16. Unordered Results with Nonclustered Index

USE AdventureWorks2014
GO

SELECT SalesOrderID, CustomerID, Status
FROM Sales.SalesOrderHeader soh
GO

SELECT SalesOrderID, CustomerID, Status
FROM Sales.SalesOrderHeader soh
WHERE CustomerID IN (11020, 11021, 11022)
GO
filteroder
Figure 13. Query results demonstrating effect of filtering on order

In these examples, you looked at just a couple of the conditions that can change when it comes to how SQL Server will stream the results from a query. While an index might provide results from the query in the order desired this time, there is no guarantee that this will not change. Don’t rely on indexes to enforce ordering. Don’t rely on being clever to get the results ordered as desired. Rely on ORDER BY statements to get the results ordered as needed.

For all your application development needs, visit www.verbat.com for a fiscally conscious proposal that meets your needs ( So I can keep this blog going as well!!!!)

Alternatively click through the link   if you found this article interesting. (This will help the companies Search engine rankings)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Website Powered by WordPress.com.

Up ↑

%d bloggers like this: