Indexing Tool : Missing Index Dynamic Management Objects (DMO)


Indexing Tools

When it comes to indexing, Microsoft has two tools built into SQL Server that can be used to help identify indexes that can improve database performance. These are the missing index dynamic management objects (DMOs) and the Database Engine Tuning Advisor (DTA- Not covered in this article). Both tools are useful to assist with indexing databases and can provide valuable input when working on tuning a database.

Missing Index DMOs

The missing index DMOs are a set of management objects that provide feedback from the query optimizer. When the query optimizer compiles an execution plan, it can identify when materializing statistics into a physical index would improve performance. In these situations, the query optimizer will compile the results and store the information in the missing index DMOs.

There are a couple of benefits that the missing index DMOs provide. First, the missing index information is collected from the query optimizer without any action required on your part. Unlike Extended Events and other performance monitoring tools, you don’t need to configure and enable it in order for information to be collected. The other thing to consider is that the missing index information is based on actual activity occurring on the SQL Server instance. The index suggestions aren’t based on a test load you believe might happen in production but rather on the production load itself. As the usage patterns of the data in a database change, so too will the missing index recommendations.

Despite the benefits provided by the missing index DMOs, you must take into account a few considerations when using them. The limitations on the missing index DMOs can be summarized into the following categories:

  • Size of queue
  • Depth of analysis
  • Accuracy
  • Type of indexes

The size of the queue for missing indexes is one of the limitations that is easy to miss. Regardless of the number of databases on the SQL Server instance, there can be no more than 500 missing index groups. Once 500 missing index groups have been identified, the query optimizer will stop reporting new missing index suggestions. It will not make any determinations to decide whether a new possible missing index is of better quality than items already reported; the information is just not collected.

Note  As with other dynamic management objects, the information within the missing index DMOs resets when SQL Server restarts and gets dropped for a database whenever the database is brought offline.

When considering the information in missing indexes, the depth of the analysis is a limitation that needs to be considered whenever you are reviewing the suggestions. The query optimizer considers only the current plan and whether the missing index would benefit the execution plan. Sometimes, adding the missing index to the database will result in a new plan with a new missing index suggestion. These suggestions are only a first pass at improving performance on an execution plan. The other half of this limitation is that the missing index details don’t include tests to determine whether the order of the columns in the missing index suggestion is optimal. When looking at missing index suggestions, it will be necessary to test in order to determine the proper column order.

The third limitation of the missing index suggestion is the accuracy of the information returned with the statistics. There are two things that need to be considered with this limitation. First, when the queries use inequality predicates, the cost information is less accurate than those returned with equality predicates. Second, it is possible to return the same missing index suggestion with multiple cost estimates. How and where the missing index would be leveraged may change the cost estimate that is calculated. For each cost estimate, a missing index suggestion will be logged.

Lastly, the missing index tool is limited in the types of indexes it can suggest. The main limitation is index types and the inability of missing indexes to suggest clustered, XML, spatial, or columnstore indexes. The suggestions also will not include information on when to make an index filtered. Along these same lines, suggestions may, at times, contain only INCLUDE columns. When this happens, one of the INCLUDE columns will need to be designated as the key column.

Note  Missing index information for a table will be dropped whenever there are metadata operations made on the table. For instance, when a column is added to a table, the missing index information will be dropped. A less obvious example is when an index on a table changes. In this case as well, the missing index information will be dropped.

Explaining the DMOs

There are four DMOs that can be used to return information on missing indexes. Each DMO provides a portion of the information needed to build indexes that the query optimizer can use to improve the performance of a query. The DMOs for missing indexes are as follows:

  • sys.dm_db_missing_index_details
  • sys.dm_db_missing_index_columns
  • sys.dm_db_missing_index_group_stats
  • sys.dm_db_missing_index_group

In the next four sections, I’ll review each of the dynamic management objects and look at how each provides information on how to identify missing indexes.

sys.dm_db_missing_index_details

The DMO sys.dm_db_missing_index_details is a dynamic management view that returns a list of missing index suggestions. Each row in the dynamic management view (DMV) provides a single suggested missing index. The columns in Table 1 provide information on the database and the table to create the index on. It also includes the columns that should comprise the key and the included columns for the index.

Table1. Columns in sys.dm_db_missing_index_details

Column Name

Data Type

Description

index_handle

int

Unique identifier for each missing index suggestions. This is the key value for this DMV.

database_id

smallint

Identifies the database where the table with the missing index resides.

object_id

int

Identifies the table where the index is missing.

equality_columns

nvarchar(4000)

Comma-separated list of columns that contribute to equality predicates.

inequality_columns

nvarchar(4000)

Comma-separated list of columns that contribute to inequality predicates.

included_columns

nvarchar(4000)

Comma-separated list of columns needed as covering columns for the query.

statement

nvarchar(4000)

Name of the table where the index is missing.

There are two columns in sys.dm_db_missing_index_details that are used to identify key columns on missing index suggestions. These are equality_columns and inequality_columns. The equality_columns are generated when there is a comparison in the query plan that makes a direct comparison. For instance, when the filter for a query is ColumnA = @Parameter, this is an equality predicate. The inequality_columns details are created when any nonequal filter is used in a query plan. Examples of this are when there are greater than, less than, or NOT IN comparisons being used.

When it comes to the included_columns information, this is generated when there are columns that are not part of the filter but that would be used to allow the index to cover the query request using a single index. Suffice it to say, the use of included columns will help prevent the query plan from having to use a key lookup in the execution plan if the missing index is created.

sys.dm_db_missing_index_columns

The next DMO is sys.dm_db_missing_index_columns, which is a dynamic management function (DMF). This function returns a list of columns for each missing index listed in sys.dm_db_missing_index_details. To use the DMF, an index_handle is passed into the function as a parameter. Each row in the resultset represents a column in the missing index suggestion from sys.dm_db_missing_index_details and repeats the information in equality_columns, inequality_columns, and included_columns. Table 2 lists the output for sys.dm_db_missing_index_columns.

Table 2. Columns in sys.dm_db_missing_index_columns

Column Name

Data Type

Description

column_id

int

ID of the column

column_name

sysname

Name of the table column

column_usage

varchar(20)

Description of how the column will be used in the index

The primary information in this DMF is the column_usage column. For every row, this column will return one of the following values: EQUALITY, INEQUALITY, or INCLUDE. These values map to equality_columns, inequality_columns, and included_columns in sys.dm_db_missing_index_details. Depending on the type of usage in the former DMV, the use will be the same for this DMF.

sys.dm_db_missing_index_groups

The DMV sys.dm_db_missing_index_groups is the next missing index DMO. The DMV returns a list of missing index groups paired with missing index suggestions. Table 3 lists the columns for sys.dm_db_missing_index_groups. Although this DMV supports the ability for many-to-many relationships within missing index suggestions, they are always made in a one-to-one relationship.

Table 3. Columns in sys.dm_db_missing_index_groups

Column Name

Data Type

Description

index_group_handle

int

Identifies a missing index group. This value joins to group_handle in sys.dm_db_missing_index_group_stats.

index_handle

int

Identifies a missing index handle. This value joins to index_handle in sys.dm_db_missing_index_details.

 sys.dm_db_missing_index_group_stats

The last missing index DMO is the DMV sys.dm_db_missing_index_group_stats. The information in this DMV contains statistics on how the query optimizer would expect to use the missing index if it were built. From this, using the columns in Table 4, you can determine which missing indexes would provide the greatest benefit and the scope to which the index will be used.

Table 4. Columns in sys.dm_db_missing_index_group_stats

Column Name

Data Type

Description

group_handle

int

Unique identifier for each missing index group. This is the key value for this DMV. All queries that would benefit from using the missing index group are included in this group.

unique_compiles

bigint

Count of the execution plan compilations and recompilations that would benefit from this missing index group.

user_seeks

bigint

Count of seeks in user queries that would have occurred if the missing index had been built.

user_scans

bigint

Count of scans in user queries that would have occurred if the missing index had been built.

last_user_seek

datetime

Date and time of last user seek from user queries that would have occurred if the missing index had been built.

last_user_scan

datetime

Date and time of last user scans from user queries that would have occurred if the missing index had been built.

avg_total_user_cost

float

Average cost of the user queries that could be reduced by the index in the group.

avg_user_impact

float

Average percentage benefit that user queries could experience if this missing index group had been implemented.

system_seeks

bigint

Count of seeks in system queries that would have occurred if the missing index had been built.

system_scans

bigint

Count of scans in system queries that would have occurred if the missing index had been built.

last_system_seek

datetime

Date and time of last system seek from system queries that would have occurred if the missing index had been built.

last_system_scan

datetime

Date and time of last system scans from system queries that would have occurred if the missing index had been built.

avg_total_system_cost

float

Average cost of the system queries that could be reduced by the index in the group.

avg_system_impact

float

Average percentage benefit that system queries could experience if this missing index group had been implemented.

Using the DMOs

Now that the missing index DMOs have been explained, it is time to look at how they can be used together to provide missing index suggestions. You may have noticed that the results of the missing index DMOs have been referred to as suggestions instead of recommendations. This variation in wording is intentional. Typically, when someone receives a recommendation, it is fully thought through and ready to be implemented. This is not so with the missing index DMOs; thus, they are referred to as suggestions.

With the suggestions from the missing index DMOs, you have a starting point to begin looking at and building new indexes. There are two things that are important to consider when looking at missing index suggestions. First, variations of each missing index suggestion may appear multiple times in the results. It is not recommended that each of these variations be implemented. Common patterns within the suggestions should be found. An index that covers a few of the suggestions is usually ideal. Second, when more than one column is suggested, the order of the columns needs to be tested to determine which is optimal.

To help explain how the missing index DMOs work and are related to one another, I’ll walk you through an example that includes a few SQL statements. These statements, shown in Listing 1, execute a few queries against the SalesOrderHeader table in the AdventureWorks2014 database. For each of the queries, the filtering is on either the DueDate or OrderDate column, or both.

If you examine the execution plan for any of the example queries, you’ll see that they each use a clustered index scan to satisfy the query. Figure 1 shows the execution plan for the first query. In this execution plan, there is an indication that there is a missing index that could help improve the performance of the query.

qry1

Figure 1. Execution plan for missing index SELECT query

To see more details on this missing index suggestion, you need to look at the missing index DMOs. A query against the missing index DMOs will look similar to Listing 2. The query includes the equality, inequality, and included column information that was described earlier. The query includes two calculations not previously described: the calculations for Impact and Score.

The Impact calculation helps identify missing index suggestions that will have the highest overall impact across multiple query executions. This is calculated by adding the potential seeks and scans on the missing index based on the average impact; the resulting value represents the total improvement across all queries that might have used the index. The higher the value, the more improvement the index could provide.

The Score calculation also helps to identify missing index suggestions that will improve query performance. The difference between Impact and Score is the inclusion of the average total user cost. For the Score calculation, the average total user cost is multiplied by the Impact score and divided by 100. The inclusion of the cost value helps differentiate between expensive and inexpensive queries when deciding whether to consider the missing index. For instance, a missing index suggestion that provides an 80 percent improvement on queries with an average cost value of 1,000 would likely provide a better return that a 90 percent improvement for a query with an average cost value of 1.

Figure 2 shows some results from executing this query.

fig2

Figure 2. Results from missing index query

With the results from the missing index query, shown in Figure 2, there are a few items to consider from these suggestions. First, there are quite a few similarities between the suggestions. The predicate columns between each of the suggestions include the OrderDate and DueDate, except for one missing index. Since the column order has not been tested, the optimal column order could go either way. To satisfy the missing index suggestion, one possible index could have the key column DueDate followed by OrderDate. This configuration would create an index that would satisfy all four of the missing index items.

The next item to look at is included_columns. For two of the suggestions, there are included_columns values listed. On the fourth missing index suggestion, it suggests including the column OrderDate. Since it will be one of the key columns of the index, it doesn’t need to be included. The other column, from the third missing index suggestion, is the CustomerID column. While only one index needs this column, as an included column, the addition of this column would likely be negligible since it is a narrow column. You would also want to add this column to the index.

After looking at these results, you’ve seen four missing index suggestions and ended up with a suggestion for one index that can cover all four of the missing index items. If you build the index using a DDL statement similar to that in Listing 3, you will end up with an index that solves these missing indexes. If you execute the queries in Listing 1 again, you can see this for yourself.

Excerpt from the book “Expert Performance Indexing in SQL Server by Jason Strate & Grant Fritchey published by APRESS

An Excellent read if you are interested.

Related Article:  Database Engine Ttuning Advisor

For all your application development needs, visit www.verbat.com for a fiscally conscious proposal that meets your needs ( So I can keep this blog going as well!!!!)

Alternatively click through the link   if you found this article interesting. (This will help the companies Search engine rankings)

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s