SQL Server Index Myths
Myths result in an unnecessary burden when attempting to build an index. Knowing the myths associated with indexes can prevent you from using indexing strategies that will be counterproductive.
One of the problems that people encounter when building databases and indexes is dealing with myths. Indexing myths originate from many different places. Some come from previous versions of SQL Server and its tools or are based on former functionality. Others come from the advice of others, based on conditions in a specific database that don’t match those of other databases.
The trouble with indexing myths is that they cloud the water of indexing strategies. In situations where an index can be built to resolve a serious performance issue, a myth can sometimes prevent the approach from being considered. Throughout the next few sections, I’ll cover a number of myths regarding indexing and do my best to dispel them.
The following are the indexing myths discussed in this chapter:
- Databases don’t need indexes.
- Primary keys are always clustered
- Online index operations don’t block
- Any column can be filtered in multi-column indexes
- Clustered indexes store records in physical order.
- Indexes always output in the same order.
- Fill factor is applied to indexes during inserts
- Deleting form heaps results in unrecoverable space
- Every table should be a heap or have a clustered index
When reviewing myths, it’s also a good idea to take a look at best practices. Best practices are like myths in many ways, in the sense that they are commonly held beliefs. The primary difference is that best practices stand up to scrutiny and are useful recommendations when building indexes. This chapter will examine the following best practices:
- Use clustered indexes on primary keys by default.
- Balance index count.
- Properly target database level fill factors.
- Properly target index level fill factors.
- Index foreign key columns.
- Index to your environments
Index Best Practices
Similar to myths are the indexing best practices. A best practice should be considered the default recommendations that can be applied when there isn’t enough information available to validate proceeding in another direction. Best practices are not the only option and are just a place to start from when working with any technology.
When using a best practice provided from someone else, such as those appearing in this chapter, it is important to check them out for yourself first. Always take them with a grain of salt. You can trust that best practices will steer you in the correct direction, but you need to verify that it is appropriate to follow the practice.
Given the preceding precautions, there are a number of best practices that can be considered when working with indexes. This section will review these best practices and discuss what they are and what they mean.
Use Clustered Indexes on Primary Keys by default
The first best practice is to use clustered indexes on primary keys by default. This may seem to run contrary to the seventh myth presented in this chapter. Myth 7 discussed whether to choose clustered indexes or heaps as a matter of doctrine. Whether the database was built with one or the other, the myth would have you believe that if your table design doesn’t match the myth, it should be changed regardless of the situation. This best practice recommends using clustered indexes on primary keys as a starting point.
By clustering the primary key of a table by default, there is an increased likelihood that the indexing choice will be appropriate for the table. As stated earlier in this chapter, clustered indexes control how the data in a table is stored. Many primary keys, possibly most, are built on a column that utilizes the identity property that increments as each new record is added to the table. Choosing a clustered index for the primary key will provide the most efficient method to access the data.
Balance Index Count
As previously discussed in this book, indexes are extremely useful for improving the performance when accessing information in a record. Unfortunately, indexes are not without costs. The costs to having indexes go beyond just space within your database. When you build an index, you need to consider some of the following:
- How frequently will records be inserted or deleted?
- How frequently will the key columns be updated?
- How often will the index be used?
- What processes does the index support?
- How many other indexes are on the table?
These are just some of the first considerations that need to be accounted for when building indexes. After the index is built, how much time will be spent updating and maintaining the index? Will you modify the index more frequently than the index is used to return results for queries?
The trouble with balancing the index count on a table is that there is no precise number that can be recommended. Deciding on the number of indexes that it makes sense to have on an index is a per-table decision. You don’t want too few, which may result in excessive scans of the clustered index or heap to return results. Also, the table shouldn’t have too many indexes, where more time is being spent keeping the index current than returning results. As a rule of thumb, if a table has more than ten indexes on it in a transactional system, it is increasingly likely that there are too many indexes on the table.
Specify Fill Factors
Fill factor controls the amount of free space left on the data pages of an index after an index is built or defragmented. This free space is made available to allow for records on the page to expand with the risk that the change in record size may result in a page split. This is an extremely useful property of indexes to use for index maintenance. Modifying the fill factor can mitigate the risk of fragmentation.
Database Level Fill Factor
As already mentioned, one of the properties of SQL Server is the option to set a default fill factor for indexes. This setting is a SQL Server–wide setting and can be altered in the properties of SQL Server on the Database Properties page. By default, this value is set to zero, which equates to 100. Do not modify the default fill factor to anything other than 0, or 100, which has the same impact. Doing so will change the fill factor for every index in the database to the new value; this will add the specified amount of free space to all indexes the next time indexes are created, rebuilt, or reorganized.
On the surface this may seem like a good idea, but this will blindly increase the size of all indexes by the specified amount. The increased size of the indexes will require more I/O to perform the same work as before the change. For many indexes, making this change would result in a needless waste of resources.
Index Level Fill Factor
At the index level, you should modify the fill factor for indexes that are frequently becoming heavily fragmented. Decreasing the fill factor will increase the amount of free space in the index and provide additional space to compensate for the changes in record length leading to fragmentation. Managing fill factor at the index level is appropriate since it provides the ability to tune the index precisely to the needs of the database.
Index Foreign Key Columns
When a foreign key is created on a table, the foreign key column in the table should be indexed. This is necessary to assist the foreign key in determining which records in the parent table are constrained to each record in the referenced table. This is important when changes are being made against the referenced table. The changes in the referenced table may need to check all the rows that match the record in the parent table. If an index does not exist, then a scan of the column will occur. On a large parent table, this could result in a significant amount of I/O and potentially some concurrency issues.
An example of this issue would be a state and address table. There would likely be thousands or millions of records in the address table and maybe a hundred records in the state table. The address table would include a column that is referenced by the state table. Consider whether one of the records in the state table needed to be deleted. If there wasn’t an index on the foreign key column in the address table, then how would the address table identify the rows that would be affected by deleting the state record? Without an index, SQL Server would have to check every record in the address table. If the column is indexed, SQL Server would be able to perform a range scan across the records that match to the value being deleted from the state table.
By indexing your foreign key columns, performance issues, such as the one described in this section, can be avoided. The best practice with foreign keys is to index their columns.
Index to Your Environment
The indexing that exists today will likely not be the indexing that will be needed in databases in the future. For this reason, the last best practice is to continuously review, analyze, and implement changes to the indexes in your environment. Realize that regardless of how similar two databases are, if the data in the databases is not the same, then the indexing for the two databases may also be different.
For all your application development needs, visit www.verbat.com for a fiscally conscious proposal that meets your needs ( So I can keep this blog going as well!!!!)
Alternatively click through the link if you found this article interesting. (This will help the companies Search engine rankings)
Leave a Reply