june 27, 2024
Index fragmentation removal

Index fragmentation occurs when data that is logically stored sequentially on a disk is actually stored in a random order. Any changes to the data increase fragmentation.
In our practice, we encountered a case where the average index fragmentation in a client’s database exceeded 50%, which is significantly higher than the critical threshold of 30%.
Such high fragmentation considerably increases the time it takes to read data from a table. In this article, we will discuss what fragmentation is, why it is problematic, and the methods to eliminate it.
The emergence of fragmentation
The database file, which stores the data, is logically divided into equal parts of 8 KB. These parts are called pages. When attempting to insert or update data that will cause a page to overflow, SQL Server splits it roughly in half. After this operation, the data is stored on two half-empty pages. This splitting causes logically adjacent pages to be physically located far apart from each other. Figure 1 illustrates an attempt to insert a new value «3» into a full page. Since there is no space on the page, a new one is created, onto which half of the data («4» and «5») from the first page is moved, resulting in two partially filled pages.

Figure 1. Page splitting during insertion
The newly created page is located in a random place on the disk, so the data is not physically stored sequentially. This non-sequential arrangement of pages is called external fragmentation. The split pages contain empty space, which is called internal fragmentation. Both external and internal fragmentation increase the time it takes to read data.
Index fragmentation leads to increased time for searching and reading data. If data is stored on 200 pages but only occupies space equivalent to 100 pages, the reading process will be approximately twice as slow. Therefore, it is important to monitor and eliminate fragmentation.
External fragmentation
Information about index fragmentation can be obtained from the table-valued function sys.dm_db_index_physical_stats. You can specify the database ID, table ID, and index ID for this function. The following query finds indexes in the current database with a fragmentation percentage greater than 5%.

In our practice, we consider an index to have high fragmentation if it exceeds 5%. If the fragmentation level surpasses 30%, the index is considered to have a critical level of fragmentation.
Internal fragmentation
The fill factor determines the fullness of pages when an index is created. With a fill factor of 20, pages will initially be filled to 20%. This reduces the number of page split operations and external fragmentation but increases read times by five times and increases internal fragmentation. By default, the fill factor is set to 0, which means pages are fully filled when the index is created.
In most cases, the benefits of reducing the number of page split operations are offset by the increased read times, so it is recommended to leave the fill factor at 0 or 100.
You can change the fill factor value in the index settings, as shown in Figure 2.

Figure 2. Changing the page fill factor
Correcting fragmentation
To eliminate fragmentation, you need to rebuild or reorganize the index. Rebuilding removes and recreates the index. This process can require locking the table, making it impossible to read from it during the operation. As a result of rebuilding, the newly created index will have almost no fragmentation. You can recreate the index with the following query:

This query rebuilds the indexes of the table named <object>. You can rebuild a specific index by specifying its name (index_name), or all indexes of the table by specifying ALL.
Reorganization arranges the pages of an existing index according to their logical order. This operation requires fewer resources than rebuilding the index and does not necessitate table locking. The query to reorganize an index is as follows:

You can also reorganize all or a specific index of the table <object>.
Conclusion
It is impossible to completely avoid index fragmentation. Therefore, it is important to continuously monitor this process. It is rational to automate index updates. In our projects, the best practice is to check for fragmentation and rebuild indexes on a scheduled basis (daily).
This approach reduces the space occupied by indexes by half and increases data read speeds by 1.5 to 2.5 times for tables where index fragmentation was critical.