Check for index fragmentation

By | September 1, 2020

Index fragmentation is bad and makes you queries runs slower than expected, and requires additional space on you hard drive.

SQL Server stores data in 8KB pages. These 8KB pages are stored sequentially in heap tables, that is tables without any clustered index. In tables with clustered indexes (or non-clustered covered indexes) data is stored in a b-tree where each leaf is a row in the database. The difference you might say is that in heap tables data pages are unordered but in clustered tables data pages are “ordered”. That doesn’t mean that data is will be returned in any particular order when queried, it just means that SQL Server maintains some kind of order due to the nature of the b-tree.

When adding new rows to a clustered table, SQL Server will maintain the b-tree and insert rows at the end of current data page. As long as you primary key is incrementing sequentially everything will be fine. Data pages are filled and new ones are created, data is appended. The problem occur when you want to insert a row in between two other rows. If the target data page is full SQL Server will perform a page split where each page will be 50% filled. The behaviour of the page split is something you need to take into consideration when creating your indexes. If you expect many page splits but not too many rows, then you might live with empty space in data pages. But if you have a high amount of rows in the table, the empty space will start to take up too much space and that just doesn’t feel very good does it?

To find the amolunt of index fragmentation you can run the following query

SELECT OBJECT_NAME(ips.OBJECT_ID) INDEX_NAME,
 i.NAME,
 s.index_id,
 index_type_desc,
 avg_fragmentation_in_percent,
 avg_page_space_used_in_percent,
 page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') s
  INNER JOIN sys.indexes i
    ON (s.object_id = i.object_id AND s.index_id = i.index_id
ORDER BY avg_fragmentation_in_percent DESC

In the result you can both see the amount of fragmentation and the space used.

There is a general rule that index with fragmentation > 5% can be rebuilt with ALTER INDEX REORGANIZE, and that indexes with fragmentation > 30% should be ALTER INDEX REBUILD.

More information here https://docs.microsoft.com/en-us/sql/relational-databases/indexes/reorganize-and-rebuild-indexes?view=sql-server-ver15.

And here is a brief warning to you guys. UUID is very popular when you need an id in your data layer but can’t (won’t) ask the database for it. But UUID is in its nature non-sequential, e.g. NEWID() is actually non-sequential, which means that you will get many page splits using it.

So even though you can rebuild and maintain the index, that is not the only solution. Use sequential ids instead, e.g.. NEWSEQUENTIALID(). For other languages, just Google it and you will find tons of sequential UUID implementations.

That’s it I think. Mind you indexes.