Total Pageviews

Thursday, July 4, 2013

Index optimization - REBUILD vs. REORGANIZE

Index optimization is probably one of the most critical task every database support personnel has to perform on a regular basis. Based on DML operations in a particular environment, we adopt various optimization tasks, steps and strategies that suit our needs. Some tables or indexes may need frequent optimization, some do not need it at all for a longer period of time.

To optimize an index we have two options, REBUILD and REORGANIZE. Both work differently and have different effects. There are some differences which we should be aware of for better understanding of how each T-SQL command works and what does it do for us.

Good to Know some key points:
1.      When a non-clustered index is rebuilt, only the associate statistics for this index will be rebuilt.
2.      Rebuilding a clustered index does not rebuild associated non-clustered indexes unless the keyword ALL is specified.
3.      “HEAP” cannot be optimized. If “ALL” is specified and the underlying table is a heap, the rebuild operation has no effect on the table. Any non-clustered indexes associated with the table are rebuilt.
4.      The rebuild operation can be minimally logged if the database recovery model is set to either bulk-logged or simple.
5.      The options ONLINE = ON and IGNORE_DUP_KEY = ON are not valid while rebuilding an XML index or a spatial index.
6.      “STATISTICS_NORECOMPUTE = ON” means Out-of-date statistics are not automatically recomputed. “STATISTICS_NORECOMPUTE = OFF” means automatic statistic updating is enabled.
7.      If index options are not specified, the existing index option values stored in sys.indexes will be used.
8.      ALTER INDEX cannot be used to repartition an index or move it to a different filegroup. This statement cannot be used to modify the index definition, such as adding or deleting columns or changing the column order.
9.      The values for ONLINE, MAXDOP, and SORT_IN_TEMPDB are not stored in the system catalog. You need to specify this OPTION in the index rebuild statement.
10. Reorganizing an index is always performed online. The process does not hold locks long term hence it does not block queries or updates that are running.
11. When you execute ALTER INDEX ALL … on a table, only the statistics associated with indexes are updated. Automatic or manual statistics created on the table will not be updated.
12. Index REBUILD can be a Parallel operation. Index REORGANIZE is always serial operation.
13. Rebuilding or reorganizing small indexes (which has 128 or less extents) often does not reduce fragmentation.
14. Reorganizing an index uses minimal system resources and also compacts the index pages.
15. Reorganizing an index does not update associate statistics.

Index Optimization Guideline:
The guideline that Microsoft has provided in the MSDN is a general guideline regardless of any DML operations happening in the database which need to be further reviewed by the database administrator based on his/her workload scenario to establish a better threshold.

The sys.dm_db_index_physical_stats can be used to determine fragmentation levels in a specific index, in all indexes on a table or indexed view, in all indexes in a database, or in all indexes in all databases. The following table explains three important columns of the system function which need to be researched closely:

Column
Description
avg_fragmentation_in_percent
The percent of logical fragmentation (out-of-order pages in the index).
fragment_count
The number of fragments (physically consecutive leaf pages) in the index.
avg_fragment_size_in_pages
Average number of pages in one fragment in an index.

Action threshold recommended by Microsoft.
avg_fragmentation_in_percent
T-SQL Command
> 5% and < = 30%
ALTER INDEX REORGANIZE
> 30%
ALTER INDEX REBUILD (All Edition)
ALTER INDEX REBUILD WITH (ONLINE = ON) (Enterprise Edition)
Number of Extents > 128
Will be a good candidate for index optimization

The above threshold is a recommendation only. As every environment is different therefore it is a good idea to research the best threshold that will suit your need.

DMV Query:
The following DMV query can be used to pull detail information about indexes.
/*********************************************************************************
Script: Index Fragmentation Status (includes Partitioned Tables/Indexes)
**********************************************************************************/
select  schema_name(o.schema_id) as [schema_name] ,
        object_name(o.object_id) as [table_name] ,
        i.name as [index_name] ,
        i.type_desc as [index_type] ,
        dmv.page_count ,
        dmv.fragment_count ,
        round(dmv.avg_fragment_size_in_pages, 2, 2) [avg_fragment_size_in_pages] ,
        round(dmv.avg_fragmentation_in_percent, 2, 2) [avg_fragmentation_in_percent] ,
        case when dmv.avg_fragmentation_in_percent <= 5 then 'RELAX'
             when dmv.avg_fragmentation_in_percent <= 30 then 'REORGANIZE'
             when dmv.avg_fragmentation_in_percent > 30 then 'REBUILD'
        end as [action] ,
        stats_date(dmv.object_id, i.index_id) as stats_update_date ,
        case when isnull(ps.function_id, 1) = 1 then 'NO'
             else 'YES'
        end as partitioned ,
        coalesce(fg.name, fgp.name) as [file_group_name] ,
        p.partition_number as [partition_number] ,
        p.rows as [partition_rows] ,
        prv_left.value as [partition_lower_boundary_value] ,
        prv_right.value as [partition_upper_boundary_value] ,
        case when pf.boundary_value_on_right = 1 then 'RIGHT'
             when pf.boundary_value_on_right = 0 then 'LEFT'
             else 'NONE'
        end as [partition_range] ,
        pf.name as [partition_function] ,
        ds.name as [partition_scheme]
from    sys.partitions as p with ( readpast )
        inner join sys.indexes as i with ( readpast ) on i.object_id = p.object_id
                                                         and i.index_id = p.index_id
        inner join sys.objects as o with ( readpast ) on o.object_id = i.object_id
        inner join sys.dm_db_index_physical_stats(db_id(), null, null, null,
                                                  N'LIMITED') dmv on dmv.OBJECT_ID = i.object_id
                                                              and dmv.index_id = i.index_id
                                                              and dmv.partition_number = p.partition_number
        left join sys.data_spaces as ds with ( readpast ) on ds.data_space_id = i.data_space_id
        left join sys.partition_schemes as ps with ( readpast ) on ps.data_space_id = ds.data_space_id
        left join sys.partition_functions as pf with ( readpast ) on pf.function_id = ps.function_id
        left join sys.destination_data_spaces as dds with ( readpast ) on dds.partition_scheme_id = ps.data_space_id
                                                              and dds.destination_id = p.partition_number
        left join sys.filegroups as fg with ( readpast ) on fg.data_space_id = i.data_space_id
        left join sys.filegroups as fgp with ( readpast ) on fgp.data_space_id = dds.data_space_id
        left join sys.partition_range_values as prv_left with ( readpast ) on ps.function_id = prv_left.function_id
                                                              and prv_left.boundary_id = p.partition_number
                                                              - 1
        left join sys.partition_range_values as prv_right with ( readpast ) on ps.function_id = prv_right.function_id
                                                              and prv_right.boundary_id = p.partition_number
where   objectproperty(p.object_id, 'ISMSShipped') = 0
order by [avg_fragmentation_in_percent] DESC,
        [table_name] ,
        [index_name]

Output of the above script:

Good practice:
1.      Try not to DROP an index beforehand and re-create it again. Use ALTER INDEX WITH REBUILD.
2.      To change the index definition, use CREATE INDEX with the DROP_EXISTING clause to perform the operations.
3.      Be careful about “ALL” option.  When “ALL” is specified, all indexes on the table are dropped and rebuilt in a single transaction; Transaction Log will grow rapidly.
4.      Rebuilding indexes ONLINE might need longer time and you still see short duration blocking.
5.      Always choose off-peak hours to optimize indexes and try to use MAXDOP to take advantage of parallel index creation.

Index Optimization Script:
There are a number of automated index optimization scripts available in the net. But the following are two FREE automated scripts you can use to optimize your indexes reliably and efficiently.

Index Defrag Script, v4.1


SQL Server Maintenance Solution

References:
Reorganize and Rebuild Indexes
ALTER INDEX (Transact-SQL)

No comments:

Post a Comment