Missing Stats DMV?
Useful statistics are critical for great performance in SQL Server. So much so, that by default, SQL Server creates statistics for columns it deems worthy without any help or interjection from anyone.1 However, SQL Server isn’t yet quite smart enough to automatically create multi-column stats objects that might be helpful. It would be great to have a built-in missing stats DMV. For those wondering if multi-column stats are actually used by SQL Server’s cardinality estimator, this Microsoft blog post shows SQL Server 2016 compatibility level 130 making use of multi-column stats to provide better row estimates.
SQL Server does have a fairly useful dynamic management view, or DMV, which provides insight that can be leveraged in this area. The DMV I’m talking about is the set of DMVs around missing indexes, consisting of sys.dm_db_missing_index_groups
, sys.dm_db_missing_index_details
, etc. I’m not saying the missing indexes DMVs are a panacea that will enable you to fix every performance situation you run into, but they can be useful if you know where to look. This post doesn’t go into a lot of depth about how to use those DMVs for the purpose of actually creating indexes, however I will show you how you can create multi-column stats objects as an interim performance booster while evaluating the need for those indexes.
The script below compares the details provided by the missing index DMVs against the existing sys.stats
and sys.stats_columns
views to provide a list of potential new stats.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 |
;WITH suggested_indexes AS ( SELECT ObjectName = QUOTENAME(s.name) + N'.' + QUOTENAME(o.name) , Columns = STUFF(( SELECT N', ' + QUOTENAME(mic.column_name) FROM sys.dm_db_missing_index_columns(mig.index_handle) mic WHERE mic.column_usage = N'EQUALITY' OR mic.column_usage = N'INEQUALITY' ORDER BY mic.column_name FOR XML PATH(N'') ), 1, 2, '') FROM sys.dm_db_missing_index_groups mig INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle INNER JOIN sys.objects o ON mid.object_id = o.object_id INNER JOIN sys.schemas s ON o.schema_id = s.schema_id WHERE mid.database_id = DB_ID() ) , existing_stats AS ( SELECT ObjectName = QUOTENAME(sch.name) + N'.' + QUOTENAME(o.name) , StatsName = s.name , AutoCreated = s.auto_created , UserCreated = s.user_created , StatsColumns = STUFF((SELECT N', ' + QUOTENAME(c.name) FROM sys.stats_columns sc INNER JOIN sys.columns c ON sc.object_id = c.object_id AND sc.column_id = c.column_id WHERE s.object_id = sc.object_id AND s.stats_id = sc.stats_id ORDER BY sc.stats_column_id FOR XML PATH(N'') ), 1, 2, N'') , FilterDefinition = CASE WHEN s.has_filter = 1 THEN s.filter_definition ELSE N'' END FROM sys.stats s INNER JOIN sys.objects o ON s.object_id = o.object_id INNER JOIN sys.schemas sch ON o.schema_id = sch.schema_id WHERE o.is_ms_shipped = 0 ) SELECT * , CreateStmt = N'CREATE STATISTICS [stats_name_' + CONVERT(nvarchar(30), CONVERT(bigint, CRYPT_GEN_RANDOM(4))) + N'] ON ' + si.ObjectName + N'(' + si.Columns + N') WITH FULLSCAN;' FROM suggested_indexes si WHERE NOT EXISTS (SELECT 1 FROM existing_stats es WHERE es.ObjectName = si.ObjectName AND es.StatsColumns = si.Columns ) GROUP BY si.ObjectName , si.Columns ORDER BY si.ObjectName , si.Columns; |
The results show two columns, ObjectName
and Columns
. These columns can be used in a CREATE STATISTICS
statement. To save a small amount of time, the third column provides a template CREATE STATISTICS statement. Ensure you don’t just blindly create all statistics presented here; do a quick sanity check to ensure they make sense for your environment.
Let me know if this “missing stats DMV” is useful, or if you think I’m crazy for doing this.
See the rest of our series on performance.
1 Assuming the database has auto_create_stats
enabled. See https://docs.microsoft.com/en-us/sql/relational-databases/statistics/statistics?view=sql-server-2017#AutoUpdateStats for more details.
[…] Hannah Vernon shows how you can use the missing index DMVs to find potential candidates for multi-colum…: […]