Recompress Heaps
Heaps, or tables without a clustered index, suffer from a little-known problem when configured with page compression. Page compression only takes place once a page is 100% full, i.e. when no more rows can be added to the page. However, since the page header for heap pages isn’t reliably updated to indicate the remaining free space on the page, most pages in a heap will never become 100% full, and hence page compression will never run on those pages. For a table configured with page compression, row compression is actually used for each row added to the page. Once there is no more room for new rows on the page, SQL Server uses page compression on the page. Solomon Rutzky’s answer on dba.stackexchange.com has some great details about heap compression.
The script below allows you to recompress heaps; it automates a REBUILD action for page-compressed-heaps that have a minimum percentage of compressed pages in the heap. You can configure a cut-off, such that heaps where 90% of the pages are page-compressed will not be rebuilt, but any heap with less than that will be rebuilt. The act of rebuilding a page-compressed heap automatically fills each page to capacity, which causes page compression to take place. As an aside, the rebuild operation for a heap also triggers a rebuild of all non-clustered indexes defined on the table, thereby causing any page-compressed non-clustered index to be re-compressed, which may result in further space savings.
Here’s the script:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 |
/* Re-compresses page-compressed heaps that have less than "x" percentage of compressed pages. Pages that are added to heaps are not page compressed until the page is completely full. Completely full for a heap almost never happens unless the page data is inserted via BULK INSERT with a TABLOCKX. By: Hannah Vernon Date: 2019-08-21 */ SET NOCOUNT ON; DECLARE @DebugOnly bit = 0; /* Set to 1 to see what actions would be taken. Set to 0 to actually rebuild heaps. */ DECLARE @CutOffPercentage decimal(5,2) = 10.0; /* any heap where the percentage of uncompressed pages is greater than this number will be rebuilt */ DECLARE @OnlineOnly bit = 1; /* favor ONLINE=ON rebuild operations where possible */ DECLARE @ObjectName nvarchar(261); DECLARE @ObjectID int; DECLARE @CountOfPages bigint; DECLARE @CountOfCompressedPages bigint; DECLARE @ObjectOfflineOnly bit; DECLARE @cmd nvarchar(max); DECLARE @MaxDOP int; DECLARE @msg nvarchar(1000); SET @MaxDOP = ( --set our MaxDOP to half of the configured value, or 1 if the sysadmin --hasn't explicity configured the value SELECT COALESCE(CONVERT(int, c.value), 2) FROM sys.configurations c WHERE c.name = N'max degree of parallelism' ) / 2; DECLARE cur CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR SELECT ObjectName = QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) , ObjectID = t.object_id , OfflineOnly = COALESCE( ( SELECT 1 FROM sys.columns c INNER JOIN sys.types ty ON c.system_type_id = ty.system_type_id WHERE c.object_id = t.object_id AND ( ty.name = N'ntext' --none of these column types can be compressed online OR ty.name = N'text' OR ty.name = N'image' ) ), 0) FROM sys.tables t INNER JOIN sys.schemas s ON t.schema_id = s.schema_id INNER JOIN sys.indexes i ON t.object_id = i.object_id AND i.index_id = 0 /* only heaps */ WHERE t.is_ms_shipped = 0 /* only user heaps */ AND EXISTS ( /* only compressed heaps */ SELECT 1 FROM sys.partitions p WHERE t.object_id = p.object_id AND i.index_id = p.index_id AND p.data_compression_desc = N'PAGE' ) ORDER BY QUOTENAME(s.name) + N'.' + QUOTENAME(t.name); OPEN cur; FETCH NEXT FROM cur INTO @ObjectName, @ObjectID, @ObjectOfflineOnly; WHILE @@FETCH_STATUS = 0 BEGIN SET @CountOfPages = 0; SET @CountOfCompressedPages = 0; SET @msg = N'Inspecting pages for ' + @ObjectName; RAISERROR (@msg, 0, 1) WITH NOWAIT; SELECT @CountOfCompressedPages = SUM(dpa.is_page_compressed) , @CountOfPages = COUNT_BIG(1) FROM sys.dm_db_database_page_allocations(DB_ID(), @ObjectID, NULL, NULL, 'DETAILED') dpa LEFT JOIN sys.databases d ON dpa.database_id = d.database_id LEFT JOIN sys.objects o ON dpa.object_id = o.object_id LEFT JOIN sys.indexes i ON dpa.object_id = i.object_id AND dpa.index_id = i.index_id; SET @msg = @ObjectName + N' has ' + CONVERT(nvarchar(100), @CountOfPages) + N', of which ' + CONVERT(nvarchar(100), @CountOfPages - @CountOfCompressedPages) + N' are not page-compressed. Ratio is ' + CONVERT(nvarchar(100), (CONVERT(decimal(38,0), @CountOfPages - @CountOfCompressedPages) / CONVERT(decimal(38,0), @CountOfPages)) * 100); RAISERROR (@msg, 0, 1) WITH NOWAIT; IF (CONVERT(decimal(38,0), @CountOfPages - @CountOfCompressedPages) / CONVERT(decimal(38,0), @CountOfPages)) * 100 > @CutOffPercentage BEGIN SET @cmd = N'ALTER TABLE ' + @ObjectName + N' REBUILD WITH ( DATA_COMPRESSION = PAGE , ONLINE = ' + CASE WHEN @OnlineOnly = 1 AND @ObjectOfflineOnly = 0 THEN N'ON' ELSE N'OFF' END + N' , SORT_IN_TEMPDB = OFF , MAXDOP = ' + CONVERT(nvarchar(10), @MaxDOP) + N' ); ' IF @DebugOnly = 1 BEGIN RAISERROR (@cmd, 0, 1) WITH NOWAIT; END ELSE BEGIN SET @msg = N'Rebuilding ' + @ObjectName; RAISERROR (@msg, 0, 1) WITH NOWAIT; EXEC sys.sp_executesql @cmd; END END FETCH NEXT FROM cur INTO @ObjectName, @ObjectID, @ObjectOfflineOnly; END CLOSE cur; DEALLOCATE cur; |
For a script that compresses tables over a certain size where compression would be beneficial, check this prior post.
Let me know if you have any questions or suggestions about this script that can be used to automatically recompress heaps. Also, check out the rest of our SQL Server Tools.