CHECKPOINT and Simple Recovery Model
This article shows how checkpoints behave under the simple recovery model. For databases configured with the simple recovery model, it's important to understand how checkpoints work [1] so you can avoid running out of log space. Checkpointing is the process of ensuring that changes recorded in memory have been written to the data file(s) of the affected database. Microsoft defines the checkpoint process like this:
… the Database Engine performs modifications to database pages in memory (in the buffer cache) and does not write these pages to disk after every change. Rather, the Database Engine periodically issues a checkpoint on each database. A checkpoint writes the current in-memory modified pages (known as dirty pages) and transaction log information from memory to disk…
The CHECKPOINT command can be used to manually force a checkpoint, which can be useful when a rolled-back transaction has left the log full.
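For reference, the manual command is just the statement itself. The optional duration argument in the sketch below is a request, not a hard limit.

```sql
-- Issue a manual checkpoint in the current database.
CHECKPOINT;

-- Optionally request that the checkpoint complete within roughly 10 seconds.
-- SQL Server treats the duration as a target, not a guarantee.
CHECKPOINT 10;
```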
The code in this post shows how rolling back a transaction may not actually free the used log space, even under the simple recovery model. This is true even with indirect checkpoints, which are the default for databases created on SQL Server 2016 and newer.
Monitoring for Checkpoints
Since we’re interested in the checkpoint process, you might want to create an Extended Events session to track checkpoints as detailed in this code.
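The linked session definition isn't reproduced here, but a minimal sketch of such a session might look like the following. The session name TrackCheckpoints is arbitrary, and the sketch assumes the sqlserver.checkpoint_begin and sqlserver.checkpoint_end events with a ring_buffer target.

```sql
-- Minimal sketch: track checkpoint activity server-wide.
-- The session name (TrackCheckpoints) and event choices are assumptions,
-- not the exact session from the linked code.
IF EXISTS (SELECT 1 FROM sys.server_event_sessions WHERE name = N'TrackCheckpoints')
    DROP EVENT SESSION TrackCheckpoints ON SERVER;
GO

CREATE EVENT SESSION TrackCheckpoints ON SERVER
    ADD EVENT sqlserver.checkpoint_begin,
    ADD EVENT sqlserver.checkpoint_end
    ADD TARGET package0.ring_buffer
    WITH (STARTUP_STATE = ON);
GO

ALTER EVENT SESSION TrackCheckpoints ON SERVER STATE = START;
GO
```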
Once you set up the checkpoint-tracking Extended Events session, you can use the built-in GUI to watch the live session data. Alternatively, use the code in this post to query the ring buffer for events.
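A sketch of such a ring-buffer query, written against the TrackCheckpoints session sketched above (the XML shredding assumes the standard ring_buffer event format, and the database_id field name is an assumption):

```sql
-- Sketch: shred the ring buffer of the TrackCheckpoints session.
SELECT EventName    = x.e.value('@name', 'nvarchar(128)')
     , EventTime    = x.e.value('@timestamp', 'datetimeoffset')
     , DatabaseName = DB_NAME(x.e.value('(data[@name="database_id"]/value)[1]', 'int'))  -- field name assumed
FROM sys.dm_xe_sessions s
    JOIN sys.dm_xe_session_targets t
        ON s.address = t.event_session_address
    CROSS APPLY (SELECT CAST(t.target_data AS xml) AS target_xml) tx
    CROSS APPLY tx.target_xml.nodes('/RingBufferTarget/event') AS x(e)
WHERE s.name = N'TrackCheckpoints'
    AND t.target_name = N'ring_buffer'
ORDER BY EventTime;
```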
The test-bed
We need to create a new, empty database for use with our minimal, complete, and verifiable example code:
```sql
USE master;

IF DB_ID(N'LogSpaceTest') IS NOT NULL
BEGIN
    ALTER DATABASE LogSpaceTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE LogSpaceTest;
END

CREATE DATABASE LogSpaceTest
    ON (NAME = N'LogSpaceTest_Primary'
        , FILENAME = N'C:\data\LogSpaceTest_Primary.mdf'
        , SIZE = 8000MB
        , FILEGROWTH = 100MB
        , MAXSIZE = 10000MB)
    LOG ON (NAME = N'LogSpaceTest_Log'
        , FILENAME = N'C:\log\LogSpaceTest_log.ldf'
        , SIZE = 8000MB
        , FILEGROWTH = 1000MB
        , MAXSIZE = 8000MB);
```
In order to ensure the database is actually running in the simple recovery model, we need to set the recovery model and perform a database backup. Since this is a test database that we don't actually care about, I'm going to perform the backup to the NUL device. Don't do this on a production database!
```sql
ALTER DATABASE LogSpaceTest SET RECOVERY SIMPLE;
BACKUP DATABASE LogSpaceTest TO DISK = N'NUL';
GO
```
Let’s create a table, and add some data:
```sql
USE LogSpaceTest;

DROP TABLE IF EXISTS dbo.RollbackTest;

CREATE TABLE dbo.RollbackTest
(
      rid int IDENTITY(1,1) NOT NULL
        CONSTRAINT RollbackTest_pkc PRIMARY KEY CLUSTERED
    , someCol varchar(8000) NOT NULL
);

INSERT INTO dbo.RollbackTest WITH (TABLOCKX) (someCol)
SELECT TOP(1000000) REPLICATE('A', 8000)
FROM sys.syscolumns sc1
    CROSS JOIN sys.syscolumns sc2;
GO
```
The code above adds one million rows to the table. Since each row is 8,004 bytes (plus a small amount of overhead), only one row fits on each 8 KB page, so the table consumes 1,000,000 * 8,192 bytes, or approximately 7,812.5 MB of space, roughly 80% of the data file's 10,000 MB maximum size. However, the log will not see substantial growth, because the WITH (TABLOCKX) hint allows the insert to be minimally logged: SQL Server records the page allocations touched by the DML operation instead of every individual row. If a rollback were required, those pages would simply be marked unallocated, as if they had never been touched. A quick way to see how little log the insert generated is sketched below.
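One hedged way to confirm this is the undocumented sys.fn_dblog function, which exposes the active portion of the log; treat it as a curiosity for test systems only, and note that the column name below comes from that undocumented output.

```sql
-- Count active log records and their approximate size after the insert.
-- sys.fn_dblog is undocumented and unsupported; for test systems only.
SELECT LogRecordCount = COUNT(*)
     , ApproxLogMB    = SUM(l.[Log Record Length]) / 1048576.0
FROM sys.fn_dblog(NULL, NULL) AS l;
```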
Updating One Million Rows
The second piece of code below updates every row in the table. This operation requires a large amount of log space, since every row is touched. Before the update, we'll check the size of the data and log files, then execute the update statement inside a TRY...CATCH block, mimicking the way an update is typically coded.
```sql
--check the status of the log
SELECT d.name
    , mf.name
    , Size = mf.size * 8192E0 / 1048576
    , MaxSize = mf.max_size * 8192E0 / 1048576
    , d.recovery_model_desc
    , d.log_reuse_wait_desc
FROM master.sys.databases d
    LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()
    LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id
WHERE d.name = DB_NAME()
    AND mf.type_desc = N'LOG';

SELECT DatabaseName = DB_NAME()
    , LogSizeMB = su.total_log_size_in_bytes / 1048576
    , LogUsedMB = su.used_log_space_in_bytes / 1048576
    , LogUsedPercent = ROUND(su.used_log_space_in_percent, 2)
FROM sys.dm_db_log_space_usage su;
```
The output from the above two queries looks like this:
╔══════════════╦══════════════════╦══════╦═════════╦═════════════════════╦═════════════════════╗
║ name         ║ name             ║ Size ║ MaxSize ║ recovery_model_desc ║ log_reuse_wait_desc ║
╠══════════════╬══════════════════╬══════╬═════════╬═════════════════════╬═════════════════════╣
║ LogSpaceTest ║ LogSpaceTest_Log ║ 8000 ║ 8000    ║ SIMPLE              ║ NOTHING             ║
╚══════════════╩══════════════════╩══════╩═════════╩═════════════════════╩═════════════════════╝

╔══════════════╦═══════════╦═══════════╦════════════════╗
║ DatabaseName ║ LogSizeMB ║ LogUsedMB ║ LogUsedPercent ║
╠══════════════╬═══════════╬═══════════╬════════════════╣
║ LogSpaceTest ║ 7999      ║ 35        ║ 0.44           ║
╚══════════════╩═══════════╩═══════════╩════════════════╝
The code that fills the log:
```sql
--run out of log space here.
BEGIN TRANSACTION

BEGIN TRY
    UPDATE dbo.RollbackTest
    SET someCol = REPLICATE('B', 8000);
END TRY
BEGIN CATCH
    DECLARE @msg nvarchar(1000);
    SET @msg = ERROR_MESSAGE();
    PRINT N'rolling back - ' + @msg;
    ROLLBACK TRANSACTION;
END CATCH

IF @@TRANCOUNT > 0
BEGIN
    PRINT N'committing';
    COMMIT TRANSACTION;
END
```
The message printed by the code above:
rolling back - The transaction log for database 'LogSpaceTest' is full due to 'ACTIVE_TRANSACTION'.
Check the Transaction Log
If we check the status of the log, we'll see that it is very close to full, and sys.databases reports an active transaction even though no transactions are ongoing:
```sql
SELECT d.name
    , mf.name
    , Size = mf.size * 8192E0 / 1048576
    , MaxSize = mf.max_size * 8192E0 / 1048576
    , d.recovery_model_desc
    , d.log_reuse_wait_desc
FROM master.sys.databases d
    LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()
    LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id
WHERE d.name = DB_NAME()
    AND mf.type_desc = N'LOG';

SELECT DatabaseName = DB_NAME()
    , LogSizeMB = su.total_log_size_in_bytes / 1048576
    , LogUsedMB = su.used_log_space_in_bytes / 1048576
    , LogUsedPercent = ROUND(su.used_log_space_in_percent, 2)
FROM sys.dm_db_log_space_usage su;
```
Output looks like:
╔══════════════╦══════════════════╦══════╦═════════╦═════════════════════╦═════════════════════╗
║ name         ║ name             ║ Size ║ MaxSize ║ recovery_model_desc ║ log_reuse_wait_desc ║
╠══════════════╬══════════════════╬══════╬═════════╬═════════════════════╬═════════════════════╣
║ LogSpaceTest ║ LogSpaceTest_Log ║ 8000 ║ 8000    ║ SIMPLE              ║ ACTIVE_TRANSACTION  ║
╚══════════════╩══════════════════╩══════╩═════════╩═════════════════════╩═════════════════════╝

╔══════════════╦═══════════╦═══════════╦════════════════╗
║ DatabaseName ║ LogSizeMB ║ LogUsedMB ║ LogUsedPercent ║
╠══════════════╬═══════════╬═══════════╬════════════════╣
║ LogSpaceTest ║ 7999      ║ 7698      ║ 96.24          ║
╚══════════════╩═══════════╩═══════════╩════════════════╝
Even though the transaction has been rolled back, the log records will not be cleared until a checkpoint occurs. An automatic checkpoint could be triggered by other transactions being written to the log, or a manual CHECKPOINT statement could be executed. However, for a database that does not see frequent transactions, the log may stay nearly full for an extended period of time. This scenario is common during development, where very few transactions are generated. Understanding the behavior of automatic checkpoints is important; Microsoft Docs describes it as follows:
- An automatic checkpoint occurs each time the number of log records reaches the number the Database Engine estimates it can process during the time specified in the recovery interval server configuration option.
- The Database Engine estimates the maximum number of log records it can process within the recovery interval. When a database using automatic checkpoints reaches this maximum number of log records, the Database Engine issues a checkpoint on the database.
- Under the simple recovery model, an automatic checkpoint is also queued if the log becomes 70 percent full.
Note that in the third point above, the documentation states a single checkpoint will be queued when the log becomes 70% full. Once this checkpoint has been issued, further checkpoints may not be issued even though the log remains over 70% full, and even if the automatic checkpoint resulted in no log truncation. However, if further transactions are generated and the Database Engine believes the recovery time would exceed the configured recovery interval, SQL Server will issue automatic checkpoints frequently for as long as transactions keep being generated. If no transactions are being generated, no automatic checkpoints will be issued.
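If you want to watch this from the DMV side and you are on SQL Server 2016 SP2 or later, sys.dm_db_log_stats can show how much log has accumulated since the last checkpoint. The sketch below is an assumption based on that DMV's documented columns; verify the names on your build.

```sql
-- Sketch (SQL Server 2016 SP2+): log accumulated since the last checkpoint
-- and the current reason log truncation is being held up.
SELECT DatabaseName = DB_NAME(ls.database_id)
     , ls.recovery_model
     , ls.log_since_last_checkpoint_mb
     , ls.log_truncation_holdup_reason
FROM sys.dm_db_log_stats(DB_ID()) AS ls;
```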
Watching the Extended Events session we set up at the beginning of the post can be instrumental in showing when automatic checkpoints occur. They show up in the output with the is_system column set to True.
Summary and Follow-up
As shown in this post, the log_reuse_wait_desc column in sys.databases may not be as up-to-date as you'd like. If you see ACTIVE_TRANSACTION with a very full log, try running a CHECKPOINT command to see if that clears the log and resets the log reuse status to NOTHING.

Let's run a CHECKPOINT and see:
```sql
CHECKPOINT;

SELECT d.name
    , mf.name
    , mf.physical_name
    , Size = mf.size * 8192E0 / 1048576
    , MaxSize = mf.max_size * 8192E0 / 1048576
    , Growth = mf.growth * 8192E0 / 1048576
    , d.recovery_model_desc
    , d.log_reuse_wait_desc
    , fg.is_default
FROM master.sys.databases d
    LEFT JOIN sys.database_files mf ON d.database_id = DB_ID()
    LEFT JOIN sys.filegroups fg ON mf.data_space_id = fg.data_space_id
WHERE d.name = DB_NAME()
ORDER BY d.name
    , mf.type
    , mf.name;

IF EXISTS (SELECT 1 FROM sys.system_objects o WHERE o.name = 'dm_db_log_space_usage')
BEGIN
    SELECT DatabaseName = DB_NAME()
        , LogSizeMB = su.total_log_size_in_bytes / 1048576.0
        , LogUsedMB = su.used_log_space_in_bytes / 1048576.0
        , LogUsedPercent = su.used_log_space_in_percent
    FROM sys.dm_db_log_space_usage su;
END
```
Results!
╔══════════════╦══════════════════╦══════╦═════════╦═════════════════════╦═════════════════════╗
║ name         ║ name             ║ Size ║ MaxSize ║ recovery_model_desc ║ log_reuse_wait_desc ║
╠══════════════╬══════════════════╬══════╬═════════╬═════════════════════╬═════════════════════╣
║ LogSpaceTest ║ LogSpaceTest_Log ║ 8000 ║ 8000    ║ SIMPLE              ║ NOTHING             ║
╚══════════════╩══════════════════╩══════╩═════════╩═════════════════════╩═════════════════════╝

╔══════════════╦═══════════╦═══════════╦════════════════╗
║ DatabaseName ║ LogSizeMB ║ LogUsedMB ║ LogUsedPercent ║
╠══════════════╬═══════════╬═══════════╬════════════════╣
║ LogSpaceTest ║ 7999      ║ 35        ║ 0.44           ║
╚══════════════╩═══════════╩═══════════╩════════════════╝
For the curious out there, I've left the various configuration settings for the checkpoint process at their defaults, in order to mimic what is likely to be found "in the wild". However, running the tests above with the following settings changed makes no observable difference in the outcome:
```sql
EXEC sp_configure 'recovery interval', 'seconds';

ALTER DATABASE ... SET TARGET_RECOVERY_TIME = target_recovery_time { SECONDS | MINUTES };
```
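For completeness, a purely illustrative version of those statements against the test database might look like the code below; the specific values (1 minute, 60 seconds) are examples, not recommendations.

```sql
-- Illustrative values only; both settings influence how often automatic or
-- indirect checkpoints are issued.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'recovery interval (min)', 1;   -- instance-wide automatic checkpoint target
RECONFIGURE;

ALTER DATABASE LogSpaceTest SET TARGET_RECOVERY_TIME = 60 SECONDS;   -- per-database indirect checkpoints
```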
Let me know if you enjoyed this post, or if you think it could be improved. Perhaps you might like the rest of our posts on troubleshooting.
[1] CHECKPOINT is of course important for the FULL and BULK_LOGGED recovery models, too!