Msg 824, Level 24, State 2, SQL Server detected a logical consistency-based I/O error: invalid protection option
Data in SQL Server is stored in 8KB pages, both in memory and on-disk. Corruption in those pages are detected using one of several different methods, configurable at the database level. The best option, and the one that should be used by default for essentially every database is “checksum protection”. Checksum protection calculates a checksum over the data stored in each page, and records that checksum in the page header. Configure the PAGE_VERIFY
option to CHECKSUM
via an ALTER DATABASE ... SET ...
command, as shown by Microsoft here. Whenever SQL Server reads a page, it confirms the checksum stored in the page header matches the checksum calculated against the data stored on the page.
The Errors
If the calculated and stored values don’t match, a Severity 24 error is raised:
Msg 824, Level 24, State 2, Line 89 SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x602f8f73; actual: 0x302e8f73). It occurred during a read of page (1:328) in database ID 15 at offset 0x00000000290000 in file 'C:\data\bit_flip_test_system.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
You may receive the following slightly different severity-24 error message, indicating “invalid protection option”:
Msg 824, Level 24, State 2, Line 89 SQL Server detected a logical consistency-based I/O error: invalid protection option. It occurred during a read of page (1:328) in database ID 15 at offset 0x00000000290000 in file 'C:\data\bit_flip_test_system.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
I ran into this error on a production machine recently, and asked a question on dba.stackexchange.com to get clarification about what “invalid protection option” actually meant. Sean Gallardy provided a great answer that ended up being the catalyst for this blog post. Thanks, Sean!
Props to Mr. Randal
Paul Randal has an interesting article where he mentions the valid page-protection values for SQL Server are stored in the m_flagBits
header value:
There are two bits in the page header that specify whether the page is protected by torn-page detection or with a page checksum. Specifically, the m_flagBits field will have 0x100 set if the page is encoded for torn-page protection, and 0x200 set if the page has a page-checksum stored on it, and the page has not been modified (i.e. the checksum is still valid). You should not see the 0x100 bit set as torn-page encoding is removed when the page is read into the buffer pool – UNLESS the page IS actually torn, in which case the encoding is NOT removed.
Meat-and-Potatoes
Since I like my blog posts to be evidence-based, lets see if we can cause the above error messages on a test database. First thing to do is create a test database. The code below will drop any database named bit_flip_test
, so make sure you don’t have a database named that already!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
USE master; GO IF DB_ID(N'bit_flip_test') IS NOT NULL BEGIN ALTER DATABASE bit_flip_test SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE bit_flip_test; END CREATE DATABASE bit_flip_test ON ( NAME = N'bit_flip_test_system' , FILENAME = N'C:\data\bit_flip_test_system.mdf' , SIZE = 10MB , MAXSIZE = 10MB , FILEGROWTH = 1MB ) LOG ON ( NAME = N'bit_flip_test_log' , FILENAME = N'C:\data\bit_flip_test_log.ldf' , SIZE = 10MB , MAXSIZE = 10MB , FILEGROWTH = 1MB ); |
After the database is created, we’ll set the recovery model to simple, and the PAGE_VERIFY
option to CHECKSUM
:
1 2 3 4 5 6 7 8 |
ALTER DATABASE bit_flip_test SET RECOVERY SIMPLE; ALTER DATABASE bit_flip_test SET PAGE_VERIFY CHECKSUM; BACKUP DATABASE bit_flip_test TO DISK = N'NUL:'; |
We’re backing the disk up to the NUL:
device in the above example since the recovery model change only takes effect after a full backup is completed. We’re using the NUL:
device in the backup statement since we don’t actually need a real backup – never do that kind of backup in a production environment!
This next bit of code will create a table inside the new database, and add a single row of data to the table:
1 2 3 4 5 6 7 8 9 10 |
USE bit_flip_test; CREATE TABLE dbo.bft ( col char(8000) NOT NULL CONSTRAINT bft_col_df DEFAULT (REPLICATE(char(255), 8000)) ); INSERT INTO dbo.bft DEFAULT VALUES; |
The col
column contains 8,000 bytes of the repeating value ff
in hexadecimal, which makes it easy to see in the DBCC PAGE
output. This helps visually separate the page header from the page data.
The following code uses DBCC PAGE
to output hexadecimal page dumps of the single page used to store the single row from the dbo.bft
table. DBCC TRACEON(3604)
is necessary to redirect output to the client console. Normally, that output would be sent to the SQL Server Error Log.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
DBCC TRACEON(3604) WITH NO_INFOMSGS; DECLARE @dbid int = DB_ID(); DECLARE @fileid int; DECLARE @pageid int; DECLARE cur CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR SELECT dpa.allocated_page_file_id , dpa.allocated_page_page_id FROM sys.schemas s INNER JOIN sys.objects o ON o.schema_id = s.schema_id CROSS APPLY sys.dm_db_database_page_allocations(DB_ID(), o.object_id, NULL, NULL, 'DETAILED') dpa WHERE o.name = N'bft' AND s.name = N'dbo' AND dpa.page_type_desc = N'DATA_PAGE'; OPEN cur; FETCH NEXT FROM cur INTO @fileid, @pageid; WHILE @@FETCH_STATUS = 0 BEGIN DBCC PAGE (@dbid, @fileid, @pageid, 2); FETCH NEXT FROM cur INTO @fileid, @pageid; END CLOSE cur; DEALLOCATE cur; DBCC TRACEOFF(3604); GO |
The code above is borrowed from my post using DBCC PAGE to view row details. The DBCC PAGE
command in this post is using output style 2, which includes the full page hex dump. The output from running the above code is shown below, with a bunch of repeating data omitted for brevity. The bits highlighted in red are the m_flagBits
header value, which is were the PAGE_VERIFY
bits live.
Corrupt Away!
In the output above we see the page number is (1:328)
, this indicates file #1, page #328. We’ll need the file and page number to intentionally corrupt the page using the undocumented DBCC WRITEPAGE
command. We’ll use that command to modify just the m_flagBits
value in the header to see what effects specific changes have. In order to write to the page we need to set the database into single user mode. This allows DBCC WRITEPAGE
exclusive access to Windows’ file-descriptor-block for the database file(s). That enables writes to the data without SQL Server executing its built-in error detection code. We’re effectively mimicking disk corruption with this technique, even though it’s a very specific corruption. The first parameter for DBCC WRITEPAGE
is the database ID. On my system, the database ID is 15; you’ll need to modify that in the code below to suit your system. Don’t blindly run this against your system since it WILL corrupt database ID 15!
1 2 3 4 5 6 7 8 9 10 11 12 13 |
USE master; ALTER DATABASE bit_flip_test SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DBCC TRACEON (3604); DBCC WRITEPAGE (15, 1, 328, 4, 2, 0x0222, 1); --0x0222 is an invalid set of flags DBCC PAGE (15, 1, 328, 2); DBCC TRACEOFF (3604); ALTER DATABASE bit_flip_test SET MULTI_USER; GO |
The output from the above code:
As you can see, the m_flagBits
values is changed from 0x8000 to 0x2202. Attempting to read that page into the buffer pool will cause SQL Server to recognize corruption on the page, and report it with an error, as shown by this code:
1 2 3 4 5 |
USE bit_flip_test; GO SELECT * FROM dbo.bft; |
Msg 824, Level 24, State 2, Line 86 SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x602f8f73; actual: 0x302e8f73). It occurred during a read of page (1:328) in database ID 15 at offset 0x00000000290000 in file 'C:\data\bit_flip_test_system.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
For “fun”, let’s run DBCC CHECKDB
against the database:
1 |
DBCC CHECKDB (bit_flip_test) WITH NO_INFOMSGS, ALL_ERRORMSGS; |
Output looks like:
For this next test, you’ll need to re-create the database using the code at the top of the post. Once the database has been recreated, let’s corrupt the m_flagBits
value to 0x0083
.
1 2 3 4 5 6 7 8 9 10 11 |
USE master; ALTER DATABASE bit_flip_test SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DBCC TRACEON (3604); DBCC WRITEPAGE (15, 1, 328, 4, 2, 0x0083, 1); --0x0083 - invalid protections option DBCC PAGE (15, 1, 328, 2); DBCC TRACEOFF (3604); ALTER DATABASE bit_flip_test SET MULTI_USER; |
The DBCC PAGE
output looks like:
Let’s run a SELECT
statement against the table:
1 2 3 4 |
USE bit_flip_test; GO SELECT * FROM dbo.bft; |
Now, we get the “invalid protection option” error, confirming that specific values in the m_bitFlags
header field do indeed cause that error:
Msg 824, Level 24, State 2, Line 86 SQL Server detected a logical consistency-based I/O error: invalid protection option. It occurred during a read of page (1:328) in database ID 15 at offset 0x00000000290000 in file 'C:\data\bit_flip_test_system.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Summary
Page corruption can take many forms; the “invalid protection option” error message detailed in this post comes from corruption to a very specific part of the page, the m_flagBits
field. If you get corruption reporting this error, you may be able to fix the page using DBCC WRITEPAGE
by adjusting the m_flagBits
field to a valid value.
When SQL Server detects corruption on a database, various details are recorded in the msdb
database, in the dbo.suspect_pages
table. Check out our post on suspect_pages for important details about that table, including required maintenance!
I hope you enjoyed this post, and please take a look at the rest of our SQL Server Internals posts!