[SERVER-17713] WiredTiger using zlib compression can create invalid compressed stream
Created: 24/Mar/15  Updated: 04/Jun/15  Resolved: 27/Mar/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.0, 3.0.1
Fix Version/s: 3.0.2, 3.1.1

Type: Bug
Priority: Critical - P2
Reporter: Bruce Lucas (Inactive)
Assignee: Keith Bostic (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-17654 Crash/Exception while performing init... Closed
is duplicated by SERVER-17930 Wired Tiger encountered an illegal fi... Closed
is duplicated by SERVER-17996 Zlib decompression on wired tiger Closed
is duplicated by SERVER-20986 ***aborting after fassert() failure Closed
Related
related to SERVER-17654 Crash/Exception while performing init... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Participants:

 Description   
Issue Status as of Apr 02, 2015

ISSUE SUMMARY
In some rare circumstances, WiredTiger configured with zlib compression can create a corrupted on-disk file. If you see the following messages in your log file, you may have encountered this error:

2015-03-24T09:27:19.605-0400 E STORAGE  [initandlisten] WiredTiger (0) [1427203639:605943][21310:0x7fe063080b80], file:collection-2--6089165247661965497.wt, cursor.prev: zlib error: inflate: data error: -3
2015-03-24T09:27:19.606-0400 E STORAGE  [initandlisten] WiredTiger (0) [1427203639:606093][21310:0x7fe063080b80], file:collection-2--6089165247661965497.wt, cursor.prev: file:collection-2--6089165247661965497.wt: encountered an illegal file format or internal value
2015-03-24T09:27:19.606-0400 E STORAGE  [initandlisten] WiredTiger (-31804) [1427203639:606114][21310:0x7fe063080b80], file:collection-2--6089165247661965497.wt, cursor.prev: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-03-24T09:27:19.606-0400 I -        [initandlisten] Fatal Assertion 28558
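
A minimal sketch for checking a mongod log for the zlib message shown above. The function name and the regular expressions are illustrative assumptions, not part of MongoDB or WiredTiger. Only the "zlib error: inflate: data error: -3" line is matched, since the other messages in the excerpt can presumably appear for unrelated failures.

```python
import re

# Signature taken from the log excerpt in this ticket.
ZLIB_ERROR_RE = re.compile(r"zlib error: inflate: data error: -3")
# Hypothetical pattern for pulling the WiredTiger file name out of the line.
WT_FILE_RE = re.compile(r"file:(\S+?\.wt)")


def find_affected_files(lines):
    """Return {file name: count} for log lines carrying the zlib inflate error."""
    affected = {}
    for line in lines:
        if not ZLIB_ERROR_RE.search(line):
            continue
        match = WT_FILE_RE.search(line)
        name = match.group(1) if match else "(unknown)"
        affected[name] = affected.get(name, 0) + 1
    return affected
```

The function accepts any iterable of lines, so an open log file object can be passed to it directly.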

USER IMPACT
mongod may terminate when it subsequently accesses the corrupted block, or may return corrupted data to a query.

WORKAROUNDS
Configure WiredTiger to use snappy compression instead of zlib, or upgrade to 3.0.2.

AFFECTED VERSIONS
3.0.0 and 3.0.1

FIX VERSION
The fix is included in the 3.0.2 production release.

Original description

Under certain conditions WiredTiger using zlib compression creates an invalid and unrecoverable compressed stream, resulting in the following fatal error on subsequent access:

2015-03-24T09:27:19.103-0400 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=5G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2015-03-24T09:27:19.605-0400 E STORAGE  [initandlisten] WiredTiger (0) [1427203639:605943][21310:0x7fe063080b80], file:collection-2--6089165247661965497.wt, cursor.prev: zlib error: inflate: data error: -3
2015-03-24T09:27:19.606-0400 E STORAGE  [initandlisten] WiredTiger (0) [1427203639:606093][21310:0x7fe063080b80], file:collection-2--6089165247661965497.wt, cursor.prev: file:collection-2--6089165247661965497.wt: encountered an illegal file format or internal value
2015-03-24T09:27:19.606-0400 E STORAGE  [initandlisten] WiredTiger (-31804) [1427203639:606114][21310:0x7fe063080b80], file:collection-2--6089165247661965497.wt, cursor.prev: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-03-24T09:27:19.606-0400 I -        [initandlisten] Fatal Assertion 28558
2015-03-24T09:27:19.616-0400 I CONTROL  [initandlisten] 
 0xf4fe49 0xefa091 0xeddc81 0xd790ea 0x1380900 0x1380bc5 0x1381064 0x12f0fa7 0x12f5485 0x12f2823 0x1306424 0x12e0e7f 0x1322c19 0xd6794c 0xd67a42 0xd6819a 0xd61e42 0xce22b6 0xce53ec 0xd60cb6 0xa6f9cd 0x7e20c0 0x7e7704 0x7fe061c7efe0 0x7e02c9
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B4FE49"},{"b":"400000","o":"AFA091"},{"b":"400000","o":"ADDC81"},{"b":"400000","o":"9790EA"},{"b":"400000","o":"F80900"},{"b":"400000","o":"F80BC5"},{"b":"400000","o":"F81064"},{"b":"400000","o":"EF0FA7"},{"b":"400000","o":"EF5485"},{"b":"400000","o":"EF2823"},{"b":"400000","o":"F06424"},{"b":"400000","o":"EE0E7F"},{"b":"400000","o":"F22C19"},{"b":"400000","o":"96794C"},{"b":"400000","o":"967A42"},{"b":"400000","o":"96819A"},{"b":"400000","o":"961E42"},{"b":"400000","o":"8E22B6"},{"b":"400000","o":"8E53EC"},{"b":"400000","o":"960CB6"},{"b":"400000","o":"66F9CD"},{"b":"400000","o":"3E20C0"},{"b":"400000","o":"3E7704"},{"b":"7FE061C5F000","o":"1FFE0"},{"b":"400000","o":"3E02C9"}],"processInfo":{ "mongodbVersion" : "3.0.1", "gitVersion" : "534b5a3f9d10f00cd27737fbcd951032248b5952", "uname" : { "sysname" : "Linux", "release" : "3.17.4-301.fc21.x86_64", "version" : "#1 SMP Thu Nov 27 19:09:10 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFF371FE000", "elfType" : 3 }, { "b" : "7FE062C56000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "7FE062A4E000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "7FE06284A000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "7FE06253B000", "path" : "/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "7FE062233000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "7FE06201C000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7FE061C5F000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "7FE062E72000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf4fe49]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xefa091]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xeddc81]
 mongod(+0x9790EA) [0xd790ea]
 mongod(+0xF80900) [0x1380900]
 mongod(__wt_err+0x95) [0x1380bc5]
 mongod(__wt_panic+0x24) [0x1381064]
 mongod(__wt_bt_read+0x437) [0x12f0fa7]
 mongod(__wt_cache_read+0x1C5) [0x12f5485]
 mongod(__wt_page_in_func+0x403) [0x12f2823]
 mongod(__wt_tree_walk+0x594) [0x1306424]
 mongod(__wt_btcur_prev+0xB4F) [0x12e0e7f]
 mongod(+0xF22C19) [0x1322c19]
 mongod(_ZN5mongo21WiredTigerRecordStore8Iterator8_getNextEv+0x2C) [0xd6794c]
 mongod(_ZN5mongo21WiredTigerRecordStore8Iterator7getNextEv+0x12) [0xd67a42]
 mongod(_ZN5mongo21WiredTigerRecordStoreC1EPNS_16OperationContextERKNS_10StringDataES5_bllPNS_28CappedDocumentDeleteCallbackEPNS_20WiredTigerSizeStorerE+0x46A) [0xd6819a]
 mongod(_ZN5mongo18WiredTigerKVEngine14getRecordStoreEPNS_16OperationContextERKNS_10StringDataES5_RKNS_17CollectionOptionsE+0x132) [0xd61e42]
 mongod(_ZN5mongo22KVDatabaseCatalogEntry14initCollectionEPNS_16OperationContextERKSsb+0x276) [0xce22b6]
 mongod(_ZN5mongo15KVStorageEngineC1EPNS_8KVEngineERKNS_22KVStorageEngineOptionsE+0x69C) [0xce53ec]
 mongod(+0x960CB6) [0xd60cb6]
 mongod(_ZN5mongo23GlobalEnvironmentMongoD22setGlobalStorageEngineERKSs+0x30D) [0xa6f9cd]
 mongod(_ZN5mongo13initAndListenEi+0x2F0) [0x7e20c0]
 mongod(main+0x134) [0x7e7704]
 libc.so.6(__libc_start_main+0xF0) [0x7fe061c7efe0]
 mongod(+0x3E02C9) [0x7e02c9]
-----  END BACKTRACE  -----
2015-03-24T09:27:19.617-0400 I -        [initandlisten] 
 
***aborting after fassert() failure



 Comments   
Comment by Githook User [ 03/Apr/15 ]

Author: Michael Cahill (michaelcahill, michael.cahill@wiredtiger.com)

Message: Use deflateCopy to copy streams for rollback in case the compressed size is too large.

refs SERVER-17713
Branch: validate-configuration-string
https://github.com/wiredtiger/wiredtiger/commit/4c0881afeb6713ef7ae9ea2b8f61811b0fecd192
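
The idea in the commit message (snapshot the compression stream, then roll back if the output grows too large) can be sketched as follows. This is not the WiredTiger code. It is a Python illustration in which `Compress.copy()` stands in for `deflateCopy` (CPython implements it with that call), and the function name, the `block_size` parameter and the per-item flush strategy are all assumptions made for the example.

```python
import zlib


def compress_into_block(items, block_size, level=6):
    """Pack as many leading items as fit into one zlib stream of at most
    block_size bytes. Returns (compressed_bytes, items_consumed)."""
    comp = zlib.compressobj(level)
    out = bytearray()
    taken = 0
    for item in items:
        # Snapshot the stream state before feeding the next item.
        saved = comp.copy()
        chunk = comp.compress(item) + comp.flush(zlib.Z_SYNC_FLUSH)
        # Size of the trailer that finishing the stream would add,
        # measured on a throwaway copy so comp itself stays open.
        trailer = comp.copy().flush(zlib.Z_FINISH)
        if len(out) + len(chunk) + len(trailer) > block_size:
            # Too large: roll back to the snapshot and stop here.
            comp = saved
            break
        out += chunk
        taken += 1
    out += comp.flush(zlib.Z_FINISH)
    return bytes(out), taken
```

Rolling back to a complete copy of the stream state, rather than restoring selected fields by hand, keeps the compressor's internal state consistent with the bytes already emitted.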

Comment by Keith Bostic (Inactive) [ 28/Mar/15 ]

Agreed, this could cause undetected corruption at pretty much any time the object is being written.

It's hard to say why these users hit this issue, but my belief is it's data dependent, that is, a particular set of data will trigger the failure. My guess is it's large data items (large, that is, with respect to the configured block size).

bruce.lucas@10gen.com's analysis indicates there are only a few corrupted bytes and they're in the zlib header (not in the data itself), so we could probably figure out how to overwrite the particular corrupted bytes with correct ones, but nobody has investigated that as far as I know.
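
For illustration only, here is a sketch of the consistency check that the zlib format (RFC 1950) defines for the two-byte stream header. It assumes "zlib header" above refers to that header. It says nothing about where a compressed stream begins inside a WiredTiger block, and the function name is hypothetical.

```python
import zlib


def zlib_header_looks_valid(data: bytes) -> bool:
    """Check the first two bytes (CMF, FLG) of a zlib stream per RFC 1950."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    method = cmf & 0x0F   # CM: must be 8 (deflate)
    window = cmf >> 4     # CINFO: log2(window size) - 8, at most 7
    # FCHECK makes the 16-bit value CMF*256 + FLG a multiple of 31.
    return method == 8 and window <= 7 and ((cmf << 8) | flg) % 31 == 0
```

A check like this can only flag a header that is internally inconsistent; it cannot tell which bytes were originally written.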

Comment by Michael Cahill (Inactive) [ 27/Mar/15 ]

Resolved with latest drop from WT.
