[SERVER-19258] Unexpected wiredTiger assertion error prevented database from starting. Created: 02/Jul/15  Updated: 03/Jul/15  Resolved: 03/Jul/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Adrian Ludwig Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

CentOS


Issue Links:
Duplicate
duplicates SERVER-18316 Database with WT engine fails to reco... Closed
Operating System: Linux
Steps To Reproduce:

Don't know how it happen.

Participants:

 Description   

2015-07-02T05:21:02.970+0000 I REPL     [ReplicationExecutor] syncing from: 10.10.10.2:27017
2015-07-02T05:21:02.971+0000 I REPL     [SyncSourceFeedback] replset setting syncSourceFeedback to 10.10.10.2:27017
2015-07-02T05:21:02.972+0000 I REPL     [rsBackgroundSync] replSet our last op time fetched: Jul  2 00:29:37:1
2015-07-02T05:21:02.972+0000 I REPL     [rsBackgroundSync] replset source's GTE: Jul  2 00:29:41:1
2015-07-02T05:21:02.972+0000 F REPL     [rsBackgroundSync] replSet need to rollback, but in inconsistent state
2015-07-02T05:21:02.972+0000 I REPL     [rsBackgroundSync] minvalid: 559485f3:1 our last optime: 559485f1:1
2015-07-02T05:21:02.972+0000 I -        [rsBackgroundSync] Fatal Assertion 18750
2015-07-02T05:21:02.972+0000 I -        [rsBackgroundSync]
 
***aborting after fassert() failure
 
 
2015-07-02T05:31:02.973+0000 I CONTROL  ***** SERVER RESTARTED *****
2015-07-02T05:31:03.424+0000 W -        [initandlisten] Detected unclean shutdown - /var/lib/mongo/mongod.lock is not empty.
2015-07-02T05:31:03.424+0000 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2015-07-02T05:31:03.424+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=48G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2015-07-02T05:31:36.175+0000 I CONTROL  ***** SERVER RESTARTED *****
2015-07-02T05:31:37.222+0000 W -        [initandlisten] Detected unclean shutdown - /var/lib/mongo/mongod.lock is not empty.
2015-07-02T05:31:37.222+0000 I STORAGE  [initandlisten] exception in initAndListen: 98 Unable to lock file: /var/lib/mongo/mongod.lock errno:11 Resource temporarily unavailable. Is a mongod instance already running?, terminating
2015-07-02T05:31:37.222+0000 I CONTROL  [initandlisten] dbexit:  rc: 100
2015-07-02T05:32:02.971+0000 E STORAGE  [initandlisten] WiredTiger (0) [1435815122:971662][29577:0x7f592262cc20], file:collection-31--6571499968748477119.wt, session.open_cursor: read checksum error [4096B @ 24576, 2528095105 != 3344792778]
2015-07-02T05:32:02.971+0000 E STORAGE  [initandlisten] WiredTiger (0) [1435815122:971708][29577:0x7f592262cc20], file:collection-31--6571499968748477119.wt, session.open_cursor: collection-31--6571499968748477119.wt: encountered an illegal file format or internal value
2015-07-02T05:32:02.971+0000 E STORAGE  [initandlisten] WiredTiger (-31804) [1435815122:971718][29577:0x7f592262cc20], file:collection-31--6571499968748477119.wt, session.open_cursor: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-07-02T05:32:02.971+0000 I -        [initandlisten] Fatal Assertion 28558
2015-07-02T05:32:03.391+0000 I CONTROL  [initandlisten]
 0xf77369 0xf140f1 0xef7e41 0xd95afa 0x13ab029 0x13ab1e5 0x13ab684 0x12fda3e 0x12fded8 0x12fb263 0x12fec06 0x1317ec1 0x134124b 0x13aa4a0 0x13aa65a 0x134ea81 0x13a85ca 0x1360113 0x13a8411 0x13a8825 0xd90b16 0xd8a12b 0xd83f34 0xd848a3 0xd84ba9 0xd7e9f2 0xcfe746 0xd017ac 0xd7d7f6 0xa7eddd 0x7f5722 0x7fa6e9 0x7f5920c92d5d 0x7f34c9
>
> ----- BEGIN BACKTRACE -----
> {"backtrace":[{"b":"400000","o":"B77369"},{"b":"400000","o":"B140F1"},{"b":"400000","o":"AF7E41"},{"b":"400000","o":"995AFA"},{"b":"400000","o":"FAB029"},{"b":"400000","o":"FAB1E5"},{"b":"400000","o":"FAB684"},{"b":"400000","o":"EFDA3E"},{"b":"400000","o":"EFDED8"},{"b":"400000","o":"EFB263"},{"b":"400000","o":"EFEC06"},{"b":"400000","o":"F17EC1"},{"b":"400000","o":"F4124B"},{"b":"400000","o":"FAA4A0"},{"b":"400000","o":"FAA65A"},{"b":"400000","o":"F4EA81"},{"b":"400000","o":"FA85CA"},{"b":"400000","o":"F60113"},{"b":"400000","o":"FA8411"},{"b":"400000","o":"FA8825"},{"b":"400000","o":"990B16"},{"b":"400000","o":"98A12B"},{"b":"400000","o":"983F34"},{"b":"400000","o":"9848A3"},{"b":"400000","o":"984BA9"},{"b":"400000","o":"97E9F2"},{"b":"400000","o":"8FE746"},{"b":"400000","o":"9017AC"},{"b":"400000","o":"97D7F6"},{"b":"400000","o":"67EDDD"},{"b":"400000","o":"3F5722"},{"b":"400000","o":"3FA6E9"},{"b":"7F5920C74000","o":"1ED5D"},{"b":"400000","o":"3F34C9"}],"processInfo":{ "mongodbVersion" : "3.0.4", "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5", "uname" : { "sysname" : "Linux", "release" : "2.6.32-504.12.2.el6.x86_64", "version" : "#1 SMP Wed Mar 11 22:03:14 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "7702B4740E91E1BA6F701DB2A42553968AC09562" }, { "b" : "7FFFA5C9A000", "elfType" : 3, "buildId" : "4B8E1260CCD7C3D8CC131E84001F7220651617EC" }, { "b" : "7F5922203000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "B8DFF8E53D9F2B80C3C382E83EC17C828B536A39" }, { "b" : "7F5921F97000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "DAF114120DA5C9DBEB1E5A704CE83ACB9B8B7B54" }, { "b" : "7F5921BB4000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "F523EAC46D068A8E0869CF93BCD84B414937993A" }, { "b" : "7F59219AC000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "583411D8786F86A1D6B8741C502831E6122445A7" }, { "b" : "7F59217A8000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "454F8FC6CC6502C6401E5F9E221564D80665D277" }, { "b" : "7F59214A2000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "F07F2E7CF4BFB393CC9BBE8CDC6463652E14DB07" }, { "b" : "7F592121E000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "7D8E9374F4A4EA38A7C1E763F32240EA113E4208" }, { "b" : "7F5921008000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "246C3BAB0AB093AFD59D34C8CBF29E786DE4BE97" }, { "b" : "7F5920C74000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "E4EAB3C200B7D8444FF95AB01F6466924A6A5F5F" }, { "b" : "7F5922420000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "6F8E59B70E469F3A924A268911FF8FD0C37E7460" }, { "b" : "7F5920A30000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "B7F7FF323B3A4A12310A6285412F01ACE8C74E47" }, { "b" : "7F592074A000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "7920917F74AFAD0B8CB197CABBE472AF39D94C34" }, { "b" : "7F5920546000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "8CE28F280150E62296240E70ECAC64E4A57AB826" }, { "b" : "7F592031A000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "05733977F4E41652B86070B27A0CFC2C1EA7719D" }, { "b" : "7F5920104000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "5FA8E5038EC04A774AF72A9BB62DC86E1049C4D6" }, { "b" : "7F591FEF9000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "C8D01C2839F6950988CE32B4266A8F89C521ACB0" }, { "b" : "7F591FCF6000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "AF374BAFB7F5B139A0B431D3F06D82014AFF3251" }, { "b" : "7F591FADC000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "F8B68F301C19BF06AF56B4B06E0A69F89D2C1F8D" }, { "b" : "7F591F8BD000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "E6798A06BEE17CF102BBA44FD512FF8B805CEAF1" } ] }}
>  mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf77369]
>  mongod(_ZN5mongo10logContextEPKc+0xE1) [0xf140f1]
>  mongod(_ZN5mongo13fassertFailedEi+0x61) [0xef7e41]
>  mongod(+0x995AFA) [0xd95afa]
>  mongod(__wt_eventv+0x489) [0x13ab029]
>  mongod(__wt_err+0x95) [0x13ab1e5]
>  mongod(__wt_panic+0x24) [0x13ab684]
>  mongod(__wt_block_extlist_read+0x6E) [0x12fda3e]
>  mongod(__wt_block_extlist_read_avail+0x28) [0x12fded8]
>  mongod(__wt_block_checkpoint_load+0x193) [0x12fb263]
>  mongod(+0xEFEC06) [0x12fec06]
>  mongod(__wt_btree_open+0xAB1) [0x1317ec1]
>  mongod(__wt_conn_btree_get+0x19B) [0x134124b]
>  mongod(__wt_session_get_btree+0x500) [0x13aa4a0]
>  mongod(__wt_session_get_btree_ckpt+0xBA) [0x13aa65a]
>  mongod(__wt_curfile_open+0xE1) [0x134ea81]
>  mongod(__wt_open_cursor+0x26A) [0x13a85ca]
>  mongod(__wt_curtable_open+0x2E3) [0x1360113]
>  mongod(__wt_open_cursor+0xB1) [0x13a8411]
>  mongod(+0xFA8825) [0x13a8825]
>  mongod(_ZN5mongo17WiredTigerSession9getCursorERKSsmb+0x156) [0xd90b16]
>  mongod(_ZN5mongo16WiredTigerCursorC1ERKSsmbPNS_16OperationContextE+0x4B) [0xd8a12b]
>  mongod(_ZN5mongo21WiredTigerRecordStore8IteratorC2ERKS0_PNS_16OperationContextERKNS_8RecordIdERKNS_20CollectionScanParams9DirectionEb+0x64) [0xd83f34]
>  mongod(_ZNK5mongo21WiredTigerRecordStore11getIteratorEPNS_16OperationContextERKNS_8RecordIdERKNS_20CollectionScanParams9DirectionE+0x43) [0xd848a3]
>  mongod(_ZN5mongo21WiredTigerRecordStoreC1EPNS_16OperationContextERKNS_10StringDataES5_bllPNS_28CappedDocumentDeleteCallbackEPNS_20WiredTigerSizeStorerE+0x2B9) [0xd84ba9]
>  mongod(_ZN5mongo18WiredTigerKVEngine14getRecordStoreEPNS_16OperationContextERKNS_10StringDataES5_RKNS_17CollectionOptionsE+0x132) [0xd7e9f2]
>  mongod(_ZN5mongo22KVDatabaseCatalogEntry14initCollectionEPNS_16OperationContextERKSsb+0x276) [0xcfe746]
>  mongod(_ZN5mongo15KVStorageEngineC1EPNS_8KVEngineERKNS_22KVStorageEngineOptionsE+0x69C) [0xd017ac]
>  mongod(+0x97D7F6) [0xd7d7f6]
>  mongod(_ZN5mongo23GlobalEnvironmentMongoD22setGlobalStorageEngineERKSs+0x30D) [0xa7eddd]
>  mongod(_ZN5mongo13initAndListenEi+0x422) [0x7f5722]
>  mongod(main+0x139) [0x7fa6e9]
>  libc.so.6(__libc_start_main+0xFD) [0x7f5920c92d5d]
>  mongod(+0x3F34C9) [0x7f34c9]
> -----  END BACKTRACE  -----
> 2015-07-02T05:32:03.391+0000 I -        [initandlisten]
>
> ***aborting after fassert() failure

After this error database was down and we could not start it again. The only solution was to clear all data files and resync everything.

The same error was showing up in every attempt to start database.



 Comments   
Comment by Ramon Fernandez Marina [ 03/Jul/15 ]

adrian.ludwig, please see the documentation on how to proceed. If you have a replica set the recommendation is to resync from a healthy node.

As Michael pointed out there's a bug in versions prior to 3.0.4 that triggers this situation, so I'd encourage you to upgrade to 3.0.4 as soon as possible – specially if unclean shutdowns are a possibility, whether they're triggered by a silicon-based bug or a carbon-based bug

I'm going to mark this ticket as a duplicate of SERVER-18316 and close it, but feel free to post updates.

Regards,
Ramón.

Comment by Adrian Ludwig [ 03/Jul/15 ]

Hi!
Thank you for reply!
It is possible that our "kamikaze" sys admin did something with the mongodb server.

So as I understand, resync of secondary or --repair should fix the situation ?

Comment by Michael Cahill (Inactive) [ 02/Jul/15 ]

Hi adrian.ludwig, I'm sorry that you have had this problem using MongoDB with WiredTiger.

Is it possible that there was an unclean shutdown of MongoDB while running an earlier version? There was a bug (SERVER-18316) that could cause this kind of error that was fixed in MongoDB 3.0.4. However, if the database had been corrupted by an unclean shutdown in an earlier version, the corruption may not have been detected immediately.

Running mongod --repair should also have recovered from this situation.

Generated at Thu Feb 08 03:50:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.