[SERVER-24300] Fatal Assertion after unclean shutdown Created: 26/May/16  Updated: 14/Jul/16  Resolved: 03/Jun/16

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 3.0.12
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Karim Zamani
Assignee: Kelsey Schubert
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
Operating System: ALL
Participants:

 Description   

After an unclean shutdown, the database aborts with Fatal Assertion 17441 when a repair is attempted. The database starts up, but aborts on certain queries.

Environment:

  • MongoDB v3.0.12 64bit build
  • Storage engine: MMAPv1
  • SLES SP3 64bit

Assertion backtrace:

2016-05-26T13:27:05.940-0400 I -        [initandlisten] Fatal Assertion 17441
2016-05-26T13:27:05.950-0400 I CONTROL  [initandlisten]
 0xfc5aa2 0xf62d39 0xf46f16 0xdaac56 0xdaac7b 0xdb42e4 0xdb4301 0xdb471e 0xdb4a44 0xdba76c 0xc4fcb9 0x84d761 0x81ab59 0x7f1091e46c36 0x84afe9
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"BC5AA2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"B62D39","s":"_ZN5mongo10logContextEPKc"},{"b":"400000","o":"B46F16","s":"_ZN5mongo13fass},{"b":"400000","o":"9AAC56"},{"b":"400000","o":"9AAC7B","s":"_ZNK5mongo17RecordStoreV1Base13getNextRecordEPNS_16OperationContextERKNS_7DiskLocE"},{"b":"400000","o":"9B42E4","s":"_ZN5moncordStoreV1Iterator14_getNextRecordERKNS_7DiskLocE"},{"b":"400000","o":"9B4301","s":"_ZN5mongo27CappedRecordStoreV1Iterator8nextLoopERKNS_7DiskLocE"},{"b":"400000","o":"9B471E","s":"_ZN5dRecordStoreV1Iterator13getNextCappedERKNS_7DiskLocE"},{"b":"400000","o":"9B4A44","s":"_ZN5mongo27CappedRecordStoreV1Iterator7getNextEv"},{"b":"400000","o":"9BA76C","s":"_ZN5mongo12MMAPVairDatabaseEPNS_16OperationContextERKSsbb"},{"b":"400000","o":"84FCB9","s":"_ZN5mongo14repairDatabaseEPNS_16OperationContextEPNS_13StorageEngineERKSsbb"},{"b":"400000","o":"44D761","s":"nitAndListenEi"},{"b":"400000","o":"41AB59","s":"main"},{"b":"7F1091E28000","o":"1EC36","s":"__libc_start_main"},{"b":"400000","o":"44AFE9"}],"processInfo":{ "mongodbVersion" : "3.0.12"," : "33934938e0e95d534cebbaff656cde916b9c3573", "uname" : { "sysname" : "Linux", "release" : "3.0.76-0.11-default", "version" : "#1 SMP Fri Jun 14 08:21:43 UTC 2013 (ccab990)", "machine"}, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "E5C98047C025512B028EAA6CEAFECCEF7ECFDC36" }, { "b" : "7FFF54145000", "elfType" : 3, "buildId" : "93B578C8C601CE790EB6F03859DC" }, { "b" : "7F1093030000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "3D8771817CC6D1D1E4B9767B9B65DD2E06A0E597" }, { "b" : "7F1092DDA000", "path" : "/usr/lib64/libselfType" : 3, "buildId" : "F303AF2DC415D507812C74A220858FFB808043FF" }, { "b" : "7F1092A3B000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "CE8025EA3EB87B82EB5784DDAA89" }, { "b" : "7F1092832000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "3AA2AAE918264415A943EE5EE2B872BB9C6194A2" }, { 
"b" : "7F109262E000", "path" : "/lib64/libdl.soe" : 3, "buildId" : "61F1824892113FA0CBAAA4C1831AD5B732E78525" }, { "b" : "7F10923B5000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "86E0C9994D16F010CB58CDD68CBEDF4899C67BA: "7F109219F000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "3B149ECCD897F1F37DCE50AD22614043EBA757A2" }, { "b" : "7F1091E28000", "path" : "/lib64/libc.so.6", "elfType"Id" : "89F460A6502702332C336F3CD7F5568036483B98" }, { "b" : "7F109324D000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "38AB807FCCA391AF7D3ED7FCF585FBFF2D54556A" 7F1091C12000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "BF68E74BB76519B8748D888D18B5D3B2C0B58593" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0xfc5aa2]
 mongod(_ZN5mongo10logContextEPKc+0xE9) [0xf62d39]
 mongod(_ZN5mongo13fassertFailedEi+0x66) [0xf46f16]
 mongod(+0x9AAC56) [0xdaac56]
 mongod(_ZNK5mongo17RecordStoreV1Base13getNextRecordEPNS_16OperationContextERKNS_7DiskLocE+0x1B) [0xdaac7b]
 mongod(_ZN5mongo27CappedRecordStoreV1Iterator14_getNextRecordERKNS_7DiskLocE+0x14) [0xdb42e4]
 mongod(_ZN5mongo27CappedRecordStoreV1Iterator8nextLoopERKNS_7DiskLocE+0x11) [0xdb4301]
 mongod(_ZN5mongo27CappedRecordStoreV1Iterator13getNextCappedERKNS_7DiskLocE+0x13E) [0xdb471e]
 mongod(_ZN5mongo27CappedRecordStoreV1Iterator7getNextEv+0xF4) [0xdb4a44]
 mongod(_ZN5mongo12MMAPV1Engine14repairDatabaseEPNS_16OperationContextERKSsbb+0xF4C) [0xdba76c]
 mongod(_ZN5mongo14repairDatabaseEPNS_16OperationContextEPNS_13StorageEngineERKSsbb+0xE69) [0xc4fcb9]
 mongod(_ZN5mongo13initAndListenEi+0x8A1) [0x84d761]
 mongod(main+0x159) [0x81ab59]
 libc.so.6(__libc_start_main+0xE6) [0x7f1091e46c36]
 mongod(+0x44AFE9) [0x84afe9]
-----  END BACKTRACE  -----
2016-05-26T13:27:05.950-0400 I -        [initandlisten]
 
***aborting after fassert() failure
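
For readers decoding the trace above, here is a minimal Python sketch of the two mechanical steps involved. It assumes, based on how the values in this log line up, that each JSON frame's "b" is the module load base and "o" the offset, so that base plus offset gives the bracketed address on the symbolized lines. The helper names `runtime_address` and `nested_name` are hypothetical, and `nested_name` only reads the length-prefixed name components of an Itanium-mangled nested name. It is not a full demangler (a tool such as c++filt does that properly).

```python
import re


def runtime_address(base_hex, offset_hex):
    """Absolute frame address: module load base plus offset (both hex strings)."""
    return int(base_hex, 16) + int(offset_hex, 16)


def nested_name(mangled):
    """Return 'a::b::c' for a mangled nested name (_ZN... or _ZNK...).

    Only the length-prefixed name components are read; argument types are
    ignored. Input that is not in that form is returned unchanged.
    """
    match = re.match(r"_ZNK?", mangled)
    if not match:
        return mangled
    pos, parts = match.end(), []
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    return "::".join(parts)


if __name__ == "__main__":
    # First frame of the trace: base 400000, offset BC5AA2.
    print(hex(runtime_address("400000", "BC5AA2")))
    # Frames taken from the symbolized lines of the trace.
    for symbol in (
        "_ZN5mongo13fassertFailedEi",
        "_ZNK5mongo17RecordStoreV1Base13getNextRecordEPNS_16OperationContextERKNS_7DiskLocE",
        "_ZN5mongo27CappedRecordStoreV1Iterator7getNextEv",
        "_ZN5mongo12MMAPV1Engine14repairDatabaseEPNS_16OperationContextERKSsbb",
    ):
        print(nested_name(symbol))
```

Read innermost frame first, the names show the assertion firing under RecordStoreV1Base::getNextRecord while a CappedRecordStoreV1Iterator is advanced from MMAPV1Engine::repairDatabase. That suggests the repair pass fails while walking the records of a capped collection.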



 Comments   
Comment by Kelsey Schubert [ 03/Jun/16 ]

Hi kzamani,

You may benefit from reviewing our documentation on recovering data following an unclean shutdown. If mongod --repair is unable to recover your data files, you may use mongodump --repair in your repair attempt, which uses a more aggressive data-recovery algorithm. However, this procedure may produce a large number of duplicate documents that would have to be resolved manually.
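
The order of the two attempts described above can be sketched as follows. This is only an illustration, not an official procedure: the function name `recovery_attempts` and the dump directory are hypothetical, and the `--dbpath` and `--out` options are not mentioned in this ticket, so check them against the documentation for your MongoDB version. The sketch only prints the command lines and does not run anything.

```python
def recovery_attempts(dbpath, dump_dir):
    """Return (description, argv) pairs in the order suggested in the comment."""
    return [
        # Attempt 1: repair the data files in place, with mongod otherwise stopped.
        ("offline repair of the data files",
         ["mongod", "--repair", "--dbpath", dbpath]),
        # Attempt 2: more aggressive recovery; the dump may contain duplicates.
        ("dump-based recovery, output may contain duplicate documents",
         ["mongodump", "--repair", "--out", dump_dir]),
    ]


if __name__ == "__main__":
    # /data/db is the path mentioned in this ticket; the dump path is hypothetical.
    for description, argv in recovery_attempts("/data/db", "/backup/repair-dump"):
        print(description + ": " + " ".join(argv))
```

Note that, to the best of this sketch's assumptions, mongodump --repair works against a running mongod using MMAPv1 rather than against the files directly, so the second attempt requires the instance to be started first.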

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. The best place for this discussion would be the mongodb-user group. If you have any additional questions, please follow up on your post there.

Thank you,
Thomas

Comment by Karim Zamani [ 27/May/16 ]

Currently the only option we have for resyncing is to take a copy of /data/db while the database is active (which will result in the same issue). For our purposes, we are OK with losing some data, but we'd like the database to work. Is there a way to repair (including discarding bad data) or to force the database to ignore corrupt data? This would be greatly appreciated.

Comment by Ramon Fernandez Marina [ 27/May/16 ]

kzamani, it seems that the data in this node has become corrupted. The best way forward is to resync from a healthy node.
