[SERVER-27394] mongodb crash! Created: 13/Dec/16  Updated: 16/Dec/16  Resolved: 16/Dec/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.4.0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Oleg Trifonov
Assignee: Kelsey Schubert
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

2016-12-13T13:46:32.739+0300 E STORAGE  [TTLMonitor] WiredTiger error (0) [1481625992:739037][11026:0x7f48f45a1700], file:index-86--4095658065937658735.wt, WT_CURSOR.remove: read checksum error for 12288B block at offset 1547100160: block header checksum of 1781300651 doesn't match expected checksum of 1690850977
2016-12-13T13:46:32.739+0300 E STORAGE  [TTLMonitor] WiredTiger error (0) [1481625992:739112][11026:0x7f48f45a1700], file:index-86--4095658065937658735.wt, WT_CURSOR.remove: index-86--4095658065937658735.wt: encountered an illegal file format or internal value
2016-12-13T13:46:32.739+0300 E STORAGE  [TTLMonitor] WiredTiger error (-31804) [1481625992:739133][11026:0x7f48f45a1700], file:index-86--4095658065937658735.wt, WT_CURSOR.remove: the process must exit and restart: WT_PANIC: WiredTiger library panic
2016-12-13T13:46:32.739+0300 I -        [TTLMonitor] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
2016-12-13T13:46:32.739+0300 I -        [TTLMonitor]
 
***aborting after fassert() failure
 
 
2016-12-13T13:46:32.770+0300 F -        [TTLMonitor] Got signal: 6 (Aborted).
 
 0x7f48fe8dd9b1 0x7f48fe8dcaa9 0x7f48fe8dcf8d 0x7f48fbfb4370 0x7f48fbc191d7 0x7f48fbc1a8c8 0x7f48fdb725ab 0x7f48fe5f76e6 0x7f48fdb7c948 0x7f48fdb7ca3c 0x7f48fdb7cc94 0x7f48ff1e30b5 0x7f48ff200a0d 0x7f48ff204cb8 0x7f48ff224dfd 0x7f48ff1f63cb 0x7f48ff243e62 0x7f48fe5d200d 0x7f48fe5c8a46 0x7f48fdfca67a 0x7f48fdfccbf3 0x7f48fdd72291 0x7f48fdd72666 0x7f48fdd4648f 0x7f48fdee8b3e 0x7f48fdf0e2f3 0x7f48fe21103a 0x7f48fe21195b 0x7f48fe211a8d 0x7f48fe5ffe9c 0x7f48fe601260 0x7f48fe6018f8 0x7f48fe84d04d 0x7f48ff352860 0x7f48fbfacdc5 0x7f48fbcdb73d
 
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"7F48FD35F000","o":"157E9B1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F48FD35F000","o":"157DAA9"},{"b":"7F48FD35F000","o":"157DF8D"},{"b":"7F48FBFA5000","o":"F370"},{"b":"7F48FBBE4000","o":"351D7","s":"gsignal"},{"b":"7F48FBBE4000","o":"368C8","s":"abort"},{"b":"7F48FD35F000","o":"8135AB","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"7F48FD35F000","o":"12986E6"},{"b":"7F48FD35F000","o":"81D948","s":"__wt_eventv"},{"b":"7F48FD35F000","o":"81DA3C","s":"__wt_err"},{"b":"7F48FD35F000","o":"81DC94","s":"__wt_panic"},{"b":"7F48FD35F000","o":"1E840B5","s":"__wt_bm_read"},{"b":"7F48FD35F000","o":"1EA1A0D","s":"__wt_bt_read"},{"b":"7F48FD35F000","o":"1EA5CB8","s":"__wt_page_in_func"},{"b":"7F48FD35F000","o":"1EC5DFD","s":"__wt_row_search"},{"b":"7F48FD35F000","o":"1E973CB","s":"__wt_btcur_remove"},{"b":"7F48FD35F000","o":"1EE4E62"},{"b":"7F48FD35F000","o":"127300D","s":"_ZN5mongo21WiredTigerIndexUnique8_unindexEP11__wt_cursorRKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"7F48FD35F000","o":"1269A46","s":"_ZN5mongo15WiredTigerIndex7unindexEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"7F48FD35F000","o":"C6B67A","s":"_ZN5mongo17IndexAccessMethod12removeOneKeyEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"7F48FD35F000","o":"C6DBF3","s":"_ZN5mongo17IndexAccessMethod6removeEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPl"},{"b":"7F48FD35F000","o":"A13291","s":"_ZN5mongo12IndexCatalog14_unindexRecordEPNS_16OperationContextEPNS_17IndexCatalogEntryERKNS_7BSONObjERKNS_8RecordIdEbPl"},{"b":"7F48FD35F000","o":"A13666","s":"_ZN5mongo12IndexCatalog13unindexRecordEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEbPl"},{"b":"7F48FD35F000","o":"9E748F","s":"_ZN5mongo10Collection14deleteDocumentEPNS_16OperationContextERKNS_8RecordIdEPNS_7OpDebugEbb"},{"b":"7F48FD35F000","o":"B89B3E","s":"_ZN5mongo11DeleteStage6doWorkEPm"},{"b":"7F48FD35F000","o":"BAF2F3","s":"_ZN5mongo9PlanStage4workEPm"},{"b"
:"7F48FD35F000","o":"EB203A","s":"_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE"},{"b":"7F48FD35F000","o":"EB295B","s":"_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE"},{"b":"7F48FD35F000","o":"EB2A8D","s":"_ZN5mongo12PlanExecutor11executePlanEv"},{"b":"7F48FD35F000","o":"12A0E9C","s":"_ZN5mongo10TTLMonitor13doTTLForIndexEPNS_16OperationContextENS_7BSONObjE"},{"b":"7F48FD35F000","o":"12A2260","s":"_ZN5mongo10TTLMonitor9doTTLPassEv"},{"b":"7F48FD35F000","o":"12A28F8","s":"_ZN5mongo10TTLMonitor3runEv"},{"b":"7F48FD35F000","o":"14EE04D","s":"_ZN5mongo13BackgroundJob7jobBodyEv"},{"b":"7F48FD35F000","o":"1FF3860"},{"b":"7F48FBFA5000","o":"7DC5"},{"b":"7F48FBBE4000","o":"F773D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.0", "gitVersion" : "f4240c60f005be757399042dc12f6addbc3170c1", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-514.2.2.el7.x86_64", "version" : "#1 SMP Tue Dec 6 23:06:41 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "b" : "7F48FD35F000", "elfType" : 3, "buildId" : "ACE5CADA1313A0B04B71DBBEB60CC944FA9ACDD6" }, { "b" : "7FFC6BFFD000", "elfType" : 3, "buildId" : "183CE4B56A9471419F233CCEF078E0504837ABF5" }, { "b" : "7F48FCECF000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "D0018CA5E24522ED0DC1844556FA8DBC4B39D5C3" }, { "b" : "7F48FCAE5000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "8756D2315BF50F8610875B1AFF128198FB9D202D" }, { "b" : "7F48FC8DD000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "82E77ADE22BC9FFF8D3458BD37331E7EDF174C28" }, { "b" : "7F48FC6D9000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "C5F560504E1AF52E29679C3B52FF11121015D6BB" }, { "b" : "7F48FC3D7000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "721C7CC9488EFA25F83B48AF713AB27DBE48EF3E" }, { "b" : "7F48FC1C1000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "408B46E291B2D4C9612E27C0509D165D7E186D40" }, { 
"b" : "7F48FBFA5000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "C3DEB1FA27CD0C1C3CC575B944ABACBA0698B0F2" }, { "b" : "7F48FBBE4000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8B2C421716985B927AA0CAF2A05D0B1F452367F7" }, { "b" : "7F48FD13D000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8F3E366E2DB73C330A3791DEAE31AE9579099B44" }, { "b" : "7F48FB996000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "A2499C359AA179EE23324ED949C0E508E4434F10" }, { "b" : "7F48FB6AF000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "E09A34D9083DC6FEAF7018C09D55631DEEE2836D" }, { "b" : "7F48FB4AB000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "BF54B7C8932E450769FBBB8B18864D1DD70BBC67" }, { "b" : "7F48FB279000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "BF8F00D7CB849ADB0B7A4703BC7B8D66AEE6A49C" }, { "b" : "7F48FB063000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "EA8E45DC8E395CC5E26890470112D97A1F1E0B65" }, { "b" : "7F48FAE54000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "1E7A92FDD6FB3871DA97F4BCA2E147E72B6B6E1F" }, { "b" : "7F48FAC50000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7F48FAA36000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "FE7AE845A123A3DFC0FDC2408BCBC2BA8B61B158" }, { "b" : "7F48FA80F000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "76687CA31A406854DF3BCF8D03055656F56E6892" }, { "b" : "7F48FA5AE000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "AE64AA461A26E01F60408013D361749D56DD0AE1" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f48fe8dd9b1]
 mongod(+0x157DAA9) [0x7f48fe8dcaa9]
 mongod(+0x157DF8D) [0x7f48fe8dcf8d]
 libpthread.so.0(+0xF370) [0x7f48fbfb4370]
 libc.so.6(gsignal+0x37) [0x7f48fbc191d7]
 libc.so.6(abort+0x148) [0x7f48fbc1a8c8]
 mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x7f48fdb725ab]
 mongod(+0x12986E6) [0x7f48fe5f76e6]
 mongod(__wt_eventv+0x422) [0x7f48fdb7c948]
 mongod(__wt_err+0x9D) [0x7f48fdb7ca3c]
 mongod(__wt_panic+0x24) [0x7f48fdb7cc94]
 mongod(__wt_bm_read+0x135) [0x7f48ff1e30b5]
 mongod(__wt_bt_read+0x20D) [0x7f48ff200a0d]
 mongod(__wt_page_in_func+0x1138) [0x7f48ff204cb8]
 mongod(__wt_row_search+0x66D) [0x7f48ff224dfd]
 mongod(__wt_btcur_remove+0x31B) [0x7f48ff1f63cb]
 mongod(+0x1EE4E62) [0x7f48ff243e62]
 mongod(_ZN5mongo21WiredTigerIndexUnique8_unindexEP11__wt_cursorRKNS_7BSONObjERKNS_8RecordIdEb+0x12D) [0x7f48fe5d200d]
 mongod(_ZN5mongo15WiredTigerIndex7unindexEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb+0x86) [0x7f48fe5c8a46]
 mongod(_ZN5mongo17IndexAccessMethod12removeOneKeyEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb+0x3A) [0x7f48fdfca67a]
 mongod(_ZN5mongo17IndexAccessMethod6removeEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPl+0xD3) [0x7f48fdfccbf3]
 mongod(_ZN5mongo12IndexCatalog14_unindexRecordEPNS_16OperationContextEPNS_17IndexCatalogEntryERKNS_7BSONObjERKNS_8RecordIdEbPl+0xD1) [0x7f48fdd72291]
 mongod(_ZN5mongo12IndexCatalog13unindexRecordEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEbPl+0x96) [0x7f48fdd72666]
 mongod(_ZN5mongo10Collection14deleteDocumentEPNS_16OperationContextERKNS_8RecordIdEPNS_7OpDebugEbb+0x17F) [0x7f48fdd4648f]
 mongod(_ZN5mongo11DeleteStage6doWorkEPm+0x51E) [0x7f48fdee8b3e]
 mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x7f48fdf0e2f3]
 mongod(_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x19A) [0x7f48fe21103a]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x4B) [0x7f48fe21195b]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x6D) [0x7f48fe211a8d]
 mongod(_ZN5mongo10TTLMonitor13doTTLForIndexEPNS_16OperationContextENS_7BSONObjE+0x184C) [0x7f48fe5ffe9c]
 mongod(_ZN5mongo10TTLMonitor9doTTLPassEv+0x460) [0x7f48fe601260]
 mongod(_ZN5mongo10TTLMonitor3runEv+0x308) [0x7f48fe6018f8]
 mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0x16D) [0x7f48fe84d04d]
 mongod(+0x1FF3860) [0x7f48ff352860]
 libpthread.so.0(+0x7DC5) [0x7f48fbfacdc5]
 libc.so.6(clone+0x6D) [0x7f48fbcdb73d]
-----  END BACKTRACE  -----
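
The frame lines in the backtrace above follow a `module(symbol+0xOFFSET) [0xADDRESS]` layout. The following is a minimal Python sketch, assuming that layout holds for every frame, that extracts the module, symbol, and offset so the call path is easier to read. `FRAME_RE` and `parse_frames` are illustrative names, not part of any MongoDB tooling, and mangled C++ names are left as they are (a tool such as c++filt could demangle them).

```python
import re

# Matches frame lines of the form " module(symbol+0xOFF) [0xADDR]".
# The symbol part may be empty when only an offset is available.
FRAME_RE = re.compile(
    r"^\s*(?P<module>[\w.\-]+)\((?P<symbol>[^+)]*)\+0x(?P<offset>[0-9A-Fa-f]+)\)"
    r"\s+\[0x(?P<addr>[0-9A-Fa-f]+)\]\s*$"
)


def parse_frames(text):
    """Return a list of (module, symbol_or_None, offset, address) tuples."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append((
                m.group("module"),
                m.group("symbol") or None,  # None when the frame has no symbol
                int(m.group("offset"), 16),
                int(m.group("addr"), 16),
            ))
    return frames


# A few frames copied from the backtrace above.
sample = """\
 mongod(__wt_panic+0x24) [0x7f48fdb7cc94]
 mongod(__wt_bm_read+0x135) [0x7f48ff1e30b5]
 mongod(+0x1EE4E62) [0x7f48ff243e62]
 libc.so.6(clone+0x6D) [0x7f48fbcdb73d]
"""

frames = parse_frames(sample)
for module, symbol, offset, addr in frames:
    print(f"{module:12} {symbol or '<no symbol>':20} +{offset:#x}")
```

Read from the bottom up, the full trace shows the TTLMonitor thread deleting a document, unindexing its key, and hitting the panic inside `__wt_bm_read`.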



 Comments   
Comment by Kelsey Schubert [ 16/Dec/16 ]

Hi nexcode,

I'm glad to hear the repair process was successful. If you encounter this issue again, let us know and we will continue to investigate. Please note that we recommend running MongoDB with RAID-10.

Kind regards,
Thomas

Comment by Oleg Trifonov [ 14/Dec/16 ]

It is working now, but we will keep monitoring it!

Comment by Oleg Trifonov [ 14/Dec/16 ]

2016-12-14T04:03:27.264+0300 I STORAGE [initandlisten] finished checking dbs
2016-12-14T04:03:27.264+0300 I NETWORK [initandlisten] shutdown: going to close listening sockets...
2016-12-14T04:03:27.265+0300 I NETWORK [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2016-12-14T04:03:27.265+0300 I NETWORK [initandlisten] shutdown: going to flush diaglog...
2016-12-14T04:03:27.268+0300 I STORAGE [initandlisten] WiredTigerKVEngine shutting down
2016-12-14T04:03:27.757+0300 I STORAGE [initandlisten] shutdown: removing fs lock...
2016-12-14T04:03:27.757+0300 I CONTROL [initandlisten] now exiting
2016-12-14T04:03:27.757+0300 I CONTROL [initandlisten] shutting down with code:0

Now I am trying to start the database...

Comment by Oleg Trifonov [ 13/Dec/16 ]

1. WiredTiger on local SSD. Software RAID 0.
2. It's ok!
3. db.copyDatabase() from 3.2.8 (migrated to a new server)
4. no
5. no
6. yes
7. ttl, compound, etc...
8. no

I am running the --repair process now... It is taking a long time.
The DB size is 100 GB. The server has a 20-thread CPU and handles a large number of requests.
The server status is monitored constantly. We use CentOS 7.

Comment by Kelsey Schubert [ 13/Dec/16 ]

Hi nexcode,

This assertion failure generally indicates that some or all of the data files have become corrupt in some way. It's not clear if the corruption is in the index or the data itself, and in cases like this, it's very difficult to be confident that the corruption is isolated beyond the file level.
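
To see which files WiredTiger flagged, the mongod log can be scanned for checksum errors. The following is a minimal Python sketch, assuming the message wording matches the log lines in the description of this ticket. `CHECKSUM_RE` and `find_checksum_errors` are illustrative names, not an official tool, and a match only shows where a bad block was read, not how far the corruption extends.

```python
import re

# Pattern based on the "read checksum error" message quoted in this ticket.
CHECKSUM_RE = re.compile(
    r"file:(?P<file>[^,\s]+),.*?"
    r"read checksum error for (?P<size>\d+)B block at offset (?P<offset>\d+): "
    r"block header checksum of (?P<header>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def find_checksum_errors(log_lines):
    """Yield one dict per checksum-error line found in log_lines."""
    for line in log_lines:
        m = CHECKSUM_RE.search(line)
        if m:
            yield {
                "file": m.group("file"),
                "size": int(m.group("size")),
                "offset": int(m.group("offset")),
                "header": int(m.group("header")),
                "expected": int(m.group("expected")),
            }


# Two lines copied from the description; only the first is a checksum error.
sample = [
    (
        "2016-12-13T13:46:32.739+0300 E STORAGE  [TTLMonitor] WiredTiger error (0) "
        "[1481625992:739037][11026:0x7f48f45a1700], "
        "file:index-86--4095658065937658735.wt, WT_CURSOR.remove: "
        "read checksum error for 12288B block at offset 1547100160: "
        "block header checksum of 1781300651 "
        "doesn't match expected checksum of 1690850977"
    ),
    (
        "2016-12-13T13:46:32.739+0300 E STORAGE  [TTLMonitor] WiredTiger error (0) "
        "[1481625992:739112][11026:0x7f48f45a1700], "
        "file:index-86--4095658065937658735.wt, WT_CURSOR.remove: "
        "index-86--4095658065937658735.wt: "
        "encountered an illegal file format or internal value"
    ),
]

errors = list(find_checksum_errors(sample))
for err in errors:
    print(f"{err['file']}: {err['size']}B block at offset {err['offset']}")
```

In practice the input would be the lines of the real log file, for example `find_checksum_errors(open(path))` with `path` pointing at the mongod log.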

To help us understand what's going on here, I've assembled a list of routine questions about data storage and the configuration of your environment. However, please understand that it is unlikely that we will be able to determine the root cause of this issue.

  1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
  2. Would you please check the integrity of your disks?
  3. Has the database always been running MongoDB 3.4.0? If not please describe the upgrade/downgrade cycles the database has been through.
  4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
  5. Preceding the corruption, were there any other server errors logged?
  6. Are you using journaling?
  7. What kinds of indexes do you have (TTL, etc)?
  8. Have you run out of disk space recently?

To resolve this issue, I would recommend performing an initial sync, or starting mongod with --repair.
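
As a rough illustration of the --repair option, the Python sketch below only assembles the command line and prints it. The data directory `/var/lib/mongo` and the helper name `build_repair_command` are assumptions made for this example; substitute the dbpath your deployment actually uses.

```python
import shlex


def build_repair_command(dbpath, mongod="mongod"):
    """Return the argv list for an offline repair of the given dbpath."""
    return [mongod, "--dbpath", dbpath, "--repair"]


# Hypothetical data directory; replace with the real dbpath.
cmd = build_repair_command("/var/lib/mongo")
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the running mongod has been stopped. Taking a copy of the data files first is a sensible precaution, since repair rewrites them.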

Thank you,
Thomas

Generated at Thu Feb 08 04:15:03 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.