[SERVER-31126]  Fatal Assertion 28558 Created: 18/Sep/17  Updated: 07/Nov/17  Resolved: 29/Sep/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.16
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Stuart Munro Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   
  • We are using MongoDB Cloud Manager, but we provision the instance ourselves. Cloud Manager installs the mongod and mongos binaries (we only install the Cloud Manager agent).
  • Running in AWS using an i3.2xlarge instance
  • Storage is a local NVMe SSD formatted as XFS, no RAID

root@ip-10-0-11-21:/data/engine_53# cat automation-mongod.conf
# THIS FILE IS MAINTAINED BY https://api-agents.mongodb.com . DO NOT MODIFY AS IT WILL BE OVERWRITTEN.
# To make changes to your MongoDB deployment, please visit https://cloud.mongodb.com . Your Group ID is 596f7cd93b34b94d9a2d89d6 .
net:
  port: 27017
processManagement:
  fork: "true"
replication:
  replSetName: engine
storage:
  dbPath: /data/engine_53
  engine: wiredTiger
systemLog:
  destination: file
  path: /data/engine_53/mongodb.log

2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 06:32:31:167
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 07:38:01:194
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 08:34:30:251
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 08:43:11:6c8e
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 09:50:40:3425
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 09:52:16:12bf
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 09:54:14:2703
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 09:56:10:483a
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 09:57:47:606d
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 10:52:00:7edd
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 10:53:45:5ee0
2017-09-18T12:37:49.126+0000 I STORAGE  [initandlisten] Placing a marker at optime Sep 18 10:55:14:31f5
2017-09-18T12:37:49.406+0000 I REPL     [initandlisten] Did not find local voted for document at startup.
2017-09-18T12:37:49.406+0000 I REPL     [initandlisten] Replaying stored operations from (term: -1, timestamp: Sep 18 10:55:30:1536) (exclusive) to (term: -1, timestamp: Sep 18 10:55:30:159a) (inclusive).
2017-09-18T12:37:49.410+0000 E STORAGE  [initandlisten] WiredTiger (0) [1505738269:410143][3060:0x7f44975f5c80], file:index-48--2239670281243746530.wt, WT_CURSOR.remove: read checksum error for 16384B block at offset 60461056: block header checksum of 730440730 doesn't match expected checksum of 2633426287
2017-09-18T12:37:49.410+0000 E STORAGE  [initandlisten] WiredTiger (0) [1505738269:410167][3060:0x7f44975f5c80], file:index-48--2239670281243746530.wt, WT_CURSOR.remove: index-48--2239670281243746530.wt: encountered an illegal file format or internal value
2017-09-18T12:37:49.410+0000 E STORAGE  [initandlisten] WiredTiger (-31804) [1505738269:410176][3060:0x7f44975f5c80], file:index-48--2239670281243746530.wt, WT_CURSOR.remove: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-09-18T12:37:49.410+0000 I -        [initandlisten] Fatal Assertion 28558
2017-09-18T12:37:49.410+0000 I -        [initandlisten]
 
***aborting after fassert() failure
 
 
2017-09-18T12:37:49.429+0000 F -        [initandlisten] Got signal: 6 (Aborted).
 
 0x1556b32 0x1555ad9 0x1556342 0x7f44961fa390 0x7f4495e54428 0x7f4495e5602a 0x14d2cf3 0x1274eea 0x97daa9 0x97dc8f 0x97de55 0x1b9d565 0x1bb6a9b 0x1bbdae2 0x1bde340 0x1baf55c 0x1bfa6ea 0x1248615 0x1247517 0xd90b6b 0xd91388 0xb56dde 0xb57257 0xb2ef24 0xcadb5f 0xf5a418 0xf5ab9c 0xf5ac7d 0xe09cfb 0x100fa04 0x10cc994 0x10c4a09 0x10c550a 0x10c61a5 0x104fe71 0x105d923 0x106248c 0x9cbd1f 0x97ec2a 0x7f4495e3f830 0x9c5d19
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"1156B32","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"1155AD9"},{"b":"400000","o":"1156342"},{"b":"7F44961E9000","o":"11390"},{"b":"7F4495E1F000","o":"35428","s":"gsignal"},{"b":"7F4495E1F000","o":"3702A","s":"abort"},{"b":"400000","o":"10D2CF3","s":"_ZN5mongo13fassertFailedEi"},{"b":"400000","o":"E74EEA"},{"b":"400000","o":"57DAA9","s":"__wt_eventv"},{"b":"400000","o":"57DC8F","s":"__wt_err"},{"b":"400000","o":"57DE55","s":"__wt_panic"},{"b":"400000","o":"179D565","s":"__wt_bm_read"},{"b":"400000","o":"17B6A9B","s":"__wt_bt_read"},{"b":"400000","o":"17BDAE2","s":"__wt_page_in_func"},{"b":"400000","o":"17DE340","s":"__wt_row_search"},{"b":"400000","o":"17AF55C","s":"__wt_btcur_remove"},{"b":"400000","o":"17FA6EA"},{"b":"400000","o":"E48615","s":"_ZN5mongo23WiredTigerIndexStandard8_unindexEP11__wt_cursorRKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"400000","o":"E47517","s":"_ZN5mongo15WiredTigerIndex7unindexEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"400000","o":"990B6B","s":"_ZN5mongo17IndexAccessMethod12removeOneKeyEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"400000","o":"991388","s":"_ZN5mongo17IndexAccessMethod6removeEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPl"},{"b":"400000","o":"756DDE","s":"_ZN5mongo12IndexCatalog14_unindexRecordEPNS_16OperationContextEPNS_17IndexCatalogEntryERKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"400000","o":"757257","s":"_ZN5mongo12IndexCatalog13unindexRecordEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"400000","o":"72EF24","s":"_ZN5mongo10Collection14deleteDocumentEPNS_16OperationContextERKNS_8RecordIdEbb"},{"b":"400000","o":"8ADB5F","s":"_ZN5mongo11DeleteStage4workEPm"},{"b":"400000","o":"B5A418","s":"_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE"},{"b":"400000","o":"B5AB9C","s":"_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE"},{"b":"400000","o":"B5AC7D","s":"_
ZN5mongo12PlanExecutor11executePlanEv"},{"b":"400000","o":"A09CFB","s":"_ZN5mongo13deleteObjectsEPNS_16OperationContextEPNS_10CollectionENS_10StringDataENS_7BSONObjENS_12PlanExecutor11YieldPolicyEbbb"},{"b":"400000","o":"C0FA04","s":"_ZN5mongo4repl21applyOperation_inlockEPNS_16OperationContextEPNS_8DatabaseERKNS_7BSONObjEb"},{"b":"400000","o":"CCC994","s":"_ZNSt17_Function_handlerIFN5mongo6StatusEPNS0_16OperationContextEPNS0_8DatabaseERKNS0_7BSONObjEbEPS9_E9_M_invokeERKSt9_Any_dataOS3_OS5_S8_Ob"},{"b":"400000","o":"CC4A09"},{"b":"400000","o":"CC550A","s":"_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEbSt8functionIFNS_6StatusES3_PNS_8DatabaseES6_bEES7_IFS8_S3_S6_bEES7_IFvvEE"},{"b":"400000","o":"CC61A5","s":"_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEb"},{"b":"400000","o":"C4FE71","s":"_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl21cleanUpLastApplyBatchEPNS_16OperationContextE"},{"b":"400000","o":"C5D923","s":"_ZN5mongo4repl26ReplicationCoordinatorImpl21_startLoadLocalConfigEPNS_16OperationContextE"},{"b":"400000","o":"C6248C","s":"_ZN5mongo4repl26ReplicationCoordinatorImpl16startReplicationEPNS_16OperationContextE"},{"b":"400000","o":"5CBD1F"},{"b":"400000","o":"57EC2A","s":"main"},{"b":"7F4495E1F000","o":"20830","s":"__libc_start_main"},{"b":"400000","o":"5C5D19","s":"_start"}],"processInfo":{ "mongodbVersion" : "3.2.16", "gitVersion" : "056bf45128114e44c5358c7a8776fb582363e094", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.0-64-generic", "version" : "#85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "B4C77D1B42936B23E28A2739927CB25274DB2D96" }, { "b" : "7FFE2A196000", "elfType" : 3, "buildId" : "DA4CF76B76E77C8549F440C5059137F1C4809F3C" }, { "b" : "7F4497175000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "7F514146540382F59AD705BA8C913A75204C6858" }, { "b" : 
"7F4496D31000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "E6D4D2E4A048992CD5501E5985094E6CEC6C5D4F" }, { "b" : "7F4496B29000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "F951C1E0765FCAE48F82CAFE35D1ADD36D6C9AF9" }, { "b" : "7F4496925000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "0FC788F0861846257B5F1773FBD438E95DFC1032" }, { "b" : "7F449661C000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "FF7A33D389E756CA381A8189291A968EA5E1F4F8" }, { "b" : "7F4496406000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "68220AE2C65D65C1B6AAA12FA6765A6EC2F5F434" }, { "b" : "7F44961E9000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "27F189EF8DB8C3734C6A678E6EF3CB0B206D58B2" }, { "b" : "7F4495E1F000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "088A6E00A1814622219F346B41E775B8DD46C518" }, { "b" : "7F44973DE000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9157F205547F0EB588E2AB1F2F120B74253A43EA" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x1556b32]
 mongod(+0x1155AD9) [0x1555ad9]
 mongod(+0x1156342) [0x1556342]
 libpthread.so.0(+0x11390) [0x7f44961fa390]
 libc.so.6(gsignal+0x38) [0x7f4495e54428]
 libc.so.6(abort+0x16A) [0x7f4495e5602a]
 mongod(_ZN5mongo13fassertFailedEi+0x93) [0x14d2cf3]
 mongod(+0xE74EEA) [0x1274eea]
 mongod(__wt_eventv+0x3BA) [0x97daa9]
 mongod(__wt_err+0x8B) [0x97dc8f]
 mongod(__wt_panic+0x24) [0x97de55]
 mongod(__wt_bm_read+0x115) [0x1b9d565]
 mongod(__wt_bt_read+0x1DB) [0x1bb6a9b]
 mongod(__wt_page_in_func+0x1272) [0x1bbdae2]
 mongod(__wt_row_search+0x640) [0x1bde340]
 mongod(__wt_btcur_remove+0xE2C) [0x1baf55c]
 mongod(+0x17FA6EA) [0x1bfa6ea]
 mongod(_ZN5mongo23WiredTigerIndexStandard8_unindexEP11__wt_cursorRKNS_7BSONObjERKNS_8RecordIdEb+0xC5) [0x1248615]
 mongod(_ZN5mongo15WiredTigerIndex7unindexEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb+0x77) [0x1247517]
 mongod(_ZN5mongo17IndexAccessMethod12removeOneKeyEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb+0x2B) [0xd90b6b]
 mongod(_ZN5mongo17IndexAccessMethod6removeEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPl+0xB8) [0xd91388]
 mongod(_ZN5mongo12IndexCatalog14_unindexRecordEPNS_16OperationContextEPNS_17IndexCatalogEntryERKNS_7BSONObjERKNS_8RecordIdEb+0x8E) [0xb56dde]
 mongod(_ZN5mongo12IndexCatalog13unindexRecordEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb+0x77) [0xb57257]
 mongod(_ZN5mongo10Collection14deleteDocumentEPNS_16OperationContextERKNS_8RecordIdEbb+0x1B4) [0xb2ef24]
 mongod(_ZN5mongo11DeleteStage4workEPm+0x58F) [0xcadb5f]
 mongod(_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x368) [0xf5a418]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x3C) [0xf5ab9c]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x5D) [0xf5ac7d]
 mongod(_ZN5mongo13deleteObjectsEPNS_16OperationContextEPNS_10CollectionENS_10StringDataENS_7BSONObjENS_12PlanExecutor11YieldPolicyEbbb+0x22B) [0xe09cfb]
 mongod(_ZN5mongo4repl21applyOperation_inlockEPNS_16OperationContextEPNS_8DatabaseERKNS_7BSONObjEb+0xB64) [0x100fa04]
 mongod(_ZNSt17_Function_handlerIFN5mongo6StatusEPNS0_16OperationContextEPNS0_8DatabaseERKNS0_7BSONObjEbEPS9_E9_M_invokeERKSt9_Any_dataOS3_OS5_S8_Ob+0x24) [0x10cc994]
 mongod(+0xCC4A09) [0x10c4a09]
 mongod(_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEbSt8functionIFNS_6StatusES3_PNS_8DatabaseES6_bEES7_IFS8_S3_S6_bEES7_IFvvEE+0x37A) [0x10c550a]
 mongod(_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEb+0xE5) [0x10c61a5]
 mongod(_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl21cleanUpLastApplyBatchEPNS_16OperationContextE+0x1511) [0x104fe71]
 mongod(_ZN5mongo4repl26ReplicationCoordinatorImpl21_startLoadLocalConfigEPNS_16OperationContextE+0x3B3) [0x105d923]
 mongod(_ZN5mongo4repl26ReplicationCoordinatorImpl16startReplicationEPNS_16OperationContextE+0x17C) [0x106248c]
 mongod(+0x5CBD1F) [0x9cbd1f]
 mongod(main+0x73A) [0x97ec2a]
 libc.so.6(__libc_start_main+0xF0) [0x7f4495e3f830]
 mongod(_start+0x29) [0x9c5d19]
-----  END BACKTRACE  -----
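
The WiredTiger checksum error in the log above names the affected file, the block size, the byte offset, and the two checksums. A minimal Python sketch for pulling those fields out of such a line follows. The regular expression is an assumption based only on the message format shown in this log, and the helper name parse_checksum_error is hypothetical; other WiredTiger versions may word the message differently.

```python
import re

# Assumed message layout, taken from the single "read checksum error" line in this ticket.
CHECKSUM_RE = re.compile(
    r"file:(?P<file>\S+?),\s+\S+:\s+read checksum error for (?P<size>\d+)B block "
    r"at offset (?P<offset>\d+): block header checksum of (?P<actual>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line):
    """Return a dict describing the corrupt block, or None if the line does not match."""
    match = CHECKSUM_RE.search(line)
    if match is None:
        return None
    info = {"file": match.group("file")}
    for key in ("size", "offset", "actual", "expected"):
        info[key] = int(match.group(key))
    return info


# The error line from this ticket, split across string literals for readability.
sample = (
    "2017-09-18T12:37:49.410+0000 E STORAGE  [initandlisten] WiredTiger (0) "
    "[1505738269:410143][3060:0x7f44975f5c80], "
    "file:index-48--2239670281243746530.wt, WT_CURSOR.remove: read checksum error "
    "for 16384B block at offset 60461056: block header checksum of 730440730 "
    "doesn't match expected checksum of 2633426287"
)

print(parse_checksum_error(sample))
```

Run against a whole mongodb.log, this kind of filter can show whether the checksum failures are confined to one index file or spread across several, which helps when answering questions about the extent of the corruption.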



 Comments   
Comment by Kelsey Schubert [ 29/Sep/17 ]

Hi stuart@evrythng.com,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved.

Kind regards,
Kelsey

Comment by Mark Agarunov [ 18/Sep/17 ]

Hello stuart@evrythng.com,

Thank you for the report. Unfortunately, this assertion failure generally indicates that some or all of the data files have become corrupt in some way.

To help us understand what's going on here, I've assembled a list of routine questions about data storage and the configuration of your environment. However, please understand that it is unlikely that we will be able to determine the root cause of this issue.

  1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
  2. Would you please check the integrity of your disks?
  3. Has the database always been running MongoDB 3.2.16? If not, please describe the upgrade/downgrade cycles the database has been through.
  4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
  5. Preceding the corruption, were there any other server errors logged?
  6. Are you using journaling?
  7. What kinds of indexes do you have (TTL, etc)?
  8. Have you run out of disk space recently?

To resolve this issue, I would recommend performing an initial sync, or starting mongod with --repair.

Thanks,
Mark
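
For the --repair option mentioned above, the following Python sketch only assembles and prints the command line; it does not run anything. It assumes the dbPath from the configuration in the description (/data/engine_53), that mongod has been stopped first, and that a copy of the data files has been taken beforehand. The helper name build_repair_command is hypothetical. Because this deployment is managed by Cloud Manager automation, how a manual repair interacts with the agent is not covered here.

```python
import shlex


def build_repair_command(db_path, mongod_binary="mongod"):
    """Return the argv list for an offline repair of the given dbPath (assumed layout)."""
    # --dbpath and --repair are standard mongod command-line options.
    return [mongod_binary, "--dbpath", db_path, "--repair"]


cmd = build_repair_command("/data/engine_53")

# Print a shell-safe rendering of the command for review before running it by hand.
print(" ".join(shlex.quote(part) for part in cmd))
```

On a replica set member with a healthy peer, the initial sync route named in the same recommendation avoids relying on the damaged files at all, so it is usually worth considering first.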
