[SERVER-26036] Error when recovering failed replicaset Created: 09/Sep/16  Updated: 27/Oct/16  Resolved: 27/Oct/16

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: 3.2.9
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Vincent van Megen Assignee: Kelsey Schubert
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

Upgraded last member of replicaset (3 members) from 3.2.8 to 3.2.9.

Participants:

 Description   

2016-09-09T11:05:31.282Z I CONTROL  [repl writer worker 10]
 0x130c4a2 0x12a7248 0x128f5ed 0x106b627 0xbdd939 0xba534a 0xb96cdc 0xdeff55 0xdf0619 0xdf0715 0xcd4323 0xe82d8c 0xf1acee 0xf14ad0 0xf16716 0xf17316 0xf1a62b 0x12998f1 0x129a259 0x129adb0 0x1b431c0 0x7fccca784184 0x7fccca4b137d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"F0C4A2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"EA7248","s":"_ZN5mongo10logContextEPKc"},{"b":"400000","o":"E8F5ED","s":"_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j"},{"b":"400000","o":"C6B627","s":"_ZN5mongo21WiredTigerRecordStore6Cursor9seekExactERKNS_8RecordIdE"},{"b":"400000","o":"7DD939","s":"_ZN5mongo16WorkingSetCommon5fetchEPNS_16OperationContextEPNS_10WorkingSetEmNS_11unowned_ptrINS_20SeekableRecordCursorEEE"},{"b":"400000","o":"7A534A","s":"_ZN5mongo11IDHackStage4workEPm"},{"b":"400000","o":"796CDC","s":"_ZN5mongo11DeleteStage4workEPm"},{"b":"400000","o":"9EFF55","s":"_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE"},{"b":"400000","o":"9F0619","s":"_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE"},{"b":"400000","o":"9F0715","s":"_ZN5mongo12PlanExecutor11executePlanEv"},{"b":"400000","o":"8D4323","s":"_ZN5mongo13deleteObjectsEPNS_16OperationContextEPNS_10CollectionENS_10StringDataENS_7BSONObjENS_12PlanExecutor11YieldPolicyEbbb"},{"b":"400000","o":"A82D8C","s":"_ZN5mongo4repl21applyOperation_inlockEPNS_16OperationContextEPNS_8DatabaseERKNS_7BSONObjEb"},{"b":"400000","o":"B1ACEE","s":"_ZNSt17_Function_handlerIFN5mongo6StatusEPNS0_16OperationContextEPNS0_8DatabaseERKNS0_7BSONObjEbEPS9_E9_M_invokeERKSt9_Any_dataS3_S5_S8_b"},{"b":"400000","o":"B14AD0"},{"b":"400000","o":"B16716","s":"_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEbSt8functionIFNS_6StatusES3_PNS_8DatabaseES6_bEES7_IFS8_S3_S6_EES7_IFvvEE"},{"b":"400000","o":"B17316","s":"_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEb"},{"b":"400000","o":"B1A62B","s":"_ZN5mongo4repl14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE"},{"b":"400000","o":"E998F1","s":"_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE"},{"b":"400000","o":"E9A259","s":"_ZN5mongo10ThreadPool13_consumeTasksEv"},{"b":"400000","o":"E9ADB0","s":"_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKSs"},{"b":"400000","o":"17431C0","s":"execute_native_thread_routine"},{"b":"7FCCCA77C000","o":"8184"},{"b":"7FCCCA3B7000","o":"FA37D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.2.9", "gitVersion" : "22ec9e93b40c85fc7cae7d56e7d6a02fd811088c", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.13.0-87-generic", "version" : "#133-Ubuntu SMP Tue May 24 18:32:09 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "9A35851B156CCAF69D719418740E962817B11EA7" }, { "b" : "7FFD75CA2000", "elfType" : 3, "buildId" : "0EDE991F5594B040A0611CFFBDB58B8118591A55" }, { "b" : "7FCCCB69E000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "74864DB9D5F69D39A67E4755012FB6573C469B3D" }, { "b" : "7FCCCB2C2000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "AAE7CFF8351B730830BDBCE0DCABBE06574B7144" }, { "b" : "7FCCCB0BA000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "E2A6DD5048A0A051FD61043BDB69D8CC68192AB7" }, { "b" : "7FCCCAEB6000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "DA9B8C234D0FE9FD8CAAC8970A7EC1B6C8F6623F" }, { "b" : "7FCCCABB0000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "D144258E614900B255A31F3FD2283A878670D5BC" }, { "b" : "7FCCCA99A000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "36311B4457710AE5578C4BF00791DED7359DBB92" }, { "b" : "7FCCCA77C000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "31E9F21AE8C10396171F1E13DA15780986FA696C" }, { "b" : "7FCCCA3B7000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "CF699A15CAAE64F50311FC4655B86DC39A479789" }, { "b" : "7FCCCB8FD000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "D0F537904076D73F29E4A37341F8A449E2EF6CD0" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x130c4a2]
 mongod(_ZN5mongo10logContextEPKc+0x138) [0x12a7248]
 mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0xAD) [0x128f5ed]
 mongod(_ZN5mongo21WiredTigerRecordStore6Cursor9seekExactERKNS_8RecordIdE+0x147) [0x106b627]
 mongod(_ZN5mongo16WorkingSetCommon5fetchEPNS_16OperationContextEPNS_10WorkingSetEmNS_11unowned_ptrINS_20SeekableRecordCursorEEE+0x99) [0xbdd939]
 mongod(_ZN5mongo11IDHackStage4workEPm+0x21A) [0xba534a]
 mongod(_ZN5mongo11DeleteStage4workEPm+0x25C) [0xb96cdc]
 mongod(_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x275) [0xdeff55]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x39) [0xdf0619]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x55) [0xdf0715]
 mongod(_ZN5mongo13deleteObjectsEPNS_16OperationContextEPNS_10CollectionENS_10StringDataENS_7BSONObjENS_12PlanExecutor11YieldPolicyEbbb+0x223) [0xcd4323]
 mongod(_ZN5mongo4repl21applyOperation_inlockEPNS_16OperationContextEPNS_8DatabaseERKNS_7BSONObjEb+0xB9C) [0xe82d8c]
 mongod(_ZNSt17_Function_handlerIFN5mongo6StatusEPNS0_16OperationContextEPNS0_8DatabaseERKNS0_7BSONObjEbEPS9_E9_M_invokeERKSt9_Any_dataS3_S5_S8_b+0x1E) [0xf1acee]
 mongod(+0xB14AD0) [0xf14ad0]
 mongod(_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEbSt8functionIFNS_6StatusES3_PNS_8DatabaseES6_bEES7_IFS8_S3_S6_EES7_IFvvEE+0x336) [0xf16716]
 mongod(_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEb+0xE6) [0xf17316]
 mongod(_ZN5mongo4repl14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x9B) [0xf1a62b]
 mongod(_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE+0x121) [0x12998f1]
 mongod(_ZN5mongo10ThreadPool13_consumeTasksEv+0xA9) [0x129a259]
 mongod(_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKSs+0x100) [0x129adb0]
 mongod(execute_native_thread_routine+0x20) [0x1b431c0]
 libpthread.so.0(+0x8184) [0x7fccca784184]
 libc.so.6(clone+0x6D) [0x7fccca4b137d]
-----  END BACKTRACE  -----
2016-09-09T11:05:31.282Z I -        [repl writer worker 10]
 
***aborting after invariant() failure
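
Note on reading the backtrace: in the JSON block, each frame's "b" and "o" fields appear to be a load base and an offset in hex, and their sum matches the raw addresses listed at the top of the trace. The following is a minimal Python sketch of that arithmetic, using two frames copied from this ticket. The function name frame_addresses is hypothetical, and the interpretation of "b" and "o" is an assumption inferred from the numbers, not taken from MongoDB documentation.

```python
import json

# Two frames copied verbatim from the JSON backtrace in this ticket.
SAMPLE = (
    '{"backtrace":['
    '{"b":"400000","o":"F0C4A2","s":"_ZN5mongo15printStackTraceERSo"},'
    '{"b":"400000","o":"EA7248","s":"_ZN5mongo10logContextEPKc"}'
    ']}'
)


def frame_addresses(backtrace_json):
    """Return (absolute_address, symbol) pairs, one per frame.

    Assumes "b" is the base address and "o" the offset, both hex strings.
    Frames without an "s" field yield None for the symbol.
    """
    frames = json.loads(backtrace_json)["backtrace"]
    result = []
    for frame in frames:
        address = int(frame["b"], 16) + int(frame["o"], 16)
        result.append((address, frame.get("s")))
    return result


for address, symbol in frame_addresses(SAMPLE):
    print(hex(address), symbol)
```

For the two sample frames this gives 0x130c4a2 and 0x12a7248, the first two addresses in the raw list above the backtrace.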



 Comments   
Comment by Kelsey Schubert [ 27/Oct/16 ]

Hi vincentvm,

I've taken another look at the stack trace you provided, and I believe that the following message was likely logged immediately before the backtrace:

[repl writer worker 0] Invariant failure: seekRet resulted in status UnknownError: 5: Input/output error at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp 527

This error indicates that the underlying storage layer is at fault, and I would recommend ensuring the integrity of your storage layer.
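
As a rough illustration of why this points at storage: assuming the "5" in the message is an operating-system errno value passed up through WiredTiger (an assumption, the ticket does not state it), it can be looked up with Python's standard errno and os modules. The helper name describe_errno is hypothetical.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic_name, message) for an OS error number.

    The symbolic name is None when the platform does not define the code.
    """
    return errno.errorcode.get(code), os.strerror(code)


# 5 is the numeric code quoted in the log message above.
name, message = describe_errno(5)
print(name, "-", message)
```

On Linux, code 5 maps to EIO, which the kernel returns when a read or write to the underlying device fails, so the usual next steps are checking the disk, the filesystem, and the system logs on the affected node.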

For MongoDB-related support discussion please post on the mongodb-users group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-users group.

Kind regards,
Thomas

Comment by Vincent van Megen [ 11/Oct/16 ]

Unfortunately I don't have the diagnostic data anymore.

Comment by Kelsey Schubert [ 04/Oct/16 ]

Hi vincentvm,

We still need additional information to diagnose the problem. If this is still an issue for you, would you please provide the complete logs of the affected node?

Thank you,
Thomas

Comment by Kelsey Schubert [ 12/Sep/16 ]

Hi vincentvm,

Thank you for opening this ticket and providing the backtrace. So that we can continue to investigate this issue, would you please upload the following information?

  • the complete logs for the affected node
  • an archive of the $dbpath/diagnostic.data directory
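
One way to produce the requested archive is sketched below in Python, using only the standard library. The function name archive_diagnostic_data and the output file name are hypothetical, and the sketch assumes the diagnostic data lives in a diagnostic.data directory directly under the dbpath.

```python
import tarfile
from pathlib import Path


def archive_diagnostic_data(dbpath, output_path):
    """Write a gzip-compressed tar of <dbpath>/diagnostic.data to output_path."""
    source = Path(dbpath) / "diagnostic.data"
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    with tarfile.open(str(output_path), "w:gz") as archive:
        # arcname keeps the archive rooted at diagnostic.data rather than
        # embedding the full dbpath.
        archive.add(str(source), arcname="diagnostic.data")
    return Path(output_path)
```

For example, archive_diagnostic_data("/var/lib/mongodb", "diagnostic-data.tar.gz") would be the call on a node whose dbpath is /var/lib/mongodb; substitute the dbpath from your own configuration.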

I've created a secure upload portal for you to use here. Files uploaded to this portal are only visible to MongoDB employees investigating this issue and are routinely deleted after some time.

Thanks again,
Thomas
