[SERVER-30629] Fatal Assertion 17441 at src/mongo/db/storage/mmap_v1/record_store_v1_base.cpp 282
Created: 12/Aug/17  Updated: 20/Sep/17  Resolved: 20/Aug/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Chun-wei Kuo [X]
Assignee: Ramon Fernandez Marina
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

$ mongo --version
MongoDB shell version v3.4.7
git version: cf38c1b8a0a8dca4a11737581beafef4fe120bcd
OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
allocator: tcmalloc
modules: none
build environment:
distmod: ubuntu1604
distarch: x86_64
target_arch: x86_64


Issue Links:
Related
is related to SERVER-31340 Fatal Assertion 17441 at src\mongo\db... Closed
Operating System: ALL
Steps To Reproduce:

Start the service:

sudo systemctl start mongod.service

And do nothing but wait.

The service crashes after some time (within a minute).

Participants:

 Description   

I upgraded mongodb-org from 3.0 to 3.4 while upgrading the OS from Ubuntu 14.04 LTS to 16.04 LTS.

Each time

Backtrace:

2017-08-12T11:06:51.907+0000 F -        [TTLMonitor] Got signal: 6 (Aborted).
 
 0x562ddec7aea1 0x562ddec7a0b9 0x562ddec7a59d 0x7f977e594390 0x7f977e1ee428 0x7f977e1f002a 0x562dddf29733 0x562dde939f76 0x562dde939fa5 0x562dde94e8ad 0x562dde94eab7 0x562dde2ee9bb 0x562dde295626 0x562dde2b7073 0x562dde291ab3 0x562dde2b7073 0x562dde5bc56a 0x562dde5bce8b 0x562dde5bcfbd 0x562dde997427 0x562dde99858a 0x562dde998cc8 0x562ddebed6c1 0x562ddf6ef430 0x7f977e58a6ba 0x7f977e2c03dd
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"562DDD70F000","o":"156BEA1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"562DDD70F000","o":"156B0B9"},{"b":"562DDD70F000","o":"156B59D"},{"b":"7F977E583000","o":"11390"},{"b":"7F977E1B9000","o":"35428","s":"gsignal"},{"b":"7F977E1B9000","o":"3702A","s":"abort"},{"b":"562DDD70F000","o":"81A733","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"562DDD70F000","o":"122AF76"},{"b":"562DDD70F000","o":"122AFA5","s":"_ZNK5mongo17RecordStoreV1Base13getNextRecordEPNS_16OperationContextERKNS_7DiskLocE"},{"b":"562DDD70F000","o":"123F8AD","s":"_ZN5mongo27SimpleRecordStoreV1Iterator7advanceEv"},{"b":"562DDD70F000","o":"123FAB7","s":"_ZN5mongo27SimpleRecordStoreV1Iterator9seekExactERKNS_8RecordIdE"},{"b":"562DDD70F000","o":"BDF9BB","s":"_ZN5mongo16WorkingSetCommon5fetchEPNS_16OperationContextEPNS_10WorkingSetEmNS_11unowned_ptrINS_20SeekableRecordCursorEEE"},{"b":"562DDD70F000","o":"B86626","s":"_ZN5mongo10FetchStage6doWorkEPm"},{"b":"562DDD70F000","o":"BA8073","s":"_ZN5mongo9PlanStage4workEPm"},{"b":"562DDD70F000","o":"B82AB3","s":"_ZN5mongo11DeleteStage6doWorkEPm"},{"b":"562DDD70F000","o":"BA8073","s":"_ZN5mongo9PlanStage4workEPm"},{"b":"562DDD70F000","o":"EAD56A","s":"_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE"},{"b":"562DDD70F000","o":"EADE8B","s":"_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE"},{"b":"562DDD70F000","o":"EADFBD","s":"_ZN5mongo12PlanExecutor11executePlanEv"},{"b":"562DDD70F000","o":"1288427","s":"_ZN5mongo10TTLMonitor13doTTLForIndexEPNS_16OperationContextENS_7BSONObjE"},{"b":"562DDD70F000","o":"128958A","s":"_ZN5mongo10TTLMonitor9doTTLPassEv"},{"b":"562DDD70F000","o":"1289CC8","s":"_ZN5mongo10TTLMonitor3runEv"},{"b":"562DDD70F000","o":"14DE6C1","s":"_ZN5mongo13BackgroundJob7jobBodyEv"},{"b":"562DDD70F000","o":"1FE0430"},{"b":"7F977E583000","o":"76BA"},{"b":"7F977E1B9000","o":"1073DD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.7", "gitVersion" : "cf38c1b8a0a8dca4a11737581beafef4fe120bcd", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.9.36-x86_64-linode85", "version" : "#1 SMP Thu Jul 6 15:31:23 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "562DDD70F000", "elfType" : 3, "buildId" : "7E08C88DF63C9DEE479A2B1C6C7E11D8651F1184" }, { "b" : "7FFC741FA000", "elfType" : 3, "buildId" : "16E7BD2A103689F0C0C45325182912A8213F5118" }, { "b" : "7F977F50F000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "675F454AD6FD0B6CA2E41127C7B98079DA37F7B6" }, { "b" : "7F977F0CB000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "2DA08A7E5BF610030DD33B70DB951399626B7496" }, { "b" : "7F977EEC3000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "F951C1E0765FCAE48F82CAFE35D1ADD36D6C9AF9" }, { "b" : "7F977ECBF000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "0FC788F0861846257B5F1773FBD438E95DFC1032" }, { "b" : "7F977E9B6000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "FF7A33D389E756CA381A8189291A968EA5E1F4F8" }, { "b" : "7F977E7A0000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "68220AE2C65D65C1B6AAA12FA6765A6EC2F5F434" }, { "b" : "7F977E583000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "27F189EF8DB8C3734C6A678E6EF3CB0B206D58B2" }, { "b" : "7F977E1B9000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "088A6E00A1814622219F346B41E775B8DD46C518" }, { "b" : "7F977F778000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9157F205547F0EB588E2AB1F2F120B74253A43EA" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x562ddec7aea1]
 mongod(+0x156B0B9) [0x562ddec7a0b9]
 mongod(+0x156B59D) [0x562ddec7a59d]
 libpthread.so.0(+0x11390) [0x7f977e594390]
 libc.so.6(gsignal+0x38) [0x7f977e1ee428]
 libc.so.6(abort+0x16A) [0x7f977e1f002a]
 mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x562dddf29733]
 mongod(+0x122AF76) [0x562dde939f76]
 mongod(_ZNK5mongo17RecordStoreV1Base13getNextRecordEPNS_16OperationContextERKNS_7DiskLocE+0x25) [0x562dde939fa5]
 mongod(_ZN5mongo27SimpleRecordStoreV1Iterator7advanceEv+0x3D) [0x562dde94e8ad]
 mongod(_ZN5mongo27SimpleRecordStoreV1Iterator9seekExactERKNS_8RecordIdE+0x87) [0x562dde94eab7]
 mongod(_ZN5mongo16WorkingSetCommon5fetchEPNS_16OperationContextEPNS_10WorkingSetEmNS_11unowned_ptrINS_20SeekableRecordCursorEEE+0xAB) [0x562dde2ee9bb]
 mongod(_ZN5mongo10FetchStage6doWorkEPm+0x106) [0x562dde295626]
 mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x562dde2b7073]
 mongod(_ZN5mongo11DeleteStage6doWorkEPm+0x363) [0x562dde291ab3]
 mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x562dde2b7073]
 mongod(_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x19A) [0x562dde5bc56a]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x4B) [0x562dde5bce8b]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x6D) [0x562dde5bcfbd]
 mongod(_ZN5mongo10TTLMonitor13doTTLForIndexEPNS_16OperationContextENS_7BSONObjE+0x1987) [0x562dde997427]
 mongod(_ZN5mongo10TTLMonitor9doTTLPassEv+0x47A) [0x562dde99858a]
 mongod(_ZN5mongo10TTLMonitor3runEv+0x328) [0x562dde998cc8]
 mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0x131) [0x562ddebed6c1]
 mongod(+0x1FE0430) [0x562ddf6ef430]
 libpthread.so.0(+0x76BA) [0x7f977e58a6ba]
 libc.so.6(clone+0x6D) [0x7f977e2c03dd]
-----  END BACKTRACE  -----
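
For anyone reading the trace by hand, the following is a minimal Python sketch (not part of the original report) that splits the symbolized frame lines into module, symbol, offset, and address, and decodes the leading namespace/class/function part of a mangled name. The helper names `parse_frame`, `simple_nested_name`, and `absolute_address` are hypothetical. The decoder only handles plain `_ZN`/`_ZNK` nested names and ignores argument types, so it is not a full Itanium ABI demangler; a tool such as `c++filt` is the more reliable choice. `absolute_address` assumes that `"b"` in the JSON block is a load base and `"o"` an offset from it, which is consistent with the first frame of this trace.

```python
import re

# Matches symbolized frame lines such as:
#   mongod(_ZN5mongo10TTLMonitor3runEv+0x328) [0x562dde998cc8]
#   libpthread.so.0(+0x11390) [0x7f977e594390]
FRAME_RE = re.compile(
    r"^\s*(?P<module>\S+?)\((?P<symbol>[^+()]*)\+0x(?P<offset>[0-9A-Fa-f]+)\)"
    r"\s+\[0x(?P<address>[0-9A-Fa-f]+)\]\s*$"
)


def parse_frame(line):
    """Split one frame line into module, symbol, offset and address.

    Returns None when the line does not look like a frame.
    """
    match = FRAME_RE.match(line)
    if match is None:
        return None
    return {
        "module": match.group("module"),
        "symbol": match.group("symbol") or None,  # None for unnamed frames
        "offset": int(match.group("offset"), 16),
        "address": int(match.group("address"), 16),
    }


def simple_nested_name(mangled):
    """Decode only the leading <length><identifier> run of a nested name.

    Simplification: templates, substitutions and argument types are ignored.
    """
    for prefix in ("_ZNK", "_ZN"):  # check the longer prefix first
        if mangled.startswith(prefix):
            rest = mangled[len(prefix):]
            break
    else:
        return None
    parts = []
    pos = 0
    while pos < len(rest) and rest[pos].isdigit():
        end = pos
        while end < len(rest) and rest[end].isdigit():
            end += 1
        length = int(rest[pos:end])
        parts.append(rest[end:end + length])
        pos = end + length
    return "::".join(parts) if parts else None


def absolute_address(base_hex, offset_hex):
    """Assumed relation between the JSON fields: address = "b" + "o"."""
    return int(base_hex, 16) + int(offset_hex, 16)
```

Applied to the frames above, the decoded names run from `mongo::TTLMonitor::run` up to `mongo::RecordStoreV1Base::getNextRecord`, the last named mongod frame before the fatal assertion, which suggests the TTL monitor's delete pass was reading from an MMAPv1 record store when the process aborted.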



 Comments   
Comment by Ramon Fernandez Marina [ 20/Aug/17 ]

Domon, unfortunately the information provided points to disk corruption, so you may need to restore data from backups or resync this node from a healthy replica. We often ask the questions below to better understand how this could have happened:

  1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
  2. Would you please check the integrity of your disks?
  3. Has the database always been running this version of MongoDB? If not, please describe the upgrade/downgrade cycles the database has been through.
  4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
  5. Have you ever restored this instance from backups?
  6. What method do you use to create backups?
  7. When was the underlying filesystem last checked and is it currently marked clean?
  8. Has this system suffered from unclean shutdowns?

These questions cover the most common scenarios that can lead to behavior like what you're describing. In any event, I'm afraid there's not much that can be done in this particular situation.

Comment by Ramon Fernandez Marina [ 14/Aug/17 ]

Sorry to hear you ran into this issue, Domon. If this happens again, it will help to upload the full logs of the affected mongod. We're going to take a look at the stack trace to see if it provides any indication of a bug.

Comment by Chun-wei Kuo [X] [ 13/Aug/17 ]

I was frustrated and gave up debugging this.

I was lucky that I could rebuild the database, so I reset the whole /var/lib/mongo directory and it works now.

Please close this issue. Thanks.

Comment by Chun-wei Kuo [X] [ 12/Aug/17 ]

Sorry I didn't finish my sentence before creating the ticket.

The service crashes shortly after it is started. I have no idea how to debug this further.
