[SERVER-41173] Can't repair corrupted database from bad shutdown: Segmentation Fault 11 Created: 16/May/19  Updated: 28/May/19  Resolved: 28/May/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Lokasa Mawati Assignee: Eric Sedor
Resolution: Done Votes: 0
Labels: driver
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Mac 10.13.6 with mongod v4.0.3.


Participants:

 Description   

(edit: Transferring content from here)

I just ran into a problem in my Node.js script. All the script did was write collection.update(...) and did an upsert of a single record. Nothing fancy. And it throws this error:

MongoNetworkError: connection 0 to localhost:27017 closed
    at Socket.<anonymous> (./node_modules/mongodb-core/lib/connection/connection.js:352:9)
    at Object.onceWrapper (events.js:285:13)
    at Socket.emit (events.js:197:13)
    at TCP._handle.close (net.js:611:12)
  name: 'MongoNetworkError',
  errorLabels: [ 'TransientTransactionError' ]

So then I ran mongod --repair:

$ /usr/local/opt/mongodb30/bin/mongod --repair -v --config /usr/local/etc/mongod.conf --dbpath /Volumes/My\ Passport\ for\ Mac/mongo

...which gave this output (the final result of tail -f /usr/local/var/log/mongodb/mongo.log).

2019-05-15T20:48:38.470-0700 D INDEX    [initandlisten]    bulk commit starting for index: _id_
2019-05-15T20:48:41.429-0700 D STORAGE  [initandlisten] allocating new extent
2019-05-15T20:48:41.552-0700 D STORAGE  [initandlisten] MmapV1ExtentManager::allocateExtent desiredSize:524288 fromFreeList: 0 eloc: 10:3a279000
2019-05-15T20:48:41.597-0700 D STORAGE  [initandlisten] example.papers: clearing plan cache - collection info cache reset
2019-05-15T20:48:42.436-0700 D STORAGE  [initandlisten] example.papers: clearing plan cache - collection info cache reset
2019-05-15T20:48:42.504-0700 D STORAGE  [initandlisten] example.papers: clearing plan cache - collection info cache reset
2019-05-15T20:48:42.970-0700 I STORAGE  [initandlisten] cleaning up failed repair db: example path: /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_0
2019-05-15T20:48:43.885-0700 I -        [initandlisten] Fatal assertion 18506 DuplicateKey E11000 duplicate key error index: example.papers.$_id_ dup key: { : ObjectId('5cdbf188258c9997819e553c') }
2019-05-15T20:48:43.906-0700 I CONTROL  [initandlisten]
 0x10cef4c2a 0x10cea195e 0x10ce8e937 0x10c7668fa 0x10c7639cc 0x10c762643 0x10c7660ac 0x7fff52856015 0x7
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"10C761000","o":"793C2A","s":"_ZN5mongo15printStackTraceERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEE"},{"b":"10C761000","o":"74095E","s":"_ZN5mongo10logContextEPKc"},{"b":"10C761000","o":"72D937","s":"_ZN5mongo23fassertFailedWithStatusEiRKNS_6StatusE"},{"b":"10C761000","o":"58FA","s":"_ZN5mongoL30repairDatabasesAndCheckVersionEv"},{"b":"10C761000","o":"29CC","s":"_ZN5mongoL14_initAndListenEi"},{"b":"10C761000","o":"1643","s":"_ZN5mongo13initAndListenEi"},{"b":"10C761000","o":"50AC","s":"main"},{"b":"7FFF52855000","o":"1015","s":"start"},{"b":"0","o":"7"}],"processInfo":{ "mongodbVersion" : "3.0.12", "gitVersion" : "33934938e0e95d534cebbaff656cde916b9c3573", "uname" : { "sysname" : "Darwin", "release" : "17.7.0", "version" : "Darwin Kernel Version 17.7.0: Fri Nov  2 20:43:16 PDT 2018; root:xnu-4570.71.17~1/RELEASE_X86_64", "machine" : "x86_64" }, "somap" : [ { "path" : "/usr/local/opt/mongodb30/bin/mongod", "machType" : 2, "b" : "10C761000", "vmaddr" : "100000000", "buildId" : "239CC35B9A1A3C7AB990330C26274C60" }, { "path" : "/usr/lib/libSystem.B.dylib", "machType" : 6, "b" : "7FFF5056F000", "vmaddr" : "7FFF4D7EC000", "buildId" : "9A40C6279ADE361789E70733BDBE4E27" }, { "path" : "/usr/lib/libc++.1.dylib", "machType" : 6, "b" : "7FFF507A4000", "vmaddr" : "7FFF4DA21000", "buildId" : "7D3DACCC3804393CABC11A580FD00CB6" }, { "path" : "/usr/lib/system/libcache.dylib", "machType" : 6, "b" : "7FFF526EC000", "vmaddr" : "7FFF4F969000", "buildId" : "092479CB10083A83BECFE115F24D13C1" }, { "path" : "/usr/lib/system/libcommonCrypto.dylib", "machType" : 6, "b" : "7FFF526F1000", "vmaddr" : "7FFF4F96E000", "buildId" : "029F59859B6E3DCB9B96FD007678C6A7" }, { "path" : "/usr/lib/system/libcompiler_rt.dylib", "machType" : 6, "b" : "7FFF526FC000", "vmaddr" : "7FFF4F979000", "buildId" : "968B8E3F368132309D78BB8732024F6E" }, { "path" : "/usr/lib/system/libcopyfile.dylib", "machType" : 6, "b" : "7FFF52704000", "vmaddr" : "7FFF4F981000", "buildId" : "3885083D50D83EECB481B2E605180D7F" }, { "path" : "/usr/lib/system/libcorecrypto.dylib", "machType" : 6, "b" : "7FFF5270E000", "vmaddr" : "7FFF4F98B000", "buildId" : "495BACA267D7369DABB1FE67FAF63A6A" }, { "path" : "/usr/lib/system/libdispatch.dylib", "machType" : 6, "b" : "7FFF5281B000", "vmaddr" : "7FFF4FA98000", "buildId" : "BF3685492DFB3530B4CB31D5EDAC4F2F" }, { "path" : "/usr/lib/system/libdyld.dylib", "machType" : 6, "b" : "7FFF52855000", "vmaddr" : "7FFF4FAD2000", "buildId" : "49BF9E96829730CF8AA6128CC14054B2" }, { "path" : "/usr/lib/system/libkeymgr.dylib", "machType" : 6, "b" : "7FFF52873000", "vmaddr" : "7FFF4FAF0000", "buildId" : "E34E283E90FA3C59B48E1277CDB9CDCE" }, { "path" : "/usr/lib/system/liblaunch.dylib", "machType" : 6, "b" : "7FFF52881000", "vmaddr" : "7FFF4FAFE000", "buildId" : "B184B521FF243142AFAF23D170CF918C" }, { "path" : "/usr/lib/system/libmacho.dylib", "machType" : 6, "b" : "7FFF52882000", "vmaddr" : "7FFF4FAFF000", "buildId" : "1902A611081A3452B11EEBD1B166E831" }, { "path" : "/usr/lib/system/libquarantine.dylib", "machType" : 6, "b" : "7FFF52887000", "vmaddr" : "7FFF4FB04000", "buildId" : "26C0BA228F933A079A4EC8D53D2CE42E" }, { "path" : "/usr/lib/system/libremovefile.dylib", "machType" : 6, "b" : "7FFF5288A000", "vmaddr" : "7FFF4FB07000", "buildId" : "711E18B25BBE3211A91656740C27D17A" }, { "path" : "/usr/lib/system/libsystem_asl.dylib", "machType" : 6, "b" : "7FFF5288C000", "vmaddr" : "7FFF4FB09000", "buildId" : "39E46A6FB2283E78B83E1779F9707A39" }, { "path" : "/usr/lib/system/libsystem_blocks.dylib", "machType" : 6, "b" : "7FFF528A4000", "vmaddr" : "7FFF4FB21000", "buildId" : "17303FDF0D2D3963B05EB4DF63052D47" }, { "path" : "/usr/lib/system/libsystem_c.dylib", "machType" : 6, "b" : "7FFF528A5000", "vmaddr" : "7FFF4FB22000", "buildId" : "1187BFE8457632478177481554E1F9E7" }, { "path" : "/usr/lib/system/libsystem_configuration.dylib", "machType" : 6, "b" : "7FFF5292F000", "vmaddr" : "7FFF4FBAC000", "buildId" : "DF6B5287203E30CB994778DF446C72B8" }, { "path" : "/usr/lib/system/libsystem_coreservices.dylib", "machType" : 6, "b" : "7FFF52933000", "vmaddr" : "7FFF4FBB0000", "buildId" : "486000D3D8CB3BE78EE58BF380DE6DF7" }, { "path" : "/usr/lib/system/libsystem_darwin.dylib", "machType" : 6, "b" : "7FFF52937000", "vmaddr" : "7FFF4FBB4000", "buildId" : "09C21A4A9EE0388BA9D9DFF8F6758791" }, { "path" : "/usr/lib/system/libsystem_dnssd.dylib", "machType" : 6, "b" : "7FFF52939000", "vmaddr" : "7FFF4FBB6000", "buildId" : "976168E40F953725800FCEC6166C9E76" }, { "path" : "/usr/lib/system/libsystem_info.dylib", "machType" : 6, "b" : "7FFF52940000", "vmaddr" : "7FFF4FBBD000", "buildId" : "AB634A98B8AA3804843638261FC8EC4D" }, { "path" : "/usr/lib/system/libsystem_m.dylib", "machType" : 6, "b" : "7FFF529B1000", "vmaddr" : "7FFF4FC2E000", "buildId" : "17570F46566C39FCBEF6635A355DD549" }, { "path" : "/usr/lib/system/libsystem_malloc.dylib", "machType" : 6, "b" : "7FFF529FD000", "vmaddr" : "7FFF4FC7A000", "buildId" : "7FD437359DDD300E8C4AF909A74BDF49" }, { "path" : "/usr/lib/system/libsystem_network.dylib", "machType" : 6, "b" : "7FFF52A1D000", "vmaddr" : "7FFF4FC9A000", "buildId" : "5E86B2DE9E15335487144094ED5F698D" }, { "path" : "/usr/lib/system/libsystem_networkextension.dylib", "machType" : 6, "b" : "7FFF52B4E000", "vmaddr" : "7FFF4FDCB000", "buildId" : "9DC03712552D3AEE9519B5ED70980B70" }, { "path" : "/usr/lib/system/libsystem_notify.dylib", "machType" : 6, "b" : "7FFF52B59000", "vmaddr" : "7FFF4FDD6000", "buildId" : "08012EC02CD234BEBF93E7F56491299A" }, { "path" : "/usr/lib/system/libsystem_sandbox.dylib", "machType" : 6, "b" : "7FFF52B77000", "vmaddr" : "7FFF4FDF4000", "buildId" : "553DFCC69D313B9CAB7C30F6F265786D" }, { "path" : "/usr/lib/system/libsystem_secinit.dylib", "machType" : 6, "b" : "7FFF52B7B000", "vmaddr" : "7FFF4FDF8000", "buildId" : "DE8D14E8A2763FF8AE1377F7040F33C1" }, { "path" : "/usr/lib/system/libsystem_kernel.dylib", "machType" : 6, "b" : "7FFF5298A000", "vmaddr" : "7FFF4FC07000", "buildId" : "0B12D320E66C3DCE9713C7B8D03C97D7" }, { "path" : "/usr/lib/system/libsystem_platform.dylib", "machType" : 6, "b" : "7FFF52B63000", "vmaddr" : "7FFF4FDE0000", "buildId" : "6355EE2D54563CA8A227B96E8F1E2AF8" }, { "path" : "/usr/lib/system/libsystem_pthread.dylib", "machType" : 6, "b" : "7FFF52B6B000", "vmaddr" : "7FFF4FDE8000", "buildId" : "0E51CCBA91F234E1BF2AFEEFD3D321E4" }, { "path" : "/usr/lib/system/libsystem_symptoms.dylib", "machType" : 6, "b" : "7FFF52B7D000", "vmaddr" : "7FFF4FDFA000", "buildId" : "441C6CA057113BB18420DDAC3D5272E1" }, { "path" : "/usr/lib/system/libsystem_trace.dylib", "machType" : 6, "b" : "7FFF52B85000", "vmaddr" : "7FFF4FE02000", "buildId" : "3A6CB7068CA636168AFC14AAD7FAF187" }, { "path" : "/usr/lib/system/libunwind.dylib", "machType" : 6, "b" : "7FFF52B9A000", "vmaddr" : "7FFF4FE17000", "buildId" : "BEF3FB4956043B5F82B5332B80023AC3" }, { "path" : "/usr/lib/system/libxpc.dylib", "machType" : 6, "b" : "7FFF52BA0000", "vmaddr" : "7FFF4FE1D000", "buildId" : "0BC7AD67671D31D48B88C317B8379598" }, { "path" : "/usr/lib/closure/libclosured.dylib", "machType" : 6, "b" : "7FFF50035000", "vmaddr" : "7FFF4D2B2000", "buildId" : "112BC241662638488DD8B34B5B6F7ABC" }, { "path" : "/usr/lib/libobjc.A.dylib", "machType" : 6, "b" : "7FFF51C28000", "vmaddr" : "7FFF4EEA5000", "buildId" : "DD9E5EC5B5073249B70093433E2D5EDF" }, { "path" : "/usr/lib/libc++abi.dylib", "machType" : 6, "b" : "7FFF507FB000", "vmaddr" : "7FFF4DA78000", "buildId" : "EF5E37D711D93530BE45B986612D13E2" } ] }}
 mongod(_ZN5mongo15printStackTraceERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEE+0x3A) [0x10cef4c2a]
 mongod(_ZN5mongo10logContextEPKc+0x10E) [0x10cea195e]
 mongod(_ZN5mongo23fassertFailedWithStatusEiRKNS_6StatusE+0x107) [0x10ce8e937]
 mongod(_ZN5mongoL30repairDatabasesAndCheckVersionEv+0x35A) [0x10c7668fa]
 mongod(_ZN5mongoL14_initAndListenEi+0xE9C) [0x10c7639cc]
 mongod(_ZN5mongo13initAndListenEi+0x13) [0x10c762643]
 mongod(main+0x40C) [0x10c7660ac]
 libdyld.dylib(start+0x1) [0x7fff52856015]
 ??? [0x7]
-----  END BACKTRACE  -----
2019-05-15T20:48:43.906-0700 I -        [initandlisten]
 
***aborting after fassert() failure

It failed :/. So I tried restarting the mongod process, and got this in the logs:

2019-05-15T20:52:28.086-0700 I CONTROL  ***** SERVER RESTARTED *****
2019-05-15T20:52:28.115-0700 D SHARDING isInRangeTest passed
2019-05-15T20:52:28.115-0700 I CONTROL  [initandlisten] MongoDB starting : pid=6589 port=27017 dbpath=/Volumes/My Passport for Mac/mongo 64-bit host=Lances-MacBook-Pro.local
2019-05-15T20:52:28.115-0700 I CONTROL  [initandlisten] db version v3.0.12
2019-05-15T20:52:28.115-0700 I CONTROL  [initandlisten] git version: 33934938e0e95d534cebbaff656cde916b9c3573
2019-05-15T20:52:28.115-0700 I CONTROL  [initandlisten] build info: Darwin yosemitevm.local 14.5.0 Darwin Kernel Version 14.5.0: Thu Apr 21 20:40:54 PDT 2016; root:xnu-2782.50.3~1/RELEASE_X86_64 x86_64 BOOST_LIB_VERSION=1_49
2019-05-15T20:52:28.115-0700 I CONTROL  [initandlisten] allocator: system
2019-05-15T20:52:28.115-0700 I CONTROL  [initandlisten] options: { config: "/usr/local/etc/mongod.conf", net: { bindIp: "127.0.0.1" }, storage: { dbPath: "/Volumes/My Passport for Mac/mongo" }, systemLog: { destination: "file", logAppend: true, path: "/usr/local/var/log/mongodb/mongo.log", verbosity: 1 } }
2019-05-15T20:52:28.115-0700 D NETWORK  [initandlisten] fd limit hard:9223372036854775807 soft:4864 max conn: 3891
2019-05-15T20:52:28.209-0700 W -        [initandlisten] Detected unclean shutdown - /Volumes/My Passport for Mac/mongo/mongod.lock is not empty.
2019-05-15T20:52:28.224-0700 I STORAGE  [initandlisten] **************
old lock file: /Volumes/My Passport for Mac/mongo/mongod.lock.  probably means unclean shutdown,
but there are no journal files to recover.
this is likely human error or filesystem corruption.
please make sure that your journal directory is mounted.
found 2 dbs.
see: http://dochub.mongodb.org/core/repair for more information
*************
2019-05-15T20:52:28.224-0700 D -        [initandlisten] User Assertion: 12596:old lock file
2019-05-15T20:52:28.230-0700 I STORAGE  [initandlisten] exception in initAndListen: 12596 old lock file, terminating
2019-05-15T20:52:28.231-0700 I CONTROL  [initandlisten] dbexit:  rc: 100

Wondering what I can do given I have no backups. I am not very far along but I would like to not have to rebuild the database. Wondering if I can recover anything and get it to restart again. I'm also not really sure what exactly I did to cause this.

This is related to the error.

$ /usr/local/opt/mongodb30/bin/mongod --version
db version v3.0.12

Now, I upgraded to > 4.0, and ran start and repair, which ends up just seg faulting:

$ mongod -v --config /usr/local/etc/mongod.conf --dbpath /Volumes/My\ Passport\ for\ Mac/mongo
2019-05-15T21:28:12.894-0700 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
Error 100
$ mongod --repair -v --config /usr/local/etc/mongod.conf --dbpath /Volumes/My\ Passport\ for\ Mac/mongo
2019-05-15T21:28:46.557-0700 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
Segmentation fault: 11
$ mongod --repair --config /usr/local/etc/mongod.conf --dbpath /Volumes/My\ Passport\ for\ Mac/mongo
2019-05-15T21:29:00.268-0700 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
Segmentation fault: 11

The logs from that show:

2019-05-15T21:35:04.761-0700 I STORAGE  [initandlisten] repairDatabase local
2019-05-15T21:35:05.247-0700 I STORAGE  [initandlisten] allocating new ns file /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/local.ns, filling with zeroes...
2019-05-15T21:35:05.565-0700 I STORAGE  [initandlisten] done allocating ns file /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/local.ns, size: 16MB, took 0.317 seconds
2019-05-15T21:35:05.565-0700 I STORAGE  [FileAllocator] allocating new datafile /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/local.0
2019-05-15T21:35:05.565-0700 I STORAGE  [FileAllocator] creating directory /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/_tmp
2019-05-15T21:35:06.425-0700 I STORAGE  [FileAllocator] done allocating datafile /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/local.0, size: 64MB,  took 0.859 secs
2019-05-15T21:35:06.771-0700 W ASIO     [initandlisten] No TransportLayer configured during NetworkInterface startup
2019-05-15T21:35:06.783-0700 F -        [initandlisten] Invalid access at address: 0x0
2019-05-15T21:35:06.801-0700 F -        [initandlisten] Got signal: 11 (Segmentation fault: 11).

Original description

I have written up the detailed description here

Essentially what I'm getting now is this:

{{}}


{{$ mongod -v --config /usr/local/etc/mongod.conf --dbpath /Volumes/My\ Passport\ for\ Mac/mongo 2019-05-15T21:28:12.894-0700 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none' Error 100 }}
{{$ mongod --repair -v --config /usr/local/etc/mongod.conf --dbpath /Volumes/My\ Passport\ for\ Mac/mongo 2019-05-15T21:28:46.557-0700 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none' Segmentation fault: 11 }}
$ mongod --repair --config /usr/local/etc/mongod.conf --dbpath /Volumes/My\ Passport\ for\ Mac/mongo 2019-05-15T21:29:00.268-0700 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none' Segmentation fault: 11

And the logs show:

2019-05-15T21:35:04.761-0700 I STORAGE  [initandlisten] repairDatabase local
2019-05-15T21:35:05.247-0700 I STORAGE  [initandlisten] allocating new ns file /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/local.ns, filling with zeroes...
2019-05-15T21:35:05.565-0700 I STORAGE  [initandlisten] done allocating ns file /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/local.ns, size: 16MB, took 0.317 seconds
2019-05-15T21:35:05.565-0700 I STORAGE  [FileAllocator] allocating new datafile /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/local.0
2019-05-15T21:35:05.565-0700 I STORAGE  [FileAllocator] creating directory /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/_tmp
2019-05-15T21:35:06.425-0700 I STORAGE  [FileAllocator] done allocating datafile /Volumes/My Passport for Mac/mongo/_tmp_repairDatabase_2/local.0, size: 64MB,  took 0.859 secs
2019-05-15T21:35:06.771-0700 W ASIO     [initandlisten] No TransportLayer configured during NetworkInterface startup
2019-05-15T21:35:06.783-0700 F -        [initandlisten] Invalid access at address: 0x0
2019-05-15T21:35:06.801-0700 F -        [initandlisten] Got signal: 11 (Segmentation fault: 11).



 Comments   
Comment by Eric Sedor [ 28/May/19 ]

mawati, based on the information you've provided you may be able to restart with MongoDB 3.0 if you address the old lock file issue. But importantly, we aren't able to support MongoDB 3.0 here. The SERVER project is for bugs and feature suggestions for supported versions of the MongoDB server, so I'm going to close this ticket.

If your superuser post does not get any traction, you may find someone who can help if you post on the mongodb-user group or on Stack Overflow with the mongodb tag.

Comment by Eric Sedor [ 17/May/19 ]

mawati, we are reviewing this information and will let you know if we have questions. For future reference, a TransientTransactionError is not intended as an indication that a repair operation is necessary.

Please also be aware that MongoDB 3.0 reached end of life on February 2018, so our ability to investigate whether or not this behavior is the result of a bug could be very limited. If you can provide steps to reproduce this problem on MongoDB 3.4 starting from an empty database it will be helpful.

Generated at Thu Feb 08 04:57:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.