SERVER-13822 (Core Server)

Running resync before replset config is loaded can crash mongod

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.4, 2.7.1
    • Component/s: Replication
    • Labels:
      None

      Description

      Issue Status as of Jul 22, 2014

      ISSUE SUMMARY
      In a replica set, if a resync operation is attempted on a node before it loads a valid replica set config, the mongod process crashes.

      A newly started mongod with the --replSet parameter does not immediately have a config. It must first load a valid config from disk, receive a config from another node, or have an administrator run the replica set initiate command.

      USER IMPACT
      The mongod process crashes, and a stack trace is printed in the log. This only affects newly started mongod processes that have not yet had a chance to join a replica set, so the impact of this issue on a replica set is minimal.

      WORKAROUNDS
      Do not run resync on a mongod before loading a valid replica set config.
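
      The workaround can be applied on the client side by checking the node's replica set status before sending resync. The following is a minimal sketch, not a tested procedure. It assumes that replSetGetStatus replies with a truthy "ok" field only after a valid config has been loaded. The run_command callable and the fake node are hypothetical stand-ins for a real driver connection.

```python
def resync_if_configured(run_command):
    """Send resync only if the node reports a loaded replica set config.

    run_command is a hypothetical callable: it takes a command name and
    returns the server's reply as a dict (for example, a thin wrapper
    around a driver's admin-command call that does not raise on ok: 0).
    """
    status = run_command("replSetGetStatus")
    # Assumption: "ok" is truthy only once a valid config has been loaded.
    if not status.get("ok"):
        return {"ok": 0, "errmsg": "replica set config not loaded; resync skipped"}
    return run_command("resync")


def fake_unconfigured_node(name):
    """Stand-in for a freshly started node with no config (illustrative replies)."""
    if name == "replSetGetStatus":
        return {"ok": 0, "errmsg": "no replica set config loaded"}
    raise RuntimeError("resync must not be sent to an unconfigured node")


result = resync_if_configured(fake_unconfigured_node)
```

      With the fake node above, resync is never sent and result carries an "ok" value of 0.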

      AFFECTED VERSIONS
      MongoDB production releases 2.6.0 through 2.6.3 are affected by this issue.

      FIX VERSION
      The fix is included in the 2.6.4 production release.

      RESOLUTION DETAILS
      Do not allow resync commands if the replica set config has not yet been loaded.
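
      The shape of this guard can be illustrated with a toy model. This is only a sketch of the idea in Python. The actual fix lives in the mongod C++ source, and the class, method names, and error message below are hypothetical.

```python
class ToyNode:
    """Toy model of a replica set member; not mongod's real implementation."""

    def __init__(self):
        # None until a valid replica set config has been loaded.
        self.repl_config = None

    def load_config(self, config):
        self.repl_config = config

    def run_resync(self):
        # Guard: refuse the command instead of touching missing state.
        if self.repl_config is None:
            return {"ok": 0, "errmsg": "replica set config not yet loaded"}
        # A real node would start a resync here; the toy just reports success.
        return {"ok": 1}


node = ToyNode()
before = node.run_resync()   # rejected: no config loaded yet
node.load_config({"_id": "rs0", "members": [{"_id": 0, "host": "localhost:31001"}]})
after = node.run_resync()    # accepted: config is present
```

      The point of the guard is that a command arriving before the config is loaded gets an error reply rather than reaching code that assumes the config exists.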

      Original description

      https://mci.10gen.com/ui/task/mongodb_mongo_master_osx_108_dur_off_b1300e3f5656423eac55efaedf6440ab10c37125_14_04_16_21_30_07_replicasets_osx_108_dur_off
      https://mci.10gen.com/ui/task/mongodb_mongo_master_osx_108_b1300e3f5656423eac55efaedf6440ab10c37125_14_04_16_21_30_07_replicasets_osx_108

       m31001| 2014-04-16T20:14:35.279-0400 [conn2] SEVERE: Invalid access at address: 0
       m31001| 2014-04-16T20:14:35.280-0400 [rsStart] replSet I am mci-osx108-5.build.10gen.cc:31001
       m31001| 2014-04-16T20:14:35.283-0400 [conn2] SEVERE: Got signal: 11 (Segmentation fault: 11).
       m31001| 0x1006b125b 0x1006b0dfe 0x7fff88b2790a 0 0x1001aa945 0x1001ab3db 0x1001ac09c 0x1003c0d5f 0x1002927b0 0x1000065b4 0x1006760f1 0x1006e57d5 0x7fff88b39772 0x7fff88b261a1 
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo15printStackTraceERSo+0x2b) [0x1006b125b]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo12_GLOBAL__N_124abruptQuitWithAddrSignalEiP9__siginfoPv+0xde) [0x1006b0dfe]
       m31001|  /usr/lib/system/libsystem_c.dylib(_sigtramp+0x1a) [0x7fff88b2790a]
       m31001|  ??? [0]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x25) [0x1001aa945]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x85f) [0x1001ab3db]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x56c) [0x1001ac09c]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x64f) [0x1003c0d5f]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x7b0) [0x1002927b0]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x134) [0x1000065b4]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x691) [0x1006760f1]
       m31001|  /data/mci/shell/mongodb-mongo-master/mongod(thread_proxy+0xe5) [0x1006e57d5]
       m31001|  /usr/lib/system/libsystem_c.dylib(_pthread_start+0x147) [0x7fff88b39772]
       m31001|  /usr/lib/system/libsystem_c.dylib(thread_start+0xd) [0x7fff88b261a1]

      The only change to actual code in the intersection of the blamelists is: https://github.com/mongodb/mongo/commit/0fbd76d233e213e43f53b8882c4dd3c71897a7f3

      Other changes:

      https://github.com/mongodb/mongo/commit/8bbe304cde912c0e2f96ff6b8f6e4badd90d60f0
      https://github.com/mongodb/mongo/commit/b1300e3f5656423eac55efaedf6440ab10c37125
