Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-14494

Dropping collection during active background index build on secondary triggers segfault

    • Fully Compatible
    • ALL

      Issue Status as of Jul 18, 2014

      ISSUE SUMMARY
      If during a background index build in a collection on a secondary node this collection gets dropped on the primary, the secondary may crash with a segmentation fault once it replicates the collection drop.

      USER IMPACT
      Only background index builds are affected. Secondary failures affect replica set quorum and can potentially lead to unavailability of the replica set.

      WORKAROUNDS
      Use foreground index builds, or ensure that the secondaries have finished building the index before dropping the specific collection in which the index is being built.

      AFFECTED VERSIONS
      Production release versions from 2.6.0 to 2.6.3 are affected by this bug.

      FIX VERSION
      The fix is included in the 2.6.4 production release.

      RESOLUTION DETAILS
      The patch now ensures that IndexBuildBlock::fail() can never throw an exception.

      Original description

      We have a repeated server crash after two days, while performing same operation (drop).

      This instance was running as a secondary in a replicaset.

      2014-07-08T08:54:19.676+0200 [repl index builder 17755] build index on: 233-340-6496.keyword_adgroup properties: { v: 1, key: { campaign: 1 }, name: "campaign_1", ns: "233-340-6496.keyword_adgroup", background: true }
      2014-07-08T08:54:19.676+0200 [repl index builder 17755]          building index in background
      2014-07-08T08:54:19.678+0200 [repl writer worker 8] CMD: drop 233-340-6496.keyword_adgroup
      2014-07-08T08:54:19.678+0200 [repl writer worker 8] halting index build: { adgroup: 1 }
      2014-07-08T08:54:19.678+0200 [repl writer worker 8] SEVERE: Invalid access at address: 0x1c00736e3a
      2014-07-08T08:54:19.684+0200 [repl writer worker 8] SEVERE: Got signal: 11 (Segmentation fault).
      Backtrace:0x11dae01 0x11da1de 0x11da2cf 0x7f7f08bb7340 0x8d5ec7 0xb856b2 0xa29ba9 0xa2a8d0 0xa2296a 0xa24958 0xa26279 0xe48ab5 0xeb27b9 0xeb2d00 0x117089e 0x121f819 0x7f7f08baf182 0x7f7f07eb430d 
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11dae01]
       /usr/bin/mongod() [0x11da1de]
       /usr/bin/mongod() [0x11da2cf]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f7f08bb7340]
       /usr/bin/mongod(_ZN5mongo12IndexCatalog23killMatchingIndexBuildsERKNS0_17IndexKillCriteriaE+0x217) [0x8d5ec7]
       /usr/bin/mongod(_ZN5mongo12IndexBuilder23killMatchingIndexBuildsEPNS_10CollectionERKNS_12IndexCatalog17IndexKillCriteriaE+0x12) [0xb856b2]
       /usr/bin/mongod(_ZN5mongo7CmdDrop15stopIndexBuildsEPNS_8DatabaseERKNS_7BSONObjE+0xe9) [0xa29ba9]
       /usr/bin/mongod(_ZN5mongo7CmdDrop3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x1c0) [0xa2a8d0]
       /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa2296a]
       /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x1618) [0xa24958]
       /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6c9) [0xa26279]
       /usr/bin/mongod(_ZN5mongo21applyOperation_inlockERKNS_7BSONObjEbb+0x8d5) [0xe48ab5]
       /usr/bin/mongod(_ZN5mongo7replset8SyncTail9syncApplyERKNS_7BSONObjEb+0x749) [0xeb27b9]
       /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x50) [0xeb2d00]
       /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x19e) [0x117089e]
       /usr/bin/mongod() [0x121f819]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7f7f08baf182]
       /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f7f07eb430d]
      

      Another crash, day before, same operation.

      2014-07-07T09:59:36.387+0200 [conn405111] end connection 192.168.0.1:51685 (32 connections now open)
      2014-07-07T09:59:36.388+0200 [initandlisten] connection accepted from 192.168.0.1:51968 #405172 (33 connections now open)
      2014-07-07T09:59:37.589+0200 [conn405168] end connection 172.17.0.13:37430 (32 connections now open)
      2014-07-07T09:59:38.533+0200 [repl index builder 11740] build index on: 558-613-4859.keyword_adgroup properties: { v: 1, key: { campaign: 1 }, name: "campaign_1", ns: "558-613-4859.keyword_adgroup", background: true }
      2014-07-07T09:59:38.533+0200 [repl index builder 11740] building index in background
      2014-07-07T09:59:38.535+0200 [repl writer worker 8] CMD: drop 558-613-4859.keyword_adgroup
      2014-07-07T09:59:38.535+0200 [repl writer worker 8] halting index build: { campaign: 1 }
      2014-07-07T09:59:38.535+0200 [repl writer worker 8] halting index build: { adgroup: 1 }
      2014-07-07T09:59:38.535+0200 [repl writer worker 8] SEVERE: Invalid access at address: 0x1c00736e3a
      2014-07-07T09:59:38.541+0200 [repl writer worker 8] SEVERE: Got signal: 11 (Segmentation fault).
      Backtrace:0x11dae01 0x11da1de 0x11da2cf 0x7fbb88e7c340 0x8d5ec7 0xb856b2 0xa29ba9 0xa2a8d0 0xa2296a 0xa24958 0xa26279 0xe48ab5 0xeb27b9 0xeb2d00 0x117089e 0x121f819 0x7fbb88e74182 0x7fbb8817930d 
      /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11dae01]
      /usr/bin/mongod() [0x11da1de]
      /usr/bin/mongod() [0x11da2cf]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7fbb88e7c340]
      /usr/bin/mongod(_ZN5mongo12IndexCatalog23killMatchingIndexBuildsERKNS0_17IndexKillCriteriaE+0x217) [0x8d5ec7]
      /usr/bin/mongod(_ZN5mongo12IndexBuilder23killMatchingIndexBuildsEPNS_10CollectionERKNS_12IndexCatalog17IndexKillCriteriaE+0x12) [0xb856b2]
      /usr/bin/mongod(_ZN5mongo7CmdDrop15stopIndexBuildsEPNS_8DatabaseERKNS_7BSONObjE+0xe9) [0xa29ba9]
      /usr/bin/mongod(_ZN5mongo7CmdDrop3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x1c0) [0xa2a8d0]
      /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa2296a]
      /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x1618) [0xa24958]
      /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6c9) [0xa26279]
      /usr/bin/mongod(_ZN5mongo21applyOperation_inlockERKNS_7BSONObjEbb+0x8d5) [0xe48ab5]
      /usr/bin/mongod(_ZN5mongo7replset8SyncTail9syncApplyERKNS_7BSONObjEb+0x749) [0xeb27b9]
      /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x50) [0xeb2d00]
      /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x19e) [0x117089e]
      /usr/bin/mongod() [0x121f819]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182) [0x7fbb88e74182]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fbb8817930d]
      

      Mongo 2.6.2 is runnig on top of Ubuntu 14.04 x86_64 (10gen mongo packages)

      mongod.conf:

      dbpath=/var/lib/mongodb
      logpath=/var/log/mongodb/mongod.log
      logappend=true
      smallfiles = true
      oplogSize=20480
      replSet=rs0
      

            Assignee:
            milkie@mongodb.com Eric Milkie
            Reporter:
            bercab Bernardo Cabezas Serra
            Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated:
              Resolved: