Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-17388

Invariant failure in MMAPv1 when disk full

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-rc11
    • Fix Version/s: 3.0.1, 3.1.0
    • Component/s: MMAPv1
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Completed:
    • Steps To Reproduce:
      Hide

      $ mongo --port 11011
      MongoDB shell version: 2.6.8
      connecting to: 127.0.0.1:11011/test
      > genstr
      function (len, chr) {
              assert.eq(typeof(len), "number");
              assert.gte(len, 0);
              len = Math.floor(len);
              if ( ! (typeof(chr) == "string" && chr.length == 1) ) {
                      chr = "x";
              }
              return (new Array(len+1)).join(chr);
      }
      > for(i=0;i<500;i++)db.foo.insert({a:genstr(15*1048576)})
      WriteResult({
              "nInserted" : 0,
              "writeError" : {
                      "code" : 12520,
                      "errmsg" : "new file allocation failure"
              }
      })
      > for(i=0;i<500;i++)db.foo.insert({a:genstr(1*1048576)})
      2015-02-26T16:43:15.849+1100 DBClientCursor::init call() failed
      2015-02-26T16:43:15.850+1100 Error: error doing query: failed at src/mongo/shell/collection.js:258
      2015-02-26T16:43:15.852+1100 trying reconnect to 127.0.0.1:11011 (127.0.0.1) failed
      2015-02-26T16:43:15.852+1100 warning: Failed to connect to 127.0.0.1:11011, reason: errno:111 Connection refused
      2015-02-26T16:43:15.852+1100 reconnect 127.0.0.1:11011 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:11011 (127.0.0.1), connection attempt failed
      2015-02-26T16:43:15.855+1100 trying reconnect to 127.0.0.1:11011 (127.0.0.1) failed
      2015-02-26T16:43:15.855+1100 warning: Failed to connect to 127.0.0.1:11011, reason: errno:111 Connection refused
      2015-02-26T16:43:15.855+1100 reconnect 127.0.0.1:11011 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:11011 (127.0.0.1), connection attempt failed
      2015-02-26T16:43:15.859+1100 trying reconnect to 127.0.0.1:11011 (127.0.0.1) failed
      2015-02-26T16:43:15.859+1100 warning: Failed to connect to 127.0.0.1:11011, reason: errno:111 Connection refused
      2015-02-26T16:43:15.859+1100 reconnect 127.0.0.1:11011 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:11011 (127.0.0.1), connection attempt failed
      >
      bye

      Show
      $ mongo --port 11011 MongoDB shell version: 2.6.8 connecting to: 127.0.0.1:11011/test > genstr function (len, chr) { assert.eq(typeof(len), "number"); assert.gte(len, 0); len = Math.floor(len); if ( ! (typeof(chr) == "string" && chr.length == 1) ) { chr = "x"; } return (new Array(len+1)).join(chr); } > for(i=0;i<500;i++)db.foo.insert({a:genstr(15*1048576)}) WriteResult({ "nInserted" : 0, "writeError" : { "code" : 12520, "errmsg" : "new file allocation failure" } }) > for(i=0;i<500;i++)db.foo.insert({a:genstr(1*1048576)}) 2015-02-26T16:43:15.849+1100 DBClientCursor::init call() failed 2015-02-26T16:43:15.850+1100 Error: error doing query: failed at src/mongo/shell/collection.js:258 2015-02-26T16:43:15.852+1100 trying reconnect to 127.0.0.1:11011 (127.0.0.1) failed 2015-02-26T16:43:15.852+1100 warning: Failed to connect to 127.0.0.1:11011, reason: errno:111 Connection refused 2015-02-26T16:43:15.852+1100 reconnect 127.0.0.1:11011 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:11011 (127.0.0.1), connection attempt failed 2015-02-26T16:43:15.855+1100 trying reconnect to 127.0.0.1:11011 (127.0.0.1) failed 2015-02-26T16:43:15.855+1100 warning: Failed to connect to 127.0.0.1:11011, reason: errno:111 Connection refused 2015-02-26T16:43:15.855+1100 reconnect 127.0.0.1:11011 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:11011 (127.0.0.1), connection attempt failed 2015-02-26T16:43:15.859+1100 trying reconnect to 127.0.0.1:11011 (127.0.0.1) failed 2015-02-26T16:43:15.859+1100 warning: Failed to connect to 127.0.0.1:11011, reason: errno:111 Connection refused 2015-02-26T16:43:15.859+1100 reconnect 127.0.0.1:11011 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:11011 (127.0.0.1), connection attempt failed > bye

      Description

      Issue Status as of Mar 10, 2015

      ISSUE SUMMARY
      MongoDB checks when files are closed or deleted to ensure that no journal entries reference these files so that the journal is always consistent with the data files. In addition, MongoDB writes data file creations to the journal, so they can be replayed if the server crashes. When data files fail to create because of an OS level error (e.g. volume out of space), MongoDB may incorrectly handle these file creation journal entries and treat them as referencing missing files.

      USER IMPACT
      When running with the MMAPv1 storage engine and journaling, mongod aborts if cannot create a new data file.

      WORKAROUNDS
      Ensure that the volume where mongod's dbpath exists, always has sufficient space to accommodate data size growth.

      AFFECTED VERSIONS: 3.0.0

      FIX VERSION
      The fix is included in the 3.0.1 production release.

      RESOLUTION DETAILS
      Change behavior when closing files to ensure that references in the journal entries are handled properly.

      Original description

      2015-02-26T16:43:15.756+1100 F JOURNAL  [conn1] journal warning files are closing outside locks with writes pending
      2015-02-26T16:43:15.756+1100 I -        [conn1] Invariant failure false src/mongo/db/storage/mmap_v1/dur.cpp 738
      2015-02-26T16:43:15.772+1100 I CONTROL  [conn1] 
       0xf25749 0xecf571 0xeb4612 0xcebafc 0xcfe9c6 0xd19a72 0xd185cc 0xd18740 0xd189ba 0xd1cfae 0xd291bc 0xd1d7a9 0xd1d959 0x8f073e 0x8f0f0c 0x98dd24 0x98f11a 0x9908b4 0x990fb5 0x99306d 0x9b31c4 0x9b4103 0x9b4bbb 0xb80ea5 0xa941fa 0x7e5320 0xee37d1 0x7f2618776182 0x7f2617876efd
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B25749"},{"b":"400000","o":"ACF571"},{"b":"400000","o":"AB4612"},{"b":"400000","o":"8EBAFC"},{"b":"400000","o":"8FE9C6"},{"b":"400000","o":"919A72"},{"b":"400000","o":"9185CC"},{"b":"400000","o":"918740"},{"b":"400000","o":"9189BA"},{"b":"400000","o":"91CFAE"},{"b":"400000","o":"9291BC"},{"b":"400000","o":"91D7A9"},{"b":"400000","o":"91D959"},{"b":"400000","o":"4F073E"},{"b":"400000","o":"4F0F0C"},{"b":"400000","o":"58DD24"},{"b":"400000","o":"58F11A"},{"b":"400000","o":"5908B4"},{"b":"400000","o":"590FB5"},{"b":"400000","o":"59306D"},{"b":"400000","o":"5B31C4"},{"b":"400000","o":"5B4103"},{"b":"400000","o":"5B4BBB"},{"b":"400000","o":"780EA5"},{"b":"400000","o":"6941FA"},{"b":"400000","o":"3E5320"},{"b":"400000","o":"AE37D1"},{"b":"7F261876E000","o":"8182"},{"b":"7F261777C000","o":"FAEFD"}],"processInfo":{ "mongodbVersion" : "2.8.0-rc5", "gitVersion" : "74b351de21c84438b12a83b28e155f5e69e3c1eb", "uname" : { "sysname" : "Linux", "release" : "3.13.0-39-generic", "version" : "#66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFFD9483000", "elfType" : 3 }, { "b" : "7F261876E000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3 }, { "b" : "7F2618566000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3 }, { "b" : "7F2618362000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3 }, { "b" : "7F261805E000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3 }, { "b" : "7F2617D58000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3 }, { "b" : "7F2617B42000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7F261777C000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3 }, { "b" : "7F261898C000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf25749]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xecf571]
       mongod(_ZN5mongo15invariantFailedEPKcS1_j+0xB2) [0xeb4612]
       mongod(_ZN5mongo3dur23closingFileNotificationEv+0x5C) [0xcebafc]
       mongod(_ZN5mongo17DurableMappedFileD1Ev+0xF6) [0xcfe9c6]
       mongod(_ZNSt8auto_ptrIN5mongo8DataFileEED1Ev+0x12) [0xd19a72]
       mongod(_ZN5mongo19MmapV1ExtentManager9_addAFileEPNS_16OperationContextEib+0x3CC) [0xd185cc]
       mongod(_ZN5mongo19MmapV1ExtentManager13_createExtentEPNS_16OperationContextEib+0xE0) [0xd18740]
       mongod(_ZN5mongo19MmapV1ExtentManager14allocateExtentEPNS_16OperationContextEbib+0x1DA) [0xd189ba]
       mongod(_ZN5mongo17RecordStoreV1Base19increaseStorageSizeEPNS_16OperationContextEib+0x3E) [0xd1cfae]
       mongod(_ZN5mongo19SimpleRecordStoreV111allocRecordEPNS_16OperationContextEib+0x1DC) [0xd291bc]
       mongod(_ZN5mongo17RecordStoreV1Base13_insertRecordEPNS_16OperationContextEPKcib+0x59) [0xd1d7a9]
       mongod(_ZN5mongo17RecordStoreV1Base12insertRecordEPNS_16OperationContextEPKcib+0x79) [0xd1d959]
       mongod(_ZN5mongo10Collection15_insertDocumentEPNS_16OperationContextERKNS_7BSONObjEb+0x4E) [0x8f073e]
       mongod(_ZN5mongo10Collection14insertDocumentEPNS_16OperationContextERKNS_7BSONObjEb+0x8C) [0x8f0f0c]
       mongod(_ZN5mongo18WriteBatchExecutor13execOneInsertEPNS0_16ExecInsertsStateEPPNS_16WriteErrorDetailE+0x214) [0x98dd24]
       mongod(_ZN5mongo18WriteBatchExecutor11execInsertsERKNS_21BatchedCommandRequestEPSt6vectorIPNS_16WriteErrorDetailESaIS6_EE+0x25A) [0x98f11a]
       mongod(_ZN5mongo18WriteBatchExecutor11bulkExecuteERKNS_21BatchedCommandRequestEPSt6vectorIPNS_19BatchedUpsertDetailESaIS6_EEPS4_IPNS_16WriteErrorDetailESaISB_EE+0x34) [0x9908b4]
       mongod(_ZN5mongo18WriteBatchExecutor12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x395) [0x990fb5]
       mongod(_ZN5mongo8WriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x15D) [0x99306d]
       mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x34) [0x9b31c4]
       mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xC13) [0x9b4103]
       mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x28B) [0x9b4bbb]
       mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_b+0x755) [0xb80ea5]
       mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortEb+0xB0A) [0xa941fa]
       mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xE0) [0x7e5320]
       mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x321) [0xee37d1]
       libpthread.so.0(+0x8182) [0x7f2618776182]
       libc.so.6(clone+0x6D) [0x7f2617876efd]
      -----  END BACKTRACE  -----
      2015-02-26T16:43:15.772+1100 I -        [conn1] 
       
      ***aborting after invariant() failure
      

      Before starting, the volume had about 7.6GB free:

      Filesystem               1K-blocks     Used Available Use% Mounted on
      /dev/cryptvg/root         72117576 60693448   7926492  89% /
      

      and after the crash, about 764MB free:

      Filesystem               1K-blocks     Used Available Use% Mounted on
      /dev/cryptvg/root         72117576 67838328    781612  99% /
      

      Inserted 15MB docs until out of space, then started inserting 1MB docs.

      mongod started normally with no special options. Repros consistently with logLevel 0, but when I tried -vv, it didn't fail. I then tried -v, and the invariant occurred again.

        Attachments

          Activity

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: