Core Server / SERVER-46876

Under eviction pressure, the compact operation should quit instead of crashing the process


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.6.17, 4.0.16
    • Fix Version/s: 4.9.0, 4.0.23, 4.4.4, 4.2.13
    • Component/s: Storage
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.4, v4.2, v4.0
    • Sprint:
      Execution Team 2021-01-25

      Description

      Currently, compact fails with this invariant when eviction pressure is detected:

      2020-03-09T15:02:13.663+0000 E STORAGE  [conn316] WiredTiger error (16) [1583766133:663912][13639:0x7f24aa4dc700], WT_SESSION.compact: __compact_worker, 302: compaction halted by eviction pressure: Device or resource busy Raw: [1583766133:663912][13639:0x7f24aa4dc700], WT_SESSION.compact: __compact_worker, 302: compaction halted by eviction pressure: Device or resource busy
      2020-03-09T15:02:13.663+0000 F -        [conn316] Invariant failure: ret resulted in status UnknownError: 16: Device or resource busy at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp 1516
      2020-03-09T15:02:13.664+0000 F -        [conn316]
       
      ***aborting after invariant() failure
       
      2020-03-09T15:02:13.689+0000 F -        [conn316] Got signal: 6 (Aborted).
       0x55ca8cd9c541 0x55ca8cd9b759 0x55ca8cd9bc3d 0x7f24d60735f0 0x7f24d5ccc337 0x7f24d5ccda28 0x55ca8b29e516 0x55ca8b6c095a 0x55ca8bad6698 0x55ca8b93d4bb 0x55ca8c7d88e6 0x55ca8c7df2f9 0x55ca8b3dfd6e 0x55ca8b3e1c69 0x55ca8b3e2bb1 0x55ca8b3ce85a 0x55ca8b3daf8a 0x55ca8b3d61e7 0x55ca8b3d9a01 0x55ca8c5c86f2 0x55ca8b3d43d0 0x55ca8b3d7515 0x55ca8b3d5927 0x55ca8b3d626d 0x55ca8b3d9a01 0x55ca8c5c8c55 0x55ca8ccf39d4 0x7f24d606be65 0x7f24d5d9488d
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"55CA8A84D000","o":"254F541","s":"_ZN5mongo15printStackTraceERSo"},{"b":"55CA8A84D000","o":"254E759"},{"b":"55CA8A84D000","o":"254EC3D"},{"b":"7F24D6064000","o":"F5F0"},{"b":"7F24D5C96000","o":"36337","s":"gsignal"},{"b":"7F24D5C96000","o":"37A28","s":"abort"},{"b":"55CA8A84D000","o":"A51516","s":"_ZN5mongo24invariantOKFailedWithMsgEPKcRKNS_6StatusERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES1_j"},{"b":"55CA8A84D000","o":"E7395A","s":"_ZN5mongo21WiredTigerRecordStore7compactEPNS_16OperationContextEPNS_25RecordStoreCompactAdaptorEPKNS_14CompactOptionsEPNS_12CompactStatsE"},{"b":"55CA8A84D000","o":"1289698","s":"_ZN5mongo14CollectionImpl7compactEPNS_16OperationContextEPKNS_14CompactOptionsE"},{"b":"55CA8A84D000","o":"10F04BB","s":"_ZN5mongo10CompactCmd9errmsgRunEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKNS_7BSONObjERS8_RNS_14BSONObjBuilderE"},{"b":"55CA8A84D000","o":"1F8B8E6","s":"_ZN5mongo23ErrmsgCommandDeprecated3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKNS_7BSONObjERNS_14BSONObjBuilderE"},{"b":"55CA8A84D000","o":"1F922F9","s":"_ZN5mongo12BasicCommand10Invocation3runEPNS_16OperationContextEPNS_19CommandReplyBuilderE"},{"b":"55CA8A84D000","o":"B92D6E"},{"b":"55CA8A84D000","o":"B94C69"},{"b":"55CA8A84D000","o":"B95BB1","s":"_ZN5mongo23ServiceEntryPointCommon13handleRequestEPNS_16OperationContextERKNS_7MessageERKNS0_5HooksE"},{"b":"55CA8A84D000","o":"B8185A","s":"_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE"},{"b":"55CA8A84D000","o":"B8DF8A","s":"_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE"},{"b":"55CA8A84D000","o":"B891E7","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},{"b":"55CA8A84D000","o":"B8CA01"},{"b":"55CA8A84D000","o":"1D7B6F2","s":"_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceEx
ecutorTaskNameE"},{"b":"55CA8A84D000","o":"B873D0","s":"_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE"},{"b":"55CA8A84D000","o":"B8A515","s":"_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE"},{"b":"55CA8A84D000","o":"B88927","s":"_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE"},{"b":"55CA8A84D000","o":"B8926D","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},{"b":"55CA8A84D000","o":"B8CA01"},{"b":"55CA8A84D000","o":"1D7BC55"},{"b":"55CA8A84D000","o":"24A69D4"},{"b":"7F24D6064000","o":"7E65"},{"b":"7F24D5C96000","o":"FE88D","s":"clone"}],"processInfo":{ "mongodbVersion" : "4.0.16", "gitVersion" : "2a5433168a53044cb6b4fa8083e4cfd7ba142221", "compiledModules" : [ "enterprise" ], "uname" : { "sysname" : "Linux", "release" : "3.10.0-1062.12.1.el7.x86_64", "version" : "#1 SMP Tue Feb 4 23:02:59 UTC 2020", "machine" : "x86_64" }, "somap" : [ { "b" : "55CA8A84D000", "elfType" : 3, "buildId" : "E771748E4A839BBBD202C6EE993FAAA39DB36DAD" }, { "b" : "7FFE334E1000", "elfType" : 3, "buildId" : "4AF65CC22641CA1EF6020AAC0B8769BA121B370E" }, { "b" : "7F24D904F000", "path" : "/usr/lib64/libldap_r/libldap-2.4.so.2", "elfType" : 3, "buildId" : "E17DAD36A8A8D068135B66CFF68E2E55C0B7ECB9" }, { "b" : "7F24D8E40000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, "buildId" : "3192C56CD451E18EB9F29CB045432BA9C738DD29" }, { "b" : "7F24D8987000", "path" : "/lib64/libnetsnmpmibs.so.31", "elfType" : 3, "buildId" : "F81FF95F7D949F4600F793CD931E9D1AAA574A9D" }, { "b" : "7F24D8778000", "path" : "/lib64/libsensors.so.4", "elfType" : 3, "buildId" : "A2ACE3E193F25778AA87D2E221945FDCCFCF220F" }, { "b" : "7F24D8574000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "18113E6E83D8E981B8E8D808F7F3DBB23F950A1D" }, { "b" : "7F24D830C000", "path" : "/lib64/librpm.so.3", "elfType" : 3, "buildId" : 
"54CE5D0D50631EC1887BC8C7BBD0B91C1A9484E9" }, { "b" : "7F24D80DF000", "path" : "/lib64/librpmio.so.3", "elfType" : 3, "buildId" : "E1EBFDA8DAE64D8A88790EDF43107FBA7E5247BA" }, { "b" : "7F24D7E70000", "path" : "/lib64/libnetsnmpagent.so.31", "elfType" : 3, "buildId" : "364D0B1B785E4EDDC1D6DC8D93560DDCB0ADB069" }, { "b" : "7F24D7C65000", "path" : "/lib64/libwrap.so.0", "elfType" : 3, "buildId" : "8C4AA46577D3AA7EBF8338BDFAECC6586EF29906" }, { "b" : "7F24D7962000", "path" : "/lib64/libnetsnmp.so.31", "elfType" : 3, "buildId" : "1B2EFF0A2F1F6B442E4CF9762FDEA5607BE3149C" }, { "b" : "7F24D76F0000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "3B305C3BA17FE394862E749763F2956C9C890C2E" }, { "b" : "7F24D728D000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "4CF1939F660008CFA869D8364651F31AACD2C1C4" }, { "b" : "7F24D7070000", "path" : "/lib64/libsasl2.so.3", "elfType" : 3, "buildId" : "E2F2017F821DD1B9D307DA1A9B8014F2941AEB7B" }, { "b" : "7F24D6E23000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "E2AA8CA3D3164E7DBEC293BFA0B55D2B10DAC05D" }, { "b" : "7F24D6BB9000", "path" : "/lib64/libcurl.so.4", "elfType" : 3, "buildId" : "7C71A471444AD18F73AFAEA3EB42431A6DA96534" }, { "b" : "7F24D68B7000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "5681C054FDABCF789F4DDA66E94F1F6ED1747327" }, { "b" : "7F24D669E000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "3009B26B33156EAAF99787AA3DA0C6AE99649755" }, { "b" : "7F24D6496000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "4749697BF078337576C4629F0D30B296A0939779" }, { "b" : "7F24D6280000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "DAC0179F4555AEFEC9E97476201802FD20C03EC5" }, { "b" : "7F24D6064000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "8B33F7F8C86F8D544C63C5541A8E42B3DDFEF8B1" }, { "b" : "7F24D5C96000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : 
"398944D32CF16A67AF51067A326E6C0CC14F90ED" }, { "b" : "7F24D92AE000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5CC1A53B747A7E4D21198723C2B633E54F3C06D9" }, { "b" : "7F24D5A3D000", "path" : "/lib64/libssl3.so", "elfType" : 3, "buildId" : "B6321C434B5C7386B144B925CEE2798D269FDDF5" }, { "b" : "7F24D5815000", "path" : "/lib64/libsmime3.so", "elfType" : 3, "buildId" : "BDA454441F59F41D2DA36E13CEA1FC4CE95B2BBB" }, { "b" : "7F24D54E6000", "path" : "/lib64/libnss3.so", "elfType" : 3, "buildId" : "DC3B36B530F506DE4FC1A6612D7DF44D4A3DDCDB" }, { "b" : "7F24D52B6000", "path" : "/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "32C8FB6C2768FFE41E0A15CBF2089A4202CA2290" }, { "b" : "7F24D50B2000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "325B8CE57A776DE0B24B362A7E0C90E903B1A4B8" }, { "b" : "7F24D4EAD000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "0460FF10A3C63749113D380C40E10DFCF066C76E" }, { "b" : "7F24D4C6F000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "8840B019EDB66B0CFBD2F77EF196440F7928106E" }, { "b" : "7F24D48E1000", "path" : "/usr/lib64/perl5/CORE/libperl.so", "elfType" : 3, "buildId" : "E2C3C10A756404CC8888CD6CA8DFAD26064EF3CB" }, { "b" : "7F24D46C7000", "path" : "/lib64/libnsl.so.1", "elfType" : 3, "buildId" : "DD24971BA9AB317654ED2C1DCEB76BBDCDA5A6D1" }, { "b" : "7F24D4490000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "84467C988F41D853C58353BEB247670E15DA8BAD" }, { "b" : "7F24D428D000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "E0D39E293DC99997E7B4C9B6203301E6CD904B50" }, { "b" : "7F24D407D000", "path" : "/lib64/libbz2.so.1", "elfType" : 3, "buildId" : "0C85C0386F0CF41EA39969CF7F58A558D1AD3235" }, { "b" : "7F24D3E67000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "B9D5F73428BD6AD68C96986B57BEA3B7CEDB9745" }, { "b" : "7F24D3C4F000", "path" : "/lib64/libelf.so.1", "elfType" : 3, "buildId" : "F580CBEA123378EEDE9427F54758697A458411F5" }, { "b" : 
"7F24D3A29000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "3B2C97C1937B73A69C412A96D0810C43DF0C6F54" }, { "b" : "7F24D381F000", "path" : "/lib64/libpopt.so.0", "elfType" : 3, "buildId" : "7AE00165FBAF6920DD5AED6905820DDBAB589E84" }, { "b" : "7F24D35F8000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "D2DD4DA3FDE1477D25BFFF80F3A25FDB541A8179" }, { "b" : "7F24D33F3000", "path" : "/lib64/libcap.so.2", "elfType" : 3, "buildId" : "3BC565E0565C33B1BD37AE0070F7D8E2CE4313E4" }, { "b" : "7F24D31EA000", "path" : "/lib64/libacl.so.1", "elfType" : 3, "buildId" : "7F39882FC0B80BE53790C2EAC307D39F7DE1AD6E" }, { "b" : "7F24D2FBC000", "path" : "/lib64/liblua-5.1.so", "elfType" : 3, "buildId" : "BDD4B9CFC1D3F31D3A5A430D2F9080E020C5B0BA" }, { "b" : "7F24D2BFD000", "path" : "/lib64/libdb-5.3.so", "elfType" : 3, "buildId" : "CA8916E2C5EB6FF8582E059700E3347178823728" }, { "b" : "7F24D29D4000", "path" : "/lib64/libaudit.so.1", "elfType" : 3, "buildId" : "2E36E1B9A2D92C969E38CDDCC729F55D8BACBB2B" }, { "b" : "7F24D26EB000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "3EE7267AF7BFD3B132E6A222D997DA09C96C90DD" }, { "b" : "7F24D24E7000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "67E935BFABA2C914C01156B88947DD515EA51170" }, { "b" : "7F24D22B4000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "82E28CACB60C27CD6F14A6D2268F0CFF621664D0" }, { "b" : "7F24D20A4000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "4F5FBB2087BE132892467C4E7A46A3D07E5DA40B" }, { "b" : "7F24D1EA0000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7F24D1C6D000", "path" : "/lib64/libidn.so.11", "elfType" : 3, "buildId" : "2B77BBEFFF65E94F3E0B71A4E89BEB68C4B476C5" }, { "b" : "7F24D1A40000", "path" : "/lib64/libssh2.so.1", "elfType" : 3, "buildId" : "1AF123CADB2F2910E89CBD540A06D3B33692F95E" }, { "b" : "7F24D183D000", "path" : 
"/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "197680DAE6538245CB99723E57447C4EF2E98362" }, { "b" : "7F24D15DB000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "9CA3D11F018BEEB719CDB34BE800BF1641350D0A" }, { "b" : "7F24D13D6000", "path" : "/lib64/libattr.so.1", "elfType" : 3, "buildId" : "2617ECC6738047E207AE3ADD990BD6A34D11B265" }, { "b" : "7F24D11D0000", "path" : "/lib64/libcap-ng.so.0", "elfType" : 3, "buildId" : "43578677DF613E9D58128ED4AE0C344FBC1E44C0" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x55ca8cd9c541]
       mongod(+0x254E759) [0x55ca8cd9b759]
       mongod(+0x254EC3D) [0x55ca8cd9bc3d]
       libpthread.so.0(+0xF5F0) [0x7f24d60735f0]
       libc.so.6(gsignal+0x37) [0x7f24d5ccc337]
       libc.so.6(abort+0x148) [0x7f24d5ccda28]
       mongod(_ZN5mongo24invariantOKFailedWithMsgEPKcRKNS_6StatusERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES1_j+0x0) [0x55ca8b29e516]
       mongod(_ZN5mongo21WiredTigerRecordStore7compactEPNS_16OperationContextEPNS_25RecordStoreCompactAdaptorEPKNS_14CompactOptionsEPNS_12CompactStatsE+0xBA) [0x55ca8b6c095a]
       mongod(_ZN5mongo14CollectionImpl7compactEPNS_16OperationContextEPKNS_14CompactOptionsE+0x1A8) [0x55ca8bad6698]
       mongod(_ZN5mongo10CompactCmd9errmsgRunEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKNS_7BSONObjERS8_RNS_14BSONObjBuilderE+0x96B) [0x55ca8b93d4bb]
       mongod(_ZN5mongo23ErrmsgCommandDeprecated3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKNS_7BSONObjERNS_14BSONObjBuilderE+0x46) [0x55ca8c7d88e6]
       mongod(_ZN5mongo12BasicCommand10Invocation3runEPNS_16OperationContextEPNS_19CommandReplyBuilderE+0xD9) [0x55ca8c7df2f9]
       mongod(+0xB92D6E) [0x55ca8b3dfd6e]
       mongod(+0xB94C69) [0x55ca8b3e1c69]
       mongod(_ZN5mongo23ServiceEntryPointCommon13handleRequestEPNS_16OperationContextERKNS_7MessageERKNS0_5HooksE+0x3D1) [0x55ca8b3e2bb1]
       mongod(_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE+0x3A) [0x55ca8b3ce85a]
       mongod(_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE+0xBA) [0x55ca8b3daf8a]
       mongod(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0x97) [0x55ca8b3d61e7]
       mongod(+0xB8CA01) [0x55ca8b3d9a01]
       mongod(_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE+0x1A2) [0x55ca8c5c86f2]
       mongod(_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE+0x150) [0x55ca8b3d43d0]
       mongod(_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE+0xB55) [0x55ca8b3d7515]
       mongod(_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE+0x357) [0x55ca8b3d5927]
       mongod(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0x11D) [0x55ca8b3d626d]
       mongod(+0xB8CA01) [0x55ca8b3d9a01]
       mongod(+0x1D7BC55) [0x55ca8c5c8c55]
       mongod(+0x24A69D4) [0x55ca8ccf39d4]
       libpthread.so.0(+0x7E65) [0x7f24d606be65]
       libc.so.6(clone+0x6D) [0x7f24d5d9488d]
      -----  END BACKTRACE  -----
      

      However, based on session_compact.c (around line 270), compaction is meant to simply quit when eviction is the problem, so this condition should not be fatal:

      /*
       * If compaction failed because checkpoint was running, continue with the next handle.
       * We might continue to race with checkpoint on each handle, but that's OK, we'll step
       * through all the handles, and then we'll block until a checkpoint completes.
       *
       * Just quit if eviction is the problem.
       */
      if (ret == EBUSY) {
          if (__wt_cache_stuck(session)) {
              WT_ERR_MSG(session, EBUSY,
                "compaction halted by eviction "
                "pressure");
          }
          ret = 0;
          another_pass = true;
      }
      WT_ERR(ret);
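
The behaviour requested here could look roughly like the sketch below: the caller inspects the return code from compact and reports EBUSY as a recoverable failure instead of tripping an invariant. This is only an illustration and not the actual fix. `CompactStatus` and `statusFromCompactReturn` are hypothetical names invented for this example and do not correspond to the real mongo::Status or WiredTigerRecordStore code.

```cpp
#include <cerrno>
#include <string>

// Hypothetical stand-in for a server status type; not the real mongo::Status.
struct CompactStatus {
    bool ok;
    std::string reason;
};

// Hypothetical helper: translate the integer return code of a compact call
// into a status the command layer can hand back to the client.
// EBUSY (errno 16, "Device or resource busy") is reported as a non-fatal
// interruption instead of aborting the process.
inline CompactStatus statusFromCompactReturn(int ret) {
    if (ret == 0) {
        return {true, ""};
    }
    if (ret == EBUSY) {
        return {false, "compaction interrupted by eviction pressure; retry later"};
    }
    // Any other error is still surfaced, with its numeric code preserved.
    return {false, "compaction failed with error code " + std::to_string(ret)};
}
```

With a mapping like this, the client that issued compact would receive an error it can retry, and the mongod process would keep running.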
      

              People

              Assignee: Gregory Wlodarek (gregory.wlodarek)
              Reporter: Dmitry Agranat (dmitry.agranat)