Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-36841

TTL thread can (very rarely) race with shutdown and crash the server

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7.0
    • Component/s: Storage
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Sprint:
      Execution Team 2020-04-20, Execution Team 2020-05-04
    • Linked BF Score:
      45

      Description

      The TTL thread has an OperationContext, but takes no locks. Therefore, it can race with shutdown of the storage engine, which normally waits for operations to finish by acquiring the global X lock. Then, TTL thread's OperationContext's RecoveryUnit destructor can accesses the storage engine and crash like so:

      [ShardedClusterFixture:job9] Successfully stopped mongos.
      [ShardedClusterFixture:job9] Stopping shard...
      [ShardedClusterFixture:job9:shard0] Stopping mongod on port 22251 with pid 22574...
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.549+0000 I CONTROL  [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.549+0000 I NETWORK  [signalProcessingThread] shutdown: going to close listening sockets...
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.549+0000 I NETWORK  [signalProcessingThread] removing socket file: /tmp/mongodb-22251.sock
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.550+0000 I ASIO     [signalProcessingThread] Dropping all pooled connections to localhost:22250 due to ShutdownInProgress: Shuting down the connection pool
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.553+0000 W SHARDING [signalProcessingThread] error encountered while cleaning up distributed ping entry for rhel71-ppc-3.pic.build.10gen.cc:22251:1531671881:178497171568223908 :: caused by :: ShutdownInProgress: Shutdown in progress
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.556+0000 W SHARDING [signalProcessingThread] cant reload ShardRegistry  :: caused by :: CallbackCanceled: Callback canceled
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.561+0000 I FTDC     [signalProcessingThread] Shutting down full-time diagnostic data capture
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.564+0000 I STORAGE  [signalProcessingThread] WiredTigerKVEngine shutting down
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.565+0000 E INDEX    [TTLMonitor] Error processing ttl index: { v: 2, key: { lastUse: 1 }, name: "lsidTTLIndex", ns: "config.system.sessions", expireAfterSeconds: 1800 } -- InterruptedAtShutdown: interrupted at shutdown
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.565+0000 I STORAGE  [TTLMonitor] WiredTigerSizeStorer magic wrong: 11111
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.565+0000 F -        [TTLMonitor] Invariant failure _magic == MAGIC src/mongo/db/storage/wiredtiger/wiredtiger_size_storer.cpp 88
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.565+0000 F -        [TTLMonitor]
      [ShardedClusterFixture:job9:shard0] 
      [ShardedClusterFixture:job9:shard0] ***aborting after invariant() failure
      [ShardedClusterFixture:job9:shard0] 
      [ShardedClusterFixture:job9:shard0] 
      [ShardedClusterFixture:job9:shard0] 2018-07-15T16:26:22.579+0000 F -        [TTLMonitor] Got signal: 6 (Aborted).
      [ShardedClusterFixture:job9:shard0]  0x48d68e70 0x48d67c94 0x48d682f4 0x3fff85c60478 0x3fff84cc072c 0x46e0acf4 0x471de0d4 0x471df648 0x471b334c 0x471baf80 0x471dbf10 0x471d88b8 0x471d8938 0x48c6f430 0x473a369c 0x473a3c78 0x48c646fc 0x48eb6c24 0x3fff84e68944 0x3fff84d97640
      [ShardedClusterFixture:job9:shard0] ----- BEGIN BACKTRACE -----
      [ShardedClusterFixture:job9:shard0] {"backtrace":[{"b":"46400000","o":"2968E70","s":"_ZN5mongo15printStackTraceERSo"},{"b":"46400000","o":"2967C94"},{"b":"46400000","o":"29682F4"},{"b":"3FFF85C60000","o":"478","s":"__kernel_sigtramp_rt64"},{"b":"3FFF84C80000","o":"4072C","s":"abort"},{"b":"46400000","o":"A0ACF4","s":"_ZN5mongo15invariantFailedEPKcS1_j"},{"b":"46400000","o":"DDE0D4","s":"_ZNK5mongo20WiredTigerSizeStorer11_checkMagicEv"},{"b":"46400000","o":"DDF648","s":"_ZN5mongo20WiredTigerSizeStorer9syncCacheEb"},{"b":"46400000","o":"DB334C","s":"_ZNK5mongo18WiredTigerKVEngine12syncSizeInfoEb"},{"b":"46400000","o":"DBAF80","s":"_ZNK5mongo18WiredTigerKVEngine15haveDropsQueuedEv"},{"b":"46400000","o":"DDBF10","s":"_ZN5mongo22WiredTigerSessionCache14releaseSessionEPNS_17WiredTigerSessionE"},{"b":"46400000","o":"DD88B8","s":"_ZN5mongo22WiredTigerRecoveryUnitD1Ev"},{"b":"46400000","o":"DD8938","s":"_ZN5mongo22WiredTigerRecoveryUnitD0Ev"},{"b":"46400000","o":"286F430","s":"_ZNK5mongo14ServiceContext23OperationContextDeleterclEPNS_16OperationContextE"},{"b":"46400000","o":"FA369C","s":"_ZN5mongo10TTLMonitor9doTTLPassEv"},{"b":"46400000","o":"FA3C78","s":"_ZN5mongo10TTLMonitor3runEv"},{"b":"46400000","o":"28646FC","s":"_ZN5mongo13BackgroundJob7jobBodyEv"},{"b":"46400000","o":"2AB6C24"},{"b":"3FFF84E60000","o":"8944"},{"b":"3FFF84C80000","o":"117640","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.7.9-130-gcf1cb92", "gitVersion" : "cf1cb92a0d1c7ca4156cc65e8af440f4a8678219", "compiledModules" : [ "enterprise" ], "uname" : { "sysname" : "Linux", "release" : "3.10.0-229.ael7b.ppc64le", "version" : "#1 SMP Fri Jan 30 12:03:50 EST 2015", "machine" : "ppc64le" }, "somap" : [ { "b" : "46400000", "elfType" : 3, "buildId" : "469FC351D7C70B319E47E65F0CAD5AAF79DCAD2B" }, { "b" : "3FFF85C60000", "elfType" : 3, "buildId" : "179B6FB935AA023578009C7C3C0B7F09BC50ACB9" }, { "b" : "3FFF85920000", "path" : "/lib64/libnetsnmpmibs.so.31", "elfType" : 3, "buildId" : "ED5B3DACDB1C89AAD56DCCC28E56D547B1BA79CD" }, { "b" : "3FFF858F0000", "path" : "/lib64/libsensors.so.4", "elfType" : 3, "buildId" : "8D1FADE0D56540352C0C0F4CE308D96BA3F85E70" }, { "b" : "3FFF858C0000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "57734949376D241AF73F054863FB53E1F93D222D" }, { "b" : "3FFF85820000", "path" : "/lib64/librpm.so.3", "elfType" : 3, "buildId" : "13C0FC9BB78549D8B7849AFFEF78B1C7C9795C9F" }, { "b" : "3FFF857C0000", "path" : "/lib64/librpmio.so.3", "elfType" : 3, "buildId" : "8327016318024432B934A0FC928DDCECE10312E9" }, { "b" : "3FFF85710000", "path" : "/lib64/libnetsnmpagent.so.31", "elfType" : 3, "buildId" : "7101FE545A10E550BEC342129449705A1A728804" }, { "b" : "3FFF856E0000", "path" : "/lib64/libwrap.so.0", "elfType" : 3, "buildId" : "F5F1A860B171505B7A8E2AFDEFB3325E8E66231B" }, { "b" : "3FFF85590000", "path" : "/lib64/libnetsnmp.so.31", "elfType" : 3, "buildId" : "516868CBB10374155A384239D1A189CD248A1743" }, { "b" : "3FFF854F0000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "E46B75B46C31B1E307D0C04DE48C42E91BBF6219" }, { "b" : "3FFF85270000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "AB6F9785D492FDF0F4A2AC06C7D08DA8889E0C36" }, { "b" : "3FFF851F0000", "path" : "/lib64/libldap-2.4.so.2", "elfType" : 3, "buildId" : "05EEB4AD5E86B9B75EA521B67F6BD98EC5B1896B" }, { "b" : "3FFF851B0000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, "buildId" : "ED908B8E82E86713B780510C94F37E8D0DA97D05" }, { "b" : "3FFF85160000", "path" : "/lib64/libsasl2.so.3", "elfType" : 3, "buildId" : "2559B3B52A1555DD3CB244ABB24C0F2A9C08B366" }, { "b" : "3FFF850E0000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "3CDBA9444A5FC8958F5749ACF83E8B51DA626FBF" }, { "b" : "3FFF85040000", "path" : "/lib64/libcurl.so.4", "elfType" : 3, "buildId" : "3DDFB615CF269A8A71319B78D464E2B762D87D9B" }, { "b" : "3FFF84F50000", "path" : "/lib64/power8/libm.so.6", "elfType" : 3, "buildId" : "0CAADB7D836B16CDD176833A0DAE38764E28006F" }, { "b" : "3FFF84F10000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "8EE9D450F181F791C2FB0C30DE3112E7BEF26E3A" }, { "b" : "3FFF84EE0000", "path" : "/lib64/power8/librt.so.1", "elfType" : 3, "buildId" : "626D07527A63941827A52A38E928E1C6807D3821" }, { "b" : "3FFF84EA0000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "4E2CAE4975DA8B74EA3059838BDD2ED50EA5EB2A" }, { "b" : "3FFF84E60000", "path" : "/lib64/power8/libpthread.so.0", "elfType" : 3, "buildId" : "4C0A14BCEE88F2C83BC28A4ADF87DD3A74AD853F" }, { "b" : "3FFF84C80000", "path" : "/lib64/power8/libc.so.6", "elfType" : 3, "buildId" : "552EC71564694CA8066C907BD54328EE0C250BCF" }, { "b" : "3FFF85C80000", "path" : "/lib64/ld64.so.2", "elfType" : 3, "buildId" : "9F622117176D7D51DC0F827E25352994B88A5489" }, { "b" : "3FFF84AA0000", "path" : "/usr/lib64/perl5/CORE/libperl.so", "elfType" : 3, "buildId" : "70D11256C4702E4FBAE6B58E876818C6A76476D5" }, { "b" : "3FFF84A60000", "path" : "/lib64/libnsl.so.1", "elfType" : 3, "buildId" : "6B2BF9B1B99F8A27C3DDEA4905D5AD5706232491" }, { "b" : "3FFF84A10000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "97AC40E63A13D4E5404A75A66C57C4F63B7A3425" }, { "b" : "3FFF849E0000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "A06F2B58CA09E0BF147841CBB1673F8A4227E5B4" }, { "b" : "3FFF84860000", "path" : "/lib64/libnss3.so", "elfType" : 3, "buildId" : "073498A05A8D524707E80B0ADDEAA7D747107A63" }, { "b" : "3FFF84820000", "path" : "/lib64/libbz2.so.1", "elfType" : 3, "buildId" : "16DDE2253C5115465E86C884D5A3E6A6A9D47133" }, { "b" : "3FFF847E0000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "39B56D8C7ACB8559B96CA80488A0CC5D60E5215F" }, { "b" : "3FFF847A0000", "path" : "/lib64/libelf.so.1", "elfType" : 3, "buildId" : "AB56674789921A58C89401ABFFA80A6DF0B38114" }, { "b" : "3FFF84740000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "A6A50FDF1BC1473396EEA9135234D1093E5F8F3A" }, { "b" : "3FFF84710000", "path" : "/lib64/libpopt.so.0", "elfType" : 3, "buildId" : "60B618082A24CD52235B29E542B374F68EB35C9E" }, { "b" : "3FFF846C0000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "0157498135739C8EC8A44CB911D45784BEEF66F0" }, { "b" : "3FFF84690000", "path" : "/lib64/libcap.so.2", "elfType" : 3, "buildId" : "FDD9669E60A5F4BB9A0D4AAD8FEACE1729003CCB" }, { "b" : "3FFF84660000", "path" : "/lib64/libacl.so.1", "elfType" : 3, "buildId" : "D95B880186196B51D0A5A28D0A9223F2D9873BEB" }, { "b" : "3FFF84600000", "path" : "/lib64/liblua-5.1.so", "elfType" : 3, "buildId" : "1A0DEC8FB9A2DA209A417A8B36105ECFB4C7D33D" }, { "b" : "3FFF843F0000", "path" : "/lib64/libdb-5.3.so", "elfType" : 3, "buildId" : "9C3A59FEA55DE5E3F95C04CAC7A282E4EB1E0054" }, { "b" : "3FFF842D0000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "D5C531B4F180EACDA6E668517C9C91C2E386B51D" }, { "b" : "3FFF842A0000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "D3778E197923CB441680BF2B0D4BD3FA51CF4B38" }, { "b" : "3FFF84240000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "C9B3E6C2792B6244EF254D26AC44E5E841AD5CA0" }, { "b" : "3FFF841C0000", "path" : "/lib64/libssl3.so", "elfType" : 3, "buildId" : "9A22C14B6BBFBD05EC534B313D772C49BE5311BD" }, { "b" : "3FFF84170000", "path" : "/lib64/libsmime3.so", "elfType" : 3, "buildId" : "33441AEA8B19A242A0D288EEE2423547E1B2243C" }, { "b" : "3FFF84110000", "path" : "/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "560A3EEBE98BEAE169D217DC2EA918E408829C30" }, { "b" : "3FFF840E0000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "6FB273CE4ABD4B2ADCDA8DAED5408688FB2E6BF2" }, { "b" : "3FFF840A0000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "2640F6F5414CCA44E1B62CC76DB68E20F7D409D7" }, { "b" : "3FFF84030000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "675263E4FB3DCC6188EFDC3A2E48B4407A91D1A3" }, { "b" : "3FFF84000000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "0F1B10767A26EFDE3F342985BA62BF06D83845DF" }, { "b" : "3FFF83FD0000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "A1142DE6F702C5B8DA7B0213B1147C8442EF6CD0" }, { "b" : "3FFF83F70000", "path" : "/lib64/libidn.so.11", "elfType" : 3, "buildId" : "F5478CCFB5C18355814D7A0D46E732117DE3506B" }, { "b" : "3FFF83F10000", "path" : "/lib64/libssh2.so.1", "elfType" : 3, "buildId" : "B8881677E2B40446F81DEFD84B9F38389AAE2DF1" }, { "b" : "3FFF83EE0000", "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "F8C761364FCAFD7071CEBF52BB580EE2B064A441" }, { "b" : "3FFF83E70000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "98C137242BC1780C52C3631508B28018CCF1FD33" }, { "b" : "3FFF83E40000", "path" : "/lib64/libattr.so.1", "elfType" : 3, "buildId" : "5321621DC83CD8C8A066A012ADFF4678A6816AB7" } ] }}
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo15printStackTraceERSo+0x80) [0x48d68e70]
      [ShardedClusterFixture:job9:shard0]  mongod(+0x2967C94) [0x48d67c94]
      [ShardedClusterFixture:job9:shard0]  mongod(+0x29682F4) [0x48d682f4]
      [ShardedClusterFixture:job9:shard0]  (__kernel_sigtramp_rt64+0x0) [0x3fff85c60478]
      [ShardedClusterFixture:job9:shard0]  libc.so.6(abort+0x27C) [0x3fff84cc072c]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo15invariantFailedEPKcS1_j+0x1C4) [0x46e0acf4]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZNK5mongo20WiredTigerSizeStorer11_checkMagicEv+0x134) [0x471de0d4]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo20WiredTigerSizeStorer9syncCacheEb+0xA8) [0x471df648]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZNK5mongo18WiredTigerKVEngine12syncSizeInfoEb+0x2C) [0x471b334c]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZNK5mongo18WiredTigerKVEngine15haveDropsQueuedEv+0x100) [0x471baf80]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo22WiredTigerSessionCache14releaseSessionEPNS_17WiredTigerSessionE+0x190) [0x471dbf10]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo22WiredTigerRecoveryUnitD1Ev+0xD8) [0x471d88b8]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo22WiredTigerRecoveryUnitD0Ev+0x28) [0x471d8938]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZNK5mongo14ServiceContext23OperationContextDeleterclEPNS_16OperationContextE+0x190) [0x48c6f430]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo10TTLMonitor9doTTLPassEv+0x9BC) [0x473a369c]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo10TTLMonitor3runEv+0x3A8) [0x473a3c78]
      [ShardedClusterFixture:job9:shard0]  mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0x15C) [0x48c646fc]
      [ShardedClusterFixture:job9:shard0]  mongod(+0x2AB6C24) [0x48eb6c24]
      [ShardedClusterFixture:job9:shard0]  libpthread.so.0(+0x8944) [0x3fff84e68944]
      [ShardedClusterFixture:job9:shard0]  libc.so.6(clone+0x98) [0x3fff84d97640]
      [ShardedClusterFixture:job9:shard0] -----  END BACKTRACE  -----
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              james.heppenstall James Heppenstall
              Reporter:
              dianna.hohensee Dianna Hohensee
              Participants:
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: