  1. Core Server
  2. SERVER-22561

RangeDeleter crashes PRIMARY in a very bad decision

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Replication
    • Labels:
      None
    • ALL

      Topology: primary, arbiter, secondary. The secondary has been more than 1 hour behind, for whatever reason. The range deleter then hits a fatal assertion and crashes the primary.

      Once the primary crashes, the secondary takes its place. But it is far behind, by at least 1 hour, by definition. Is it smart to force its election?

      Are you doing any testing on sharded replicated clusters? There is no way this code path was ever tested. Sorry.
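
To make the situation concrete, here is a rough sketch of why a majority write cannot be acknowledged in this topology while the secondary lags. It is a simplified model, not server code: the `Member` class, `write_majority`, and `can_ack_majority` are hypothetical names, and the rule used (the smaller of the voting majority and the number of data-bearing voters) is my understanding of how the server counts, stated here as an assumption.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical, simplified view of one replica set member."""
    name: str
    votes: int = 1
    bears_data: bool = True       # False for an arbiter
    lag_seconds: float = 0.0      # how far behind the primary this member is


def write_majority(members):
    """Acknowledgements needed for a majority write (simplified assumption)."""
    voters = [m for m in members if m.votes > 0]
    voting_majority = len(voters) // 2 + 1
    data_bearing_voters = sum(1 for m in voters if m.bears_data)
    return min(voting_majority, data_bearing_voters)


def can_ack_majority(members, max_lag_seconds):
    """True if enough data-bearing voters are within max_lag_seconds of the primary."""
    caught_up = sum(
        1 for m in members
        if m.votes > 0 and m.bears_data and m.lag_seconds <= max_lag_seconds
    )
    return caught_up >= write_majority(members)


# The topology from this report: primary, arbiter, secondary one hour behind.
members = [
    Member("primary"),
    Member("arbiter", bears_data=False),
    Member("secondary", lag_seconds=3600),
]

print(write_majority(members))                         # 2
print(can_ack_majority(members, max_lag_seconds=60))   # False
```

With only two data-bearing members, the lagging secondary is the only node that can supply the second acknowledgement, so the range deleter's wait cannot finish until it catches up.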

      2016-02-10T17:14:49.709+0000 I SHARDING [RangeDeleter] rangeDeleter took 3600 seconds waiting for deletes to be replicated to majority nodes
      2016-02-10T17:14:49.722+0000 I - [RangeDeleter] Fatal assertion 18512 WriteConcernFailed waiting for replication timed out
      2016-02-10T17:14:49.905+0000 I CONTROL [RangeDeleter] 0x12d5772 0x12713d4 0x125d662 0xdf123e 0xdf2865 0x1a99100 0x7f4d75244df3 0x7f4d74f721bd
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"ED5772"},{"b":"400000","o":"E713D4"},{"b":"400000","o":"E5D662"},{"b":"400000","o":"9F123E"},{"b":"400000","o":"9F2865"},{"b":"400000","o":"1699100"},{"b":"7F4D7523D000","o":"7DF3"},{"b":"7F4D74E7C000","o":"F61BD"}],"processInfo":{ "mongodbVersion" : "3.2.1", "gitVersion" : "a14d55980c2cdc565d4704a7e3ad37e4e535c1b2", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.14.19-17.43.amzn1.x86_64", "version" : "#1 SMP Wed Sep 17 22:14:52 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "A931E563057BCC34ED2D1971AAC77D44F47033A9" }, { "b" : "7FFF38CFE000", "elfType" : 3, "buildId" : "8E3D893F8991DFE6C5D9AB55196714E9AF81DC88" }, { "b" : "7F4D7646A000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "22480480235F3B1C6C2E5E5953949728676D3796" }, { "b" : "7F4D76085000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "ADD80D7DBE8B04C3BA8E3242D96F39FF870A862A" }, { "b" : "7F4D75E7D000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "E81013CBFA409053D58A65A0653271AB665A4619" }, { "b" : "7F4D75C79000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "62A8842157C62F95C3069CBF779AFCC26577A99A" }, { "b" : "7F4D75970000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "66F1CF311C61879639BD3DC0034DEE0D6D042261" }, { "b" : "7F4D7566E000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "5F97F8F8E5024E29717CF35998681F84D4A22D45" }, { "b" : "7F4D75459000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "E77BA674F63D5C56373C03316B5E74C5C781A0BC" }, { "b" : "7F4D7523D000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "D48D3E6672A77B603B402F661BABF75E90AD570B" }, { "b" : "7F4D74E7C000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "DF6DA145A649EA093507A635AF383F608E7CE3F2" }, { "b" : "7F4D766D7000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "6F90843B9087FE91955FEB0355EB0858EF9E97B2" }, { "b" : "7F4D74C39000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "DE5A9F7A11A0881CB64E375F4DDCA58028F0FAF8" }, { "b" : "7F4D74954000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "A3E43FC66908AC8B00773707FECA3B1677AFF311" }, { "b" : "7F4D74751000", "path" : "/usr/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "622F315EB5CB2F791E9B64020692EBA98195D06D" }, { "b" : "7F4D74526000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "B10FBFEC246C4EAD1719D16090D0BE54904BBFC9" }, { "b" : "7F4D7430F000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "E492542502DF88A2F752AD77D1905D13FF1AC6FF" }, { "b" : "7F4D74104000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "7292C0673D7C116E3389D3FFA67087A6B9287A71" }, { "b" : "7F4D73F01000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "BF48CD5658DE95CE058C4B828E81C97E2AE19643" }, { "b" : "7F4D73CE7000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "6A7DA1CED90F65F27CB7B5BACDBB1C386C05F592" }, { "b" : "7F4D73AC6000", "path" : "/usr/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "803D7EF21A989677D056E52BAEB9AB5B154FB9D9" } ] }}
      mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x12d5772]
      mongod(_ZN5mongo10logContextEPKc+0x134) [0x12713d4]
      mongod(_ZN5mongo23fassertFailedWithStatusEiRKNS_6StatusE+0x62) [0x125d662]
      mongod(+0x9F123E) [0xdf123e]
      mongod(_ZN5mongo12RangeDeleter6doWorkEv+0x245) [0xdf2865]
      mongod(+0x1699100) [0x1a99100]
      libpthread.so.0(+0x7DF3) [0x7f4d75244df3]
      libc.so.6(clone+0x6D) [0x7f4d74f721bd]
      ----- END BACKTRACE -----
      2016-02-10T17:14:49.907+0000 I - [RangeDeleter] ***aborting after fassert() failure
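
For anyone scanning their own logs for the same crash, a small sketch of pulling the fatal assertion out of mongod output follows. It assumes one log entry per line and the 3.2-style layout seen in this report (timestamp, severity, component, bracketed thread name); the regular expression and the `find_fatal_assertions` helper are hypothetical and only fitted to the lines in this report.

```python
import re

# Pattern fitted to the log lines in this report (an assumption, not an
# official format definition): timestamp, severity, component, [thread], text.
FASSERT_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>[A-Z])\s+(?P<component>\S+)\s+"
    r"\[(?P<thread>[^\]]+)\]\s+Fatal assertion\s+(?P<code>\d+)\s+"
    r"(?P<error>\w+):?\s*(?P<reason>.*)$"
)


def find_fatal_assertions(lines):
    """Return one dict per line that reports a fatal assertion."""
    hits = []
    for line in lines:
        match = FASSERT_RE.match(line.strip())
        if match:
            hit = match.groupdict()
            hit["code"] = int(hit["code"])
            hits.append(hit)
    return hits


sample = [
    "2016-02-10T17:14:49.709+0000 I SHARDING [RangeDeleter] rangeDeleter took "
    "3600 seconds waiting for deletes to be replicated to majority nodes",
    "2016-02-10T17:14:49.722+0000 I - [RangeDeleter] Fatal assertion 18512 "
    "WriteConcernFailed waiting for replication timed out",
]

for hit in find_fatal_assertions(sample):
    print(hit["thread"], hit["code"], hit["error"], "-", hit["reason"])
```

On the two sample lines, only the second matches, yielding thread `RangeDeleter`, code `18512`, and error `WriteConcernFailed`.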
      

            Assignee:
            Unassigned
            Reporter:
            Yoni Douek (yonido)
            Votes:
            0
            Watchers:
            8

              Created:
              Updated:
              Resolved: