  1. Core Server
  2. SERVER-32432

Race condition causes seg fault in ReplicationCoordinatorExternalStateImpl shutdown

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.3, 3.7.1
    • Affects Version/s: None
    • Component/s: Replication
    • Fully Compatible
    • ALL
    • v3.6
    • Repl 2018-01-15, Repl 2018-01-29
    • 0

      In ReplicationCoordinatorExternalStateImpl, we run a SyncSourceFeedback on a background thread, passing it a pointer to a BackgroundSync object that we hold via unique_ptr. On shutdown, we std::move our BackgroundSync object before we call shutdown() on _syncSourceFeedback. If the SyncSourceFeedback run() loop is still using that pointer at this point, it will try to dereference a BackgroundSync pointer that may no longer be valid and seg fault.

      A fix would be to use a shared_ptr here to hold the BackgroundSync object, or to re-order the shutdown logic in the replication coordinator to eliminate this race.
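      The re-ordering option can be sketched roughly as follows. This is a minimal model under stated assumptions, not the actual server code or the committed patch: Source, Feedback, and Owner are hypothetical stand-ins for BackgroundSync, SyncSourceFeedback, and ReplicationCoordinatorExternalStateImpl. The point is only the order in stop(): signal the worker, join it, and release the unique_ptr last.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for BackgroundSync: state guarded by a mutex.
class Source {
public:
    std::string getTarget() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _target;
    }

private:
    mutable std::mutex _mutex;
    std::string _target = "example-host:27017";  // illustrative value only
};

// Hypothetical stand-in for SyncSourceFeedback: polls a non-owning pointer.
class Feedback {
public:
    void run(const Source* source) {
        while (!_shutdown.load()) {
            (void)source->getTarget();  // only safe while the owner keeps Source alive
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
    void shutdown() { _shutdown.store(true); }

private:
    std::atomic<bool> _shutdown{false};
};

// Hypothetical stand-in for the external state that owns both objects.
class Owner {
public:
    ~Owner() { stop(); }

    void start() {
        _source = std::make_unique<Source>();
        _thread = std::thread([this, src = _source.get()] { _feedback.run(src); });
    }

    void stop() {
        _feedback.shutdown();                  // 1. ask the worker loop to exit
        if (_thread.joinable()) {
            _thread.join();                    // 2. wait until it no longer touches Source
        }
        assert(!_thread.joinable());
        auto released = std::move(_source);    // 3. only now give up ownership
    }

    bool workerJoined() const { return !_thread.joinable(); }
    bool sourceReleased() const { return _source == nullptr; }

private:
    std::unique_ptr<Source> _source;
    Feedback _feedback;
    std::thread _thread;
};
```

      With the shared_ptr option, the worker would instead hold its own std::shared_ptr to the object, so releasing the owner's reference first would not invalidate the pointer the loop is using.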

      I saw this segmentation fault in a patch build with the following trace:

      [js_test:config_server_checks] 2017-12-20T19:41:35.701+0000 c20010| 2017-12-20T19:41:34.397+0000 I REPL     [signalProcessingThread] Stopping replication reporter thread
      [js_test:config_server_checks] 2017-12-20T19:41:35.705+0000 c20010| 2017-12-20T19:41:34.397+0000 F -        [SyncSourceFeedback] Invalid access at address: 0x30
      [js_test:config_server_checks] 2017-12-20T19:41:35.714+0000 c20010| 2017-12-20T19:41:34.397+0000 I NETWORK  [conn1] Error sending response to client: SocketException: Broken pipe. Ending connection from 127.0.0.1:55862 (connection id: 1)
      [js_test:config_server_checks] 2017-12-20T19:41:35.717+0000 c20010| 2017-12-20T19:41:34.397+0000 I NETWORK  [conn1] end connection 127.0.0.1:55862 (2 connections now open)
      [js_test:config_server_checks] 2017-12-20T19:41:35.732+0000 c20010| 2017-12-20T19:41:34.408+0000 F -        [SyncSourceFeedback] Got signal: 11 (Segmentation fault).
      [js_test:config_server_checks] 2017-12-20T19:41:35.739+0000 c20010|
      [js_test:config_server_checks] 2017-12-20T19:41:35.753+0000 c20010|  0x7f56a8ecefe1 0x7f56a8ece1f9 0x7f56a8ece866 0x7f56a3f3d7e0 0x7f56a3f37470 0x7f56a7ac8e98 0x7f56a7a2db2b 0x7f56a8fdca20 0x7f56a3f35aa1 0x7f56a3c82bcd
      [js_test:config_server_checks] 2017-12-20T19:41:35.756+0000 c20010| ----- BEGIN BACKTRACE -----
      [js_test:config_server_checks] 2017-12-20T19:41:35.807+0000 c20010| {"backtrace":[{"b":"7F56A6BA7000","o":"2327FE1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F56A6BA7000","o":"23271F9"},{"b":"7F56A6BA7000","o":"2327866"},{"b":"7F56A3F2E000","o":"F7E0"},{"b":"7F56A3F2E000","o":"9470","s":"pthread_mutex_lock"},{"b":"7F56A6BA7000","o":"F21E98","s":"_ZNK5mongo4repl14BackgroundSync13getSyncTargetEv"},{"b":"7F56A6BA7000","o":"E86B2B","s":"_ZN5mongo4repl18SyncSourceFeedback3runEPNS_8executor12TaskExecutorEPNS0_14BackgroundSyncEPNS0_22ReplicationCoordinatorE"},{"b":"7F56A6BA7000","o":"2435A20"},{"b":"7F56A3F2E000","o":"7AA1"},{"b":"7F56A3B9A000","o":"E8BCD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.7.0-322-g0b7976614d-patch-5a3aad36e3c331171c00095f", "gitVersion" : "0b7976614d028105a203147fe571b3c264e920b3", "compiledModules" : [ "enterprise" ], "uname" : { "sysname" : "Linux", "release" : "2.6.32-220.el6.x86_64", "version" : "#1 SMP Wed Nov 9 08:03:13 EST 2011", "machine" : "x86_64" }, "somap" : [ { "b" : "7F56A6BA7000", "elfType" : 3, "buildId" : "3C44E2CD7A5B8F61C8248BD49879A6D3F4BA1C5F" }, { "b" : "7FFFBA2FF000", "elfType" : 3, "buildId" : "08F634A1D22DEFF00461D50A7699DACDC97657BF" }, { "b" : "7F56A6738000", "path" : "/usr/lib64/libnetsnmpagent.so.20", "elfType" : 3, "buildId" : "1270BB069D761BD79C79F8986BB3ED5DCAA7D06D" }, { "b" : "7F56A6512000", "path" : "/usr/lib64/libnetsnmphelpers.so.20", "elfType" : 3, "buildId" : "3FA4F246A7DF00EC1355C5226C9308DC7B4AB5CD" }, { "b" : "7F56A604A000", "path" : "/usr/lib64/libnetsnmpmibs.so.20", "elfType" : 3, "buildId" : "AE65092368DDB948A32B62D613DD8FFE210EBEB9" }, { "b" : "7F56A5D6F000", "path" : "/usr/lib64/libnetsnmp.so.20", "elfType" : 3, "buildId" : "52E4D411A95E6C7FCCE0E1942B525AC8FBBDF4A8" }, { "b" : "7F187131E000", "path" : "/lib64/libldap-2.4.so.2", "elfType" : 3, "buildId" : "DDBAC283102A61D6A63B3F3952A1C06657FF3AE8" }, { "b" : "7F187150F000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, 
"buildId" : "244D2593BDE4FE657BC88572DB5DA88FA274B7F3" }, { "b" : "7F1871EF5000", "path" : "/usr/lib64/libsasl2.so.2", "elfType" : 3, "buildId" : "E0AEE889D5BF1373F2F9EE0D448DBF3F5B5113F0" }, { "b" : "7F18720B1000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "0C249DF4D77989253CCD859956BF50749308A16A" }, { "b" : "7F56A525C000", "path" : "/usr/lib64/libcurl.so.4", "elfType" : 3, "buildId" : "A38B9CE8AEAF277CBD8BC1298B1731E2C9A66192" }, { "b" : "7F1876042000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "F0BE1166EDCFFB2422B940D601A1BBD89352D80F" }, { "b" : "7F56A4DD6000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "D256E285C5E11D9A99EB04CA7651003A8F67B64E" }, { "b" : "7F56A49F1000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "1EDB45C205A844A75EBBB4F0075E705803FFB85B" }, { "b" : "7F1876BE9000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "FDF3A36FFFE08375456D59DA959EAB2FC30B6186" }, { "b" : "7F18779E5000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "1F7E85410384392BC51FA7324961719A10125F31" }, { "b" : "7F1876361000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "8A852AC42F0B64F0F30C760EBBCFA3FE4A228F12" }, { "b" : "7F187554B000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "BC7550A8A7C2D706FE4E489058BADC963465DBB7" }, { "b" : "7F1876B2E000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "85104ECFE42C606B31C2D0D0D2E5DACD3286A341" }, { "b" : "7F1876B9A000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8A7E7404A2335231BE759CB54F8041344CAC0C1B" }, { "b" : "7F187A184000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "1CC2165E019D43F71FDE0A47AF9F4C8EB5E51963" }, { "b" : "7F56A398F000", "path" : "/lib64/libwrap.so.0", "elfType" : 3, "buildId" : "083332F88CF3C61AB0184D8F397FC8BFF4548D8E" }, { "b" : "7F1875E24000", "path" : "/usr/lib64/perl5/CORE/libperl.so", "elfType" : 3, "buildId" : 
"53842C2896DED0063E1BE5C650CE97C67AE97973" }, { "b" : "7F1873C0B000", "path" : "/lib64/libnsl.so.1", "elfType" : 3, "buildId" : "D233CCCC987214EE5DACCF88949E31469228F6FF" }, { "b" : "7F1872DD4000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "F542C8ACD4AD1F2C6A551043BDFBAB051905DA1C" }, { "b" : "7F18753D1000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "2963FF1BBF4BF9131097982EB8BE5C905A342CBD" }, { "b" : "7F186F966000", "path" : "/usr/lib64/librpm.so.1", "elfType" : 3, "buildId" : "C65174824A80EDE5374CFF6143C808807160CA63" }, { "b" : "7F1870737000", "path" : "/usr/lib64/librpmio.so.1", "elfType" : 3, "buildId" : "F858A331FA080C7E82549BE3191EB4BADE02A5C0" }, { "b" : "7F187392E000", "path" : "/lib64/libpopt.so.0", "elfType" : 3, "buildId" : "E7B49911F1136073DD7DC58E8118CD9A4FBE2A19" }, { "b" : "7F1874F18000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "D053BB4FF0C2FC983842F81598813B9B931AD0D1" }, { "b" : "7F56A2508000", "path" : "/usr/lib64/libsensors.so.4", "elfType" : 3, "buildId" : "6855E5BF5B3634C15F01B1043BD892D727EE3C08" }, { "b" : "7F1870EBB000", "path" : "/usr/lib64/libssl3.so", "elfType" : 3, "buildId" : "C5EB2766ABF9ACE9E4556548DC04A37131788870" }, { "b" : "7F187088F000", "path" : "/usr/lib64/libsmime3.so", "elfType" : 3, "buildId" : "6842A55418527250648A1836541354C79613F8BD" }, { "b" : "7F1870D4C000", "path" : "/usr/lib64/libnss3.so", "elfType" : 3, "buildId" : "9221B9CD4B38C4C3FE22B82AA65E2405860E79CA" }, { "b" : "7F187231F000", "path" : "/usr/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "F1484D8815EFE9CC47C437AE0AA7A89A3B5A3A24" }, { "b" : "7F187191B000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "21B62D06504B5AC5A7A849E7C8B919DF357EBEFE" }, { "b" : "7F1870F16000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "83EB817989559AE1CBAE20564AAAB42D61532D9E" }, { "b" : "7F18720D8000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "993E6315CCFCEA516F5A0F993632DFE1A4A395A4" 
}, { "b" : "7F186EDF1000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "624C7056B8BBE6BA758DEF557F516FBDBD01E1FD" }, { "b" : "7F186DFC5000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "C81673692EEF670BC951EE726490F5D1CAB822F4" }, { "b" : "7F18721C1000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "088FB9EC41563FE043C14CA969FB38468B647B2E" }, { "b" : "7F186DFB6000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "03B69EEB8998AC9CA7519A27571BAD976BA4C56D" }, { "b" : "7F186EDB3000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "3BCCABE75DC61BBA81AAE45D164E26EF4F9F55DB" }, { "b" : "7F186CB81000", "path" : "/lib64/libidn.so.11", "elfType" : 3, "buildId" : "5659EB985475B586E3BBCB95BA21F4A30BE5EBF4" }, { "b" : "7F56A0559000", "path" : "/usr/lib64/libssh2.so.1", "elfType" : 3, "buildId" : "8727EC925D6D91DAC74A99BDE8B3C6EE96AF13EA" }, { "b" : "7F186F756000", "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "AFF1C795A3CF422C9F8AC32C7522F6376B1EA087" }, { "b" : "7F186FD45000", "path" : "/lib64/libbz2.so.1", "elfType" : 3, "buildId" : "1250B1D041DD7552F0C870BB188DC3A34DF2651D" }, { "b" : "7F186E72F000", "path" : "/usr/lib64/libelf.so.1", "elfType" : 3, "buildId" : "50517407A07B8D6C9A55A392E99246B52E8BFEEA" }, { "b" : "7F186F10E000", "path" : "/usr/lib64/liblzma.so.0", "elfType" : 3, "buildId" : "6FF9BAEEEE9DDEEF2DFA5CBD36147A75891C0AD4" }, { "b" : "7F186CEE1000", "path" : "/usr/lib64/liblua-5.1.so", "elfType" : 3, "buildId" : "6BDB4E1990D6EBA12A5C8D39A7650DB8798BF568" }, { "b" : "7F18714C2000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "B4576BE308DDCF7BC31F7304E4734C3D846D0236" }, { "b" : "7F186CEBE000", "path" : "/lib64/libcap.so.2", "elfType" : 3, "buildId" : "A436538388F1F25113FDA834CA2EED524EFA17D6" }, { "b" : "7F186D8B6000", "path" : "/lib64/libacl.so.1", "elfType" : 3, "buildId" : "26CC708AC7C0FC1797A2340C024F0ADD0CE054D8" }, { "b" : "7F186DD41000", 
"path" : "/lib64/libdb-4.7.so", "elfType" : 3, "buildId" : "D91C702275E2039E98E39925B02FF5C53A6C3312" }, { "b" : "7F186F33C000", "path" : "/lib64/libattr.so.1", "elfType" : 3, "buildId" : "8EF0683858704EF173AB11B1E27076F37F82B7B6" }, { "b" : "7F569ED37000", "path" : "/usr/lib64/sasl2/liblogin.so", "elfType" : 3, "buildId" : "9D19F93E342AA4EE2D646E64642625F365056E5C" }, { "b" : "7F569EB31000", "path" : "/usr/lib64/sasl2/libsasldb.so", "elfType" : 3, "buildId" : "4514552B5354286A143770420B38F2D5985D7FA1" }, { "b" : "7F569E924000", "path" : "/usr/lib64/sasl2/libdigestmd5.so", "elfType" : 3, "buildId" : "34D8E3E2565DEF4A685D6976831B0372AD456993" }, { "b" : "7F569E71F000", "path" : "/usr/lib64/sasl2/libanonymous.so", "elfType" : 3, "buildId" : "EEAA33A75735D35F4BF25C3C2830B8C90ABDD8B5" }, { "b" : "7F569E517000", "path" : "/usr/lib64/sasl2/libgssapiv2.so", "elfType" : 3, "buildId" : "F7BCE9C6BFF4EAF0CB3142B299CF22D094CE4F04" }, { "b" : "7F569E311000", "path" : "/usr/lib64/sasl2/libcrammd5.so", "elfType" : 3, "buildId" : "4CC7E695963F5C8B772EDFF456DB67F89E58FBD6" }, { "b" : "7F569E10C000", "path" : "/usr/lib64/sasl2/libplain.so", "elfType" : 3, "buildId" : "F8DDC7A3CA1CE5B75719AE0DC821647B609D17B6" } ] }}
      [js_test:config_server_checks] 2017-12-20T19:41:35.810+0000 c20010|  mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f56a8ecefe1]
      [js_test:config_server_checks] 2017-12-20T19:41:35.828+0000 c20010|  mongod(+0x23271F9) [0x7f56a8ece1f9]
      [js_test:config_server_checks] 2017-12-20T19:41:35.832+0000 c20010|  mongod(+0x2327866) [0x7f56a8ece866]
      [js_test:config_server_checks] 2017-12-20T19:41:35.835+0000 c20010|  libpthread.so.0(+0xF7E0) [0x7f56a3f3d7e0]
      [js_test:config_server_checks] 2017-12-20T19:41:35.838+0000 c20010|  libpthread.so.0(pthread_mutex_lock+0x0) [0x7f56a3f37470]
      [js_test:config_server_checks] 2017-12-20T19:41:35.847+0000 c20010|  mongod(_ZNK5mongo4repl14BackgroundSync13getSyncTargetEv+0x48) [0x7f56a7ac8e98]
      [js_test:config_server_checks] 2017-12-20T19:41:35.855+0000 c20010|  mongod(_ZN5mongo4repl18SyncSourceFeedback3runEPNS_8executor12TaskExecutorEPNS0_14BackgroundSyncEPNS0_22ReplicationCoordinatorE+0x4EB) [0x7f56a7a2db2b]
      [js_test:config_server_checks] 2017-12-20T19:41:35.861+0000 c20010|  mongod(+0x2435A20) [0x7f56a8fdca20]
      [js_test:config_server_checks] 2017-12-20T19:41:35.863+0000 c20010|  libpthread.so.0(+0x7AA1) [0x7f56a3f35aa1]
      [js_test:config_server_checks] 2017-12-20T19:41:35.872+0000 c20010|  libc.so.6(clone+0x6D) [0x7f56a3c82bcd]
      [js_test:config_server_checks] 2017-12-20T19:41:35.874+0000 c20010| -----  END BACKTRACE  -----
      [js_test:config_server_checks] 2017-12-20T19:41:35.954+0000 2017-12-20T19:41:35.949+0000 I -        [thread1] shell: stopped mongo program on port 20010
      [js_test:config_server_checks] 2017-12-20T19:41:35.960+0000 2017-12-20T19:41:35.949+0000 E QUERY    [thread1] StopError: MongoDB process stopped with exit code: -11 :
      [js_test:config_server_checks] 2017-12-20T19:41:35.963+0000 StopError: MongoDB process stopped with exit code: -11
      [js_test:config_server_checks] 2017-12-20T19:41:35.964+0000 MongoRunner.StopError@src/mongo/shell/servers.js:800:48
      [js_test:config_server_checks] 2017-12-20T19:41:35.966+0000 MongoRunner.stopMongod@src/mongo/shell/servers.js:888:19
      [js_test:config_server_checks] 2017-12-20T19:41:35.974+0000 ReplSetTest/this.stop@src/mongo/shell/replsettest.js:1922:19
      [js_test:config_server_checks] 2017-12-20T19:41:35.978+0000 ReplSetTest/this.stopSet@src/mongo/shell/replsettest.js:1943:13
      [js_test:config_server_checks] 2017-12-20T19:41:35.985+0000 @jstests/replsets/config_server_checks.js:41:9
      [js_test:config_server_checks] 2017-12-20T19:41:35.988+0000 @jstests/replsets/config_server_checks.js:22:6
      [js_test:config_server_checks] 2017-12-20T19:41:35.991+0000 @jstests/replsets/config_server_checks.js:19:2
      [js_test:config_server_checks] 2017-12-20T19:41:36.015+0000 failed to load: jstests/replsets/config_server_checks.js
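      The mangled frames in the backtrace above are easier to read once demangled. A small sketch, assuming a GCC- or Clang-based toolchain that provides abi::__cxa_demangle in <cxxabi.h>; the demangle() helper name is made up for illustration:

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // __cxa_demangle returns a malloc'ed buffer that the caller must free.
    std::unique_ptr<char, void (*)(void*)> out(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    return (status == 0 && out) ? std::string(out.get()) : std::string(mangled);
}
```

      For example, demangle("_ZNK5mongo4repl14BackgroundSync13getSyncTargetEv") yields mongo::repl::BackgroundSync::getSyncTarget() const, the frame that called pthread_mutex_lock just before the fault.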
      

            Assignee: Siyuan Zhou (siyuan.zhou@mongodb.com)
            Reporter: Samantha Ritter (samantha.ritter@mongodb.com, Inactive)