[SERVER-27701] Race in AsyncResultsMerger error handling may trigger mongos invariant Created: 17/Jan/17  Updated: 07/Sep/17  Resolved: 06/Feb/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.4.1
Fix Version/s: 3.2.14, 3.4.4, 3.5.3

Type: Bug Priority: Major - P3
Reporter: Remi Forest Assignee: David Storch
Resolution: Done Votes: 0
Labels: bkp
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongos.log    
Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-28319 Invariant failure with 1000 of fake s... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.4, v3.2
Steps To Reproduce:

I was not able to reproduce so far.

Sprint: Query 2017-02-13
Participants:
Case:
Linked BF Score: 0

 Description   

We were running a stress test on a distributed sharded cluster (8 shards, 7 replicas, on 5 AWS regions), checking behaviour when killing different parts of the cluster.

We setup 5 application servers (1 per region) with mongos on the same server.

One of the mongos from where we were running the tests crashed after we killed some processes (including some config servers).

The message when crashing was :
Invariant failure remote.status.isOK() src/mongo/s/query/async_results_merger.cpp 345

Attached is the mongos log.

Tests started at 15:47 after sharding the collection "renault"



 Comments   
Comment by Githook User [ 26/May/17 ]

Author:

{u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}

Message: SERVER-27701 fix race in AsyncResultsMerger error handling

(cherry picked from commit 5a35a6c425b815ecf7795d8e2706b9f3e82c7ce6)
Branch: v3.2
https://github.com/mongodb/mongo/commit/40d309be939eea757c09d4895b22c24905aa8b05

Comment by Githook User [ 31/Mar/17 ]

Author:

{u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}

Message: SERVER-27701 fix race in AsyncResultsMerger error handling

(cherry picked from commit 5a35a6c425b815ecf7795d8e2706b9f3e82c7ce6)
Branch: v3.4
https://github.com/mongodb/mongo/commit/0b0943d8cecba04d7e177c818a6429ff852650cf

Comment by Githook User [ 06/Feb/17 ]

Author:

{u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}

Message: SERVER-27701 fix race in AsyncResultsMerger error handling
Branch: master
https://github.com/mongodb/mongo/commit/5a35a6c425b815ecf7795d8e2706b9f3e82c7ce6

Comment by David Storch [ 03/Feb/17 ]

Hi remi.forest,

Thanks for reporting this issue. I've tracked down the root cause. It's a race condition in the AsyncResultsMerger, the component that the mongos read path uses to merge the streams of results from the shards into a single stream to return to the client driver. The AsyncResultsMerger is used by the mongos query path's merge stage, RouterStageMerge:

https://github.com/mongodb/mongo/blob/r3.4.1/src/mongo/s/query/router_stage_merge.cpp#L43-L56

When RouterStageMerge is asked to retrieve another query result, it delegates to the AsyncResultsMerger, first calling ARM::ready(). This method call asks whether either there is either an error or a result document available to return to the client. If neither an error nor a document is available, it calls ARM::nextEvent(), which will schedule commands to run against the shards in order to retrieve more results. However, these two function calls happen in two steps, and each will separately acquire and release the AsyncResultsMerger's mutex. This means that an error response from the shards can be processed after ARM::ready() but before ARM::nextEvent()! The failed invariant incorrectly assumes that it is impossible to have an outstanding error in ARM::nextEvent(), since the caller is expected to have checked for an error using ARM::ready().

I was able to reproduce the invariant by applying the following patch to mongos:

diff --git a/src/mongo/s/query/async_results_merger.cpp b/src/mongo/s/query/async_results_merger.cpp
index d6ea954..77ba867 100644
--- a/src/mongo/s/query/async_results_merger.cpp
+++ b/src/mongo/s/query/async_results_merger.cpp
@@ -720,6 +720,13 @@ std::shared_ptr<Shard> AsyncResultsMerger::RemoteCursorData::getShard() {
     }
 }
 
+void AsyncResultsMerger::injectStatusForShards(Status status) {
+    stdx::lock_guard<stdx::mutex> lk(_mutex);
+    for (auto&& remote : _remotes) {
+        remote.status = status;
+    }
+}
+
 //
 // AsyncResultsMerger::MergingComparator
 //
diff --git a/src/mongo/s/query/async_results_merger.h b/src/mongo/s/query/async_results_merger.h
index f1e6c5d..c088902 100644
--- a/src/mongo/s/query/async_results_merger.h
+++ b/src/mongo/s/query/async_results_merger.h
@@ -177,6 +177,8 @@ public:
      */
     executor::TaskExecutor::EventHandle kill();
 
+    void injectStatusForShards(Status status);
+
 private:
     /**
      * We instantiate one of these per remote host. It contains the buffer of results we've
diff --git a/src/mongo/s/query/router_stage_merge.cpp b/src/mongo/s/query/router_stage_merge.cpp
index e3b8fd2..764cb19 100644
--- a/src/mongo/s/query/router_stage_merge.cpp
+++ b/src/mongo/s/query/router_stage_merge.cpp
@@ -42,6 +42,7 @@ RouterStageMerge::RouterStageMerge(executor::TaskExecutor* executor,
 
 StatusWith<ClusterQueryResult> RouterStageMerge::next() {
     while (!_arm.ready()) {
+        _arm.injectStatusForShards({ErrorCodes::OperationFailed, "mock failure"});
         auto nextEventStatus = _arm.nextEvent();
         if (!nextEventStatus.isOK()) {
             return nextEventStatus.getStatus();

This patch mocks the problem scenario by injecting an error into the ARM in between the calls to ARM::next() and ARM::ready(). It reliably triggers the invariant when I run a find with no options against a sharded collection. This seems pretty unlikely to happen in practice under normal operation, but it is definitely something which our team will work on fixing.

Best,
Dave

Comment by Remi Forest [ 17/Jan/17 ]

Yes, but the same type of error happened at 16:27 and 15:59 but without crash (though on different TaskExecutorPool)

Comment by Daniel Pasette (Inactive) [ 17/Jan/17 ]

Looks like there were some connectivity problems before the assertion failure.

2017-01-10T16:27:39.416+0000 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-3-0] Connecting to ip-172-31-7-13.eu-west-2.compute.internal:28004
2017-01-10T16:27:39.428+0000 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-3-0] Failed to connect to ip-172-31-7-13.eu-west-2.compute.internal:28004 - HostUnreachable: Connection refused
2017-01-10T16:27:39.428+0000 I NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-3-0] Marking host ip-172-31-7-13.eu-west-2.compute.internal:28004 as failed :: caused by :: HostUnreachable: Connection refused
2017-01-10T16:27:39.428+0000 I -        [conn3582] Invariant failure remote.status.isOK() src/mongo/s/query/async_results_merger.cpp 345
2017-01-10T16:27:39.428+0000 I -        [conn3582] 
 
***aborting after invariant() failure
 
 
2017-01-10T16:27:39.437+0000 F -        [conn3582] Got signal: 6 (Aborted).
 
 0x55bdac6cd681 0x55bdac6cc779 0x55bdac6ccc5d 0x7f04c4cc5100 0x7f04c49295f7 0x7f04c492ace8 0x55bdabf0929c 0x55bdac5a8818 0x55bdac5bc433 0x55bdac5bbf11 0x55bdac5b016e 0x55bdac5b8c41 0x55bdac5ba880 0x55bdac52d0a0 0x55bdac5d34df 0x55bdac58be34 0x55bdac58d420 0x55bdac586759 0x55bdabf4d77d 0x55bdac63ccd2 0x7f04c4cbddc5 0x7f04c49eac9d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"55BDABB29000","o":"BA4681","s":"_ZN5mongo15printStackTraceERSo"},{"b":"55BDABB29000","o":"BA3779"},{"b":"55BDABB29000","o":"BA3C5D"},{"b":"7F04C4CB6000","o":"F100"},{"b":"7F04C48F4000","o":"355F7","s":"gsignal"},{"b":"7F04C48F4000","o":"36CE8","s":"abort"},{"b":"55BDABB29000","o":"3E029C","s":"_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j"},{"b":"55BDABB29000","o":"A7F818","s":"_ZN5mongo18AsyncResultsMerger9nextEventEv"},{"b":"55BDABB29000","o":"A93433","s":"_ZN5mongo16RouterStageMerge4nextEv"},{"b":"55BDABB29000","o":"A92F11","s":"_ZN5mongo16RouterStageLimit4nextEv"},{"b":"55BDABB29000","o":"A8716E","s":"_ZN5mongo23ClusterClientCursorImpl4nextEv"},{"b":"55BDABB29000","o":"A8FC41"},{"b":"55BDABB29000","o":"A91880","s":"_ZN5mongo11ClusterFind8runQueryEPNS_16OperationContextERKNS_14CanonicalQueryERKNS_21ReadPreferenceSettingEPSt6vectorINS_7BSONObjESaISA_EEPSA_"},{"b":"55BDABB29000","o":"A040A0"},{"b":"55BDABB29000","o":"AAA4DF","s":"_ZN5mongo7Command17execCommandClientEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderE"},{"b":"55BDABB29000","o":"A62E34"},{"b":"55BDABB29000","o":"A64420","s":"_ZN5mongo8Strategy15clientCommandOpEPNS_16OperationContextERNS_7RequestE"},{"b":"55BDABB29000","o":"A5D759","s":"_ZN5mongo7Request7processEPNS_16OperationContextEi"},{"b":"55BDABB29000","o":"42477D","s":"_ZN5mongo23ServiceEntryPointMongos12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE"},{"b":"55BDABB29000","o":"B13CD2"},{"b":"7F04C4CB6000","o":"7DC5"},{"b":"7F04C48F4000","o":"F6C9D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.1", "gitVersion" : "5e103c4f5583e2566a45d740225dc250baacfbd7", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.35-33.55.amzn1.x86_64", "version" : "#1 SMP Tue Dec 6 20:30:04 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "b" : "55BDABB29000", "elfType" : 3, "buildId" : "5D8C9D117F82131B933E62A27F586DA0D2A676E0" }, { "b" : "7FFDB2FF2000", "elfType" : 3, "buildId" : "38233CA92BDB22C6A3479C6E427C85F466F9FDFB" }, { "b" : "7F04C5BDC000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "6AF827B6FD200DFDFE70B2BC8D66BBC9881E8817" }, { "b" : "7F04C57F6000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "83F15DBCD0653F417E98354BC1EED6F96A758367" }, { "b" : "7F04C55EE000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "2B3151901240D9E854E18E6D0B181C4D580ABA9C" }, { "b" : "7F04C53EA000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "6335077ACD51527BE9F2F18451A88E2B7350C5B6" }, { "b" : "7F04C50E8000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "6E343508D15886FE83C438DF4560CE40BEB64B56" }, { "b" : "7F04C4ED2000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "3FD5F89DE59E124AB1419A0BD16775B4096E84FD" }, { "b" : "7F04C4CB6000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "0836319AA81CDFE97DA2666963F62DE6A2A61346" }, { "b" : "7F04C48F4000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "5D38A77E8D79E98D717281031C39B9A341323BD1" }, { "b" : "7F04C5E49000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "93D931BA041229929E5F099514B20E36A70BD651" }, { "b" : "7F04C46A8000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "E203354E7F907ACC8C3028FE465541B666DCFBA0" }, { "b" : "7F04C43C3000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "D769C8FFAF8772FDA55031ABF2E167DF2207E378" }, { "b" : "7F04C41C0000", "path" : "/usr/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "5C01209C5AE1B1714F19B07EB58F2A1274B69DC8" }, { "b" : "7F04C3F8E000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "6C2243D37143F7FD1E16ED1F6CE4D7F16C2D7EF1" }, { "b" : "7F04C3D78000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "89C6AF118B6B4FB6A73AE1813E2C8BDD722956D1" }, { "b" : "7F04C3B69000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "744272FEAAABCE629AB9E11FAA4A96AEBE8BC2B4" }, { "b" : "7F04C3966000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "37A58210FA50C91E09387765408A92909468D25B" }, { "b" : "7F04C374C000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "1285F9516FFCF13FC00BD135C5634AF2EB16C80B" }, { "b" : "7F04C352B000", "path" : "/usr/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "F5054DC94443326819FBF3065CFDF5E4726F57EE" }, { "b" : "7F04C3318000", "path" : "/lib64/libnss_files.so.2", "elfType" : 3, "buildId" : "C12ADA079EA7897F6DBA51ED9B8EFDC7395A618B" } ] }}
 mongos(_ZN5mongo15printStackTraceERSo+0x41) [0x55bdac6cd681]
 mongos(+0xBA3779) [0x55bdac6cc779]
 mongos(+0xBA3C5D) [0x55bdac6ccc5d]
 libpthread.so.0(+0xF100) [0x7f04c4cc5100]
 libc.so.6(gsignal+0x37) [0x7f04c49295f7]
 libc.so.6(abort+0x148) [0x7f04c492ace8]
 mongos(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0x0) [0x55bdabf0929c]
 mongos(_ZN5mongo18AsyncResultsMerger9nextEventEv+0x158) [0x55bdac5a8818]
 mongos(_ZN5mongo16RouterStageMerge4nextEv+0x53) [0x55bdac5bc433]
 mongos(_ZN5mongo16RouterStageLimit4nextEv+0x71) [0x55bdac5bbf11]
 mongos(_ZN5mongo23ClusterClientCursorImpl4nextEv+0x35E) [0x55bdac5b016e]
 mongos(+0xA8FC41) [0x55bdac5b8c41]
 mongos(_ZN5mongo11ClusterFind8runQueryEPNS_16OperationContextERKNS_14CanonicalQueryERKNS_21ReadPreferenceSettingEPSt6vectorINS_7BSONObjESaISA_EEPSA_+0x5A0) [0x55bdac5ba880]
 mongos(+0xA040A0) [0x55bdac52d0a0]
 mongos(_ZN5mongo7Command17execCommandClientEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderE+0xB5F) [0x55bdac5d34df]
 mongos(+0xA62E34) [0x55bdac58be34]
 mongos(_ZN5mongo8Strategy15clientCommandOpEPNS_16OperationContextERNS_7RequestE+0xA00) [0x55bdac58d420]
 mongos(_ZN5mongo7Request7processEPNS_16OperationContextEi+0x679) [0x55bdac586759]
 mongos(_ZN5mongo23ServiceEntryPointMongos12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE+0x1FD) [0x55bdabf4d77d]
 mongos(+0xB13CD2) [0x55bdac63ccd2]
 libpthread.so.0(+0x7DC5) [0x7f04c4cbddc5]
 libc.so.6(clone+0x6D) [0x7f04c49eac9d]
-----  END BACKTRACE  -----

Generated at Thu Feb 08 04:15:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.