[SERVER-47169] Sharding initialization contacts config shard before ShardRegistry updated by RSM, preventing mongos from starting up Created: 28/Mar/20  Updated: 29/Oct/23  Resolved: 01/Apr/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.4.0-rc0, 4.7.0

Type: Bug Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Haley Connelly
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
is caused by SERVER-47029 Fix race when streamable RSM updates ... Closed
Related
is related to SERVER-50997 Make ShardRegistry::updateReplSetHost... Closed
is related to SERVER-43985 Make mongos pre-cache the routing tab... Closed
is related to SERVER-44152 Pre-warm connection pools in mongos Closed
is related to SERVER-39818 Split RSM notification functionality ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Steps To Reproduce:

Apply the following patch to have mongos delay updating the ShardRegistry after the listener is notified about a confirmed replica set.

python buildscripts/resmoke.py --suite=sharding repro_mongos_fails_during_sharding_initialization.js

diff --git a/repro_mongos_fails_during_sharding_initialization.js b/repro_mongos_fails_during_sharding_initialization.js
new file mode 100644
index 0000000000..4c2429b244
--- /dev/null
+++ b/repro_mongos_fails_during_sharding_initialization.js
@@ -0,0 +1,14 @@
+(function() {
+'use strict';
+
+var st = new ShardingTest({config: 3, mongos: 1, shards: 0});
+
+// XXX: Even with the artificial delay in ShardingReplicaSetChangeListener::onConfirmedSet(), the
+// issue only seems to manifest in mongos about half the time. We restart the mongos process a few
+// times to make the failure more apparent.
+for (let i = 0; i < 10; ++i) {
+    st.restartMongos(0);
+}
+
+st.stop();
+}());
diff --git a/src/mongo/s/server.cpp b/src/mongo/s/server.cpp
index 34bb98f4d9..39ff0c436f 100644
--- a/src/mongo/s/server.cpp
+++ b/src/mongo/s/server.cpp
@@ -491,6 +491,12 @@ public:
             invariant(args.status);
 
             try {
+                LOGV2(2284600,
+                      "Sleeping before updating sharding state with confirmed set {connStr}",
+                      "connStr"_attr = connStr);
+
+                sleepmillis(10000);
+
                 LOGV2(22846,
                       "Updating sharding state with confirmed set {connStr}",
                       "connStr"_attr = connStr);

Sprint: Sharding 2020-04-06
Participants:
Linked BF Score: 27

Description

The ShardingNetworkConnectionHook causes a ShardNotFound error status to be returned if the HostAndPort isn't found in the ShardRegistry. This hook is run after a connection to the remote host has been established.

Status ShardingNetworkConnectionHook::validateHostImpl(
    const HostAndPort& remoteHost, const executor::RemoteCommandResponse& isMasterReply) {
    auto shard =
        Grid::get(getGlobalServiceContext())->shardRegistry()->getShardForHostNoReload(remoteHost);
    if (!shard) {
        return {ErrorCodes::ShardNotFound,
                str::stream() << "No shard found for host: " << remoteHost.toString()};
    }
 
    ...
}

The connection string for the config shard may be updated while the sharding subsystem is initializing. (For reasons I still don't quite understand, this doesn't happen every time mongos is started, but I believe it is a necessary condition for the issue reported here to manifest.) Updating the connection string upon receiving isMaster responses from secondaries of the config shard (while the primary is still seen by the RSM as "Unknown") removes the HostAndPort for the primary from ShardRegistry::_hostLookup.

Re-adding the HostAndPort for the primary to ShardRegistry::_hostLookup happens in ShardingReplicaSetChangeListener::onConfirmedSet() by scheduling a task on the fixed executor. Because the ShardRegistry::_hostLookup map isn't updated synchronously, the RSM can view the now-confirmed primary as available for targeting primary-only reads while the post-connection validate hook still fails with ShardNotFound. This leaves mongos unable to start up successfully.
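To make the window concrete, below is a minimal, self-contained C++ model of the race. The names (hostLookup, validateHost, the cfgN host strings) are illustrative stand-ins, not the actual server code: the update of the host map runs on a separate thread, as if scheduled on the fixed executor, while the connection hook reads the map as soon as the RSM can resolve the confirmed primary.

// Simplified, hypothetical model of the race (not actual MongoDB code).
#include <chrono>
#include <iostream>
#include <map>
#include <mutex>
#include <string>
#include <thread>

std::mutex registryMutex;
std::map<std::string, std::string> hostLookup;  // stands in for ShardRegistry::_hostLookup

// Stand-in for ShardingNetworkConnectionHook::validateHostImpl().
bool validateHost(const std::string& host) {
    std::lock_guard<std::mutex> lk(registryMutex);
    return hostLookup.count(host) > 0;  // false models the ShardNotFound error
}

int main() {
    // The registry was updated from secondaries' isMaster replies while the
    // primary was still "Unknown", so the primary is missing from the map.
    {
        std::lock_guard<std::mutex> lk(registryMutex);
        hostLookup = {{"cfg2:27019", "config"}, {"cfg3:27019", "config"}};
    }

    // onConfirmedSet(): the update that re-adds the primary runs asynchronously,
    // as if scheduled on the fixed executor.
    std::thread scheduledUpdate([] {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));  // executor latency
        std::lock_guard<std::mutex> lk(registryMutex);
        hostLookup.emplace("cfg1:27019", "config");
    });

    // Meanwhile the RSM already resolves the confirmed primary for a
    // primary-only read, and the post-connection hook runs first.
    std::cout << "validateHost(cfg1:27019): "
              << (validateHost("cfg1:27019") ? "ok" : "ShardNotFound") << "\n";

    scheduledUpdate.join();
    std::cout << "after scheduled update: "
              << (validateHost("cfg1:27019") ? "ok" : "ShardNotFound") << "\n";
    return 0;
}

Run as written, the first lookup reports ShardNotFound even though the primary is confirmed; after the scheduled update completes, the same lookup succeeds.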



Comments
Comment by Githook User [ 02/Apr/20 ]

Author:

{'name': 'Haley Connelly', 'email': 'haley.connelly@mongodb.com', 'username': 'haleyConnelly'}

Message: SERVER-47169 Call ShardRegistry::updateReplSetHosts() synchronously

(cherry picked from commit 08351c1b12f3ca5c9ab99b6628e27d2083278011)
Branch: v4.4
https://github.com/mongodb/mongo/commit/ed3a75b6f792dc8f93540622305caa3ea35ad389

Comment by Githook User [ 01/Apr/20 ]

Author:

{'name': 'Haley Connelly', 'email': 'haley.connelly@mongodb.com', 'username': 'haleyConnelly'}

Message: SERVER-47169 Call ShardRegistry::updateReplSetHosts() synchronously
Branch: master
https://github.com/mongodb/mongo/commit/08351c1b12f3ca5c9ab99b6628e27d2083278011

Comment by Max Hirschhorn [ 30/Mar/20 ]

Lamont, Janna, Haley, and I discussed this issue today. The current plan is to change ShardingReplicaSetChangeListener::onConfirmedSet() in both mongod and mongos so that ShardRegistry::updateReplSetHosts() is called synchronously, i.e. outside the task being scheduled on the fixed executor. (We'll still want to schedule a separate task for updating the contents of the config database.) Doing so would ensure that if getHostOrRefresh() resolves to a HostAndPort, ShardingNetworkConnectionHook::validateHostImpl() won't fail with ShardNotFound after connecting to it.
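A minimal sketch of what that shape could look like, using stand-in types rather than the real server classes (ConnectionString, scheduleOnFixedExecutor, and the logging below are simplified assumptions, not the actual commit): the in-memory registry update happens inline in onConfirmedSet(), and only the config database update is deferred.

// Hypothetical sketch of the proposed onConfirmedSet() ordering (not actual MongoDB code).
#include <chrono>
#include <functional>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

struct ConnectionString {
    std::string setName;
    std::vector<std::string> hosts;
};

// Stand-in for ShardRegistry::updateReplSetHosts().
void updateReplSetHosts(const ConnectionString& connStr) {
    std::cout << "registry updated synchronously for " << connStr.setName << "\n";
}

// Stand-in for scheduling a task on the fixed executor.
void scheduleOnFixedExecutor(std::function<void()> task) {
    std::thread(std::move(task)).detach();
}

void onConfirmedSet(const ConnectionString& connStr) {
    // 1. Update the in-memory registry before returning, so any host the RSM
    //    can resolve is already present when the connection hook runs.
    updateReplSetHosts(connStr);

    // 2. Persisting the new connection string to the config database can still
    //    happen asynchronously on the fixed executor.
    scheduleOnFixedExecutor([connStr] {
        std::cout << "config database updated for " << connStr.setName << "\n";
    });
}

int main() {
    onConfirmedSet({"config", {"cfg1:27019", "cfg2:27019", "cfg3:27019"}});
    // Give the detached task a moment to finish in this toy example.
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return 0;
}

With this ordering, any host the RSM can hand out for a primary-only read is already present in the registry by the time the post-connection hook runs.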

Comment by Max Hirschhorn [ 28/Mar/20 ]

Based on some of the additional context I added to these error statuses, it appears that pre-caching the routing table in mongos on startup (SERVER-43985) and pre-warming connection pools in mongos on startup (SERVER-44152) are responsible for the behavioral change whereby sharding initialization in mongos contacts the config shard before the ShardRegistry has been updated by the replica set monitor.

I've tentatively marked this as an RC0 blocker because I feel it is something the automation team is likely to run into.
