[SERVER-67653] Resharding coordinator can incorrectly conclude that it can start the critical section although on one recipient the oplog applier hasn't caught up with the oplog fetcher Created: 29/Jun/22  Updated: 29/Oct/23  Resolved: 24/Aug/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.13, 6.0.2, 6.1.0-rc3, 6.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Cheahuychou Mao Assignee: Andrew Witten (Inactive)
Resolution: Fixed Votes: 0
Labels: sharding-nyc-subteam2
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-67650 Resharding recipient can return remai... Closed
is related to SERVER-68783 Recipient shard may incorrectly retur... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.1, v6.0, v5.0
Participants:
Story Points: 3

 Description   

The resharding coordinator queries all recipients for an estimation of the remaining time for the active resharding operation of participant shards in CoordinatorCommitMonitor::queryRemainingOperationTimeForRecipients using command _shardsvrReshardingOperationTime.

 

The recipient shards handle that command here.  The function ReshardingMetrics::getOperationRemainingTime could possibly return boost::none.  In that case, the "recipientMillis" field of the recipient's response to the coordinator will be omitted.  

 

If all participants were to omit this field, then these if statements won't be entered and when the participant reads the max remaining time (here) remainingTimes.max would still be 0 and remainingTimes.min would still be Milliseconds::max (as they were initialized).  In particular, the invariant mentioned here would fail.

 

The effect of this, is that a recipient returning an empty "remainingMillis" field is equivalent to it returning "remainingMillis: 0."  This is a bug in at least one case: where _shardsvrReshardingOperationTime is run against a recipient shard before the recipient shard has restored its metrics (during a step up).

 

As a result, the coordinator would, believing that recipientMillis was under the threshold for all recipients, prematurely begin the critical section, and the resharding operation would fail with ReshardingCriticalSectionTimeout if the recipient above doesn't manage to enter the "strict-consistency" state within the timeout. 



 Comments   
Comment by Githook User [ 20/Sep/22 ]

Author:

{'name': 'Andrew Witten', 'email': 'andrew.witten@mongodb.com', 'username': 'awitten1'}

Message: SERVER-67653 don't initiate critical section if remainingMillis is omitted

(cherry picked from commit abd6330d793235c8fbc5de7bf3ec53855ebea9d3)

SERVER-69693 use lambda instead of repeating code

(cherry picked from commit dd9d37fa07a7af46ac4ee8dabeb235dc9571afb0)
Branch: v6.1
https://github.com/mongodb/mongo/commit/d5004b929a08ee46f9803b62a8f1239e3cc3f9bd

Comment by Githook User [ 14/Sep/22 ]

Author:

{'name': 'Andrew Witten', 'email': 'andrew.witten@mongodb.com', 'username': 'awitten1'}

Message: SERVER-67653 don't initiate critical section if remainingMillis is omitted

(cherry picked from commit abd6330d793235c8fbc5de7bf3ec53855ebea9d3)

SERVER-69693 Use lambda instead of repeating code in resharding coordinator
Branch: v6.0
https://github.com/mongodb/mongo/commit/1db7b70c4ff5fd88fd4922350c65b09805e19710

Comment by Githook User [ 14/Sep/22 ]

Author:

{'name': 'Andrew Witten', 'email': 'andrew.witten@mongodb.com', 'username': 'awitten1'}

Message: SERVER-67653 don't initiate critical section if remainingMillis is omitted

(cherry picked from commit abd6330d793235c8fbc5de7bf3ec53855ebea9d3)

SERVER-69693 Use lambda instead of repeating code in resharding coordinator
Branch: v5.0
https://github.com/mongodb/mongo/commit/2eebadf82faa58ac5aebc9e8238c7b299e316a7d

Comment by Max Hirschhorn [ 01/Sep/22 ]

Author:

{'name': 'Andrew Witten', 'email': 'andrew.witten@mongodb.com', 'username': 'awitten1'}

Message: SERVER-67653 don't initiate critical section if remainingMillis is omitted
Branch: master
https://github.com/mongodb/mongo/commit/abd6330d793235c8fbc5de7bf3ec53855ebea9d3

Comment by Andrew Witten (Inactive) [ 10/Aug/22 ]

I am noticing this however, when I add the following invariant:

 

diff --git a/src/mongo/db/s/resharding/resharding_coordinator_commit_monitor.cpp b/src/mongo/db/s/resharding/resharding_coordinator_commit_monitor.cpp
index 374dcd6538f..af66228ccba 100644
--- a/src/mongo/db/s/resharding/resharding_coordinator_commit_monitor.cpp
+++ b/src/mongo/db/s/resharding/resharding_coordinator_commit_monitor.cpp
@@ -206,6 +206,8 @@ ExecutorFuture<void> CoordinatorCommitMonitor::_makeFuture() const {
             return RemainingOperationTimes{Milliseconds(0), Milliseconds::max()};
         })
         .then([this, anchor = shared_from_this()](RemainingOperationTimes remainingTimes) {
+            invariant(remainingTimes.min <= remainingTimes.max);
+
             auto metrics = ReshardingMetrics::get(cc().getServiceContext());
             metrics->setMinRemainingOperationTime(remainingTimes.min);
             metrics->setMaxRemainingOperationTime(remainingTimes.max); 

We do hit this invariant, meaning there are cases where all recipients are hitting this return when handling command "_shardsvrReshardingOperationTime."  In the case where we would hit this invariant the coordinator would be misled into thinking we can begin the critical section (that's because remainingTime.max is initialized here to 0, and if remainingTime.max < remainingTime.min is true, then remainingTime.max hasn't been assigned to since it was initialized).  I'm wondering if SERVER-67650 fixes this issue.

 

Comment by Andrew Witten (Inactive) [ 09/Aug/22 ]

One thing about this confuses me:  when I look at the code for handling the command "_shardsvrReshardingOperationTime" I see this which then calls this and computes "remainingMillis" with a direct call remainingOperationTime directly, which never returns -1.

Generated at Thu Feb 08 06:08:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.