[SERVER-48699] MaxTimeMS may expire in range_deleter_interacts_correctly_with_refine_shard_key.js test before _configsvrMoveChunk command started
Created: 10/Jun/20  Updated: 29/Oct/23  Resolved: 30/Jun/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.7.0, 4.4.2

Type: Bug
Priority: Major - P3
Reporter: Max Hirschhorn
Assignee: Marcos José Grillo Ramirez
Resolution: Fixed
Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-48153 Chunk migration can still be running ... Backlog
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4
Sprint: Sharding 2020-06-29
Participants:
Linked BF Score: 11

 Description   

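The range_deleter_interacts_correctly_with_refine_shard_key.js test tries to start a chunk migration without waiting for it to complete by setting maxTimeMS for the moveChunk command to 1 second. On a slow machine, that 1 second can expire before the _configsvrMoveChunk command has started, in which case the migration the test depends on never begins.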

Using a higher maxTimeMS would make the test take longer, but it would be more robust to other parts of the moveChunk command taking longer. We could also consider introducing a failpoint that lowers the maxTimeMS only immediately before configsvr_client::moveChunk() is called:

diff --git a/src/mongo/s/commands/cluster_move_chunk_cmd.cpp b/src/mongo/s/commands/cluster_move_chunk_cmd.cpp
index 398d8fc49c..da899c2550 100644
--- a/src/mongo/s/commands/cluster_move_chunk_cmd.cpp
+++ b/src/mongo/s/commands/cluster_move_chunk_cmd.cpp
@@ -46,11 +46,14 @@
 #include "mongo/s/config_server_client.h"
 #include "mongo/s/grid.h"
 #include "mongo/s/request_types/migration_secondary_throttle_options.h"
+#include "mongo/util/fail_point.h"
 #include "mongo/util/timer.h"
 
 namespace mongo {
 namespace {
 
+MONGO_FAIL_POINT_DEFINE(startMoveChunkWithoutWaiting);
+
 class MoveChunkCmd : public ErrmsgCommandDeprecated {
 public:
     MoveChunkCmd() : ErrmsgCommandDeprecated("moveChunk", "movechunk") {}
@@ -190,6 +193,9 @@ public:
         chunkType.setShard(chunk->getShardId());
         chunkType.setVersion(cm->getVersion());
 
+        if (MONGO_unlikely(startMoveChunkWithoutWaiting.shouldFail())) {
+            opCtx->setDeadlineAfterNowBy(Microseconds(1), ErrorCodes::MaxTimeMSExpired);
+        }
         uassertStatusOK(configsvr_client::moveChunk(opCtx,
                                                     chunkType,
                                                     to->getId(),
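
The difference between the two approaches can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the type and function names below are hypothetical and are not MongoDB APIs, the migration is assumed to always outlast the deadline, and the failpoint variant assumes the request is sent before the shortened deadline is checked.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::milliseconds;

// Hypothetical model of one moveChunk attempt; none of these names are MongoDB APIs.
struct Outcome {
    bool dispatched;  // true if the request to the config server was sent
    bool timedOut;    // true if the caller saw a timeout error
};

// Deadline armed when the command starts: setup work consumes the budget,
// so a slow host can time out before the request is ever sent.
// Assumption: the migration itself always outlasts the deadline.
inline Outcome deadlineAtCommandStart(milliseconds setupCost, milliseconds maxTime) {
    const bool expiredDuringSetup = setupCost >= maxTime;
    return Outcome{!expiredDuringSetup, true};
}

// Deadline armed immediately before dispatch (the failpoint approach):
// setup cost no longer matters.
// Assumption: the request is sent before the new deadline is checked.
inline Outcome deadlineBeforeDispatch(milliseconds /*setupCost*/) {
    return Outcome{true, true};
}
```

In this model, a 1500 ms setup with a 1000 ms maxTimeMS times out without dispatching, which is the failure described in this ticket. The failpoint-style variant still returns a timeout to the caller, but only after the request has been dispatched.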



 Comments   
Comment by Githook User [ 19/Oct/20 ]

Author: Marcos José Grillo Ramírez <marcos.grillo@mongodb.com> (m4nti5)

Message: SERVER-48699 Changed refine key test to prevent premature timeouts on slow machines, fixed second test with FCV 4.2
Branch: v4.4
https://github.com/mongodb/mongo/commit/91d24dd5053410c603c943b5765ebb2f1ba8ad4c

Comment by Githook User [ 04/Aug/20 ]

Author: Marcos José Grillo Ramírez <marcos.grillo@mongodb.com> (m4nti5)

Message: SERVER-48699 Changed refine key test to prevent premature timeouts on slow machines

(cherry picked from commit 54cc38653d534ae51c6153e04d52ee7c58aa5b6e)
Branch: v4.4
https://github.com/mongodb/mongo/commit/d6bee54ac30c8c8a7f4cb3ceac56d092c5dd37d9

Comment by Githook User [ 30/Jun/20 ]

Author: Marcos José Grillo Ramírez <marcos.grillo@mongodb.com> (m4nti5)

Message: SERVER-48699 Changed refine key test to prevent premature timeouts on slow machines
Branch: master
https://github.com/mongodb/mongo/commit/54cc38653d534ae51c6153e04d52ee7c58aa5b6e

Generated at Thu Feb 08 05:17:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.