[SERVER-66088] Make resharding_disallow_writes.js robust to _shardsvrDropIndexes command retrying dropIndexes after resharding completes
Created: 29/Apr/22  Updated: 29/Oct/23  Resolved: 03/May/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 6.0.0-rc5, 6.1.0-rc0

Type: Task
Priority: Major - P3
Reporter: Max Hirschhorn
Assignee: Max Hirschhorn
Resolution: Fixed
Votes: 0
Labels: sharding-nyc-subteam1, sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Backport Requested: v6.0
Sprint: Sharding NYC 2022-05-16
Participants:
Linked BF Score: 38
Story Points: 1

 Description   

The combination of the changes from 622b08b as part of SERVER-64464 and the changes from 77ffcb1 as part of SERVER-6491 made it possible for the dropIndexes command to time out due to maxTimeMS expiry on mongos and for the _shardsvrDropIndexes command to continue running on the primary shard. This ultimately leads to the dropIndexes command running on the recipient shard twice. The second execution will fail with IndexNotFound because the index has already been dropped.

[js_test:resharding_disallow_writes] d20272| 2022-04-26T06:57:08.901+01:00 I  COMMAND  51806   [conn53] "CMD: dropIndexes","attr":{"namespace":"test.foo","uuid":{"uuid":{"$uuid":"03b810c5-273a-4129-bb7e-6deef1411c6a"}},"indexes":"{ oldKey: 1.0 }"}
[js_test:resharding_disallow_writes] d20272| 2022-04-26T06:57:08.901+01:00 I  STORAGE  22206   [conn53] "Deferring table drop for index","attr":{"index":"oldKey_1","namespace":"test.foo","uuid":{"uuid":{"$uuid":"03b810c5-273a-4129-bb7e-6deef1411c6a"}},"ident":"index-52-8596053642237589745","commitTimestamp":{"$timestamp":{"t":1650952628,"i":75}}}
[js_test:resharding_disallow_writes] d20273| 2022-04-26T06:57:08.902+01:00 I  COMMAND  20344   [ReplWriterWorker-1] "CMD: dropIndexes","attr":{"namespace":"test.foo","indexes":"\"oldKey_1\""}
...
[js_test:resharding_disallow_writes] d20272| 2022-04-26T06:57:09.382+01:00 I  COMMAND  51806   [conn53] "CMD: dropIndexes","attr":{"namespace":"test.foo","uuid":{"uuid":{"$uuid":"03b810c5-273a-4129-bb7e-6deef1411c6a"}},"indexes":"{ oldKey: 1.0 }"}

https://evergreen.mongodb.com/lobster/build/134933b7e47bc91e2ab6481453bb0d2c/test/6267897c904130343b24fef1#bookmarks=0%2C4871%2C4981%2C9164&f~=000~cmd%3A%20dropIndexes&l=1

We should change the resharding_disallow_writes.js test so it drops a different index after the resharding operation completes to avoid the IndexNotFound error.
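
The effect of the proposed change can be illustrated with a toy in-memory model. This is only a sketch and not the actual test code: `ToyShard`, `simulate`, and the index name `hypotheticalOther_1` are made up for illustration, and the model assumes the late retry reaches the shard before the test's own post-resharding drop. Only `oldKey_1` and the IndexNotFound error come from the logs above.

```javascript
// Toy in-memory model of the race; it does not talk to a real cluster.
class ToyShard {
  constructor(indexNames) {
    this.indexes = new Set(indexNames);
  }

  // Dropping an index that no longer exists fails with IndexNotFound,
  // matching the failure described in this ticket.
  dropIndex(name) {
    if (!this.indexes.delete(name)) {
      return { ok: 0, codeName: "IndexNotFound" };
    }
    return { ok: 1 };
  }
}

// lateRetryIndex: index targeted by the drop issued while writes were
//   disallowed, whose retry only lands after resharding completes.
// finalIndex: index the test drops after resharding completes.
function simulate(lateRetryIndex, finalIndex) {
  // "hypotheticalOther_1" is an assumed name for the second index.
  const shard = new ToyShard(["oldKey_1", "hypotheticalOther_1"]);
  const lateRetry = shard.dropIndex(lateRetryIndex);
  const finalDrop = shard.dropIndex(finalIndex);
  return { lateRetry, finalDrop };
}

// Both drops target the same index: the second one to arrive fails.
const shared = simulate("oldKey_1", "oldKey_1");
// Each drop targets its own index: a late retry cannot affect the other.
const separate = simulate("oldKey_1", "hypotheticalOther_1");

console.log(shared.finalDrop.codeName); // IndexNotFound
console.log(separate.finalDrop.ok); // 1
```

In this model, using a separate index for each drop means a late retry of the first drop can no longer make the second drop fail.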



 Comments   
Comment by Githook User [ 03/May/22 ]

Author: Max Hirschhorn <max.hirschhorn@mongodb.com> (username: visemet)

Message: SERVER-66088 Create separate indexes for resharding test to drop.

Due to a StaleConfig exception, the _shardsvrDropIndexes command may
automatically retry the dropIndexes command after the resharding
operation has succeeded and lead to a spurious IndexNotFound error.
While the _shardsvrDropIndexes and dropIndexes shard commands from the
resharding_disallow_writes.js test have a maxTimeMSOpOnly attached, on
platforms with less precise clocks (namely Windows), there may still be
sufficient time for a retry to go through from the primary shard even
after the initial dropIndexes request had timed out on mongos.

(cherry picked from commit 73938f0e9a74b4e45b7b134473ad8e3b36322a66)
Branch: v6.0
https://github.com/mongodb/mongo/commit/fbb55736291c98f936ed01549486e7cfec1009da

Comment by Githook User [ 03/May/22 ]

Author: Max Hirschhorn <max.hirschhorn@mongodb.com> (username: visemet)

Message: SERVER-66088 Create separate indexes for resharding test to drop.

Due to a StaleConfig exception, the _shardsvrDropIndexes command may
automatically retry the dropIndexes command after the resharding
operation has succeeded and lead to a spurious IndexNotFound error.
While the _shardsvrDropIndexes and dropIndexes shard commands from the
resharding_disallow_writes.js test have a maxTimeMSOpOnly attached, on
platforms with less precise clocks (namely Windows), there may still be
sufficient time for a retry to go through from the primary shard even
after the initial dropIndexes request had timed out on mongos.
Branch: master
https://github.com/mongodb/mongo/commit/73938f0e9a74b4e45b7b134473ad8e3b36322a66
