[SERVER-46040] moveChunk in multi_stmt_txn_jscore_passthrough_with_migration can cause failures when mongos retries commands that are not idempotent
Created: 07/Feb/20  Updated: 29/Oct/23  Resolved: 02/Apr/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.4.0-rc2, 4.7.0

Type: Bug
Priority: Major - P3
Reporter: Randolph Tan
Assignee: Blake Oler
Resolution: Fixed
Votes: 0
Labels: sharding-4.4-stabilization
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-50337: Complete TODO listed in SERVER-46040 (Status: Backlog)
related to SERVER-47273: Remove blacklist for fixed test (Status: Closed)
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Sprint: Sharding 2020-03-09, Sharding 2020-03-23, Sharding 2020-04-06
Participants:
Linked BF Score: 9

 Description   

An example scenario with dropIndex on a cluster with 2+ shards:

1. The test sends dropIndex to mongos.
2. dropIndex runs successfully on shard0.
3. dropIndex returns a StaleConfig error on shard1 because of the migration.
4. Mongos retries dropIndex but gets an "index does not exist" error because the index was already dropped on shard0.
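
The four steps above can be sketched as a small simulation. This is a toy model, not server code: the `Shard` class, the exception classes, the `naive_drop_index` helper, and the index name `a_1` are all hypothetical. It also assumes that after the routing refresh the retry targets only shard0, which is one possible way for every targeted shard to report the error on the second attempt.

```python
class IndexNotFound(Exception):
    """Simulated shard error: the index does not exist."""


class StaleConfig(Exception):
    """Simulated shard error: the router's routing info is out of date."""


class Shard:
    """Toy shard holding a set of index names (hypothetical model)."""

    def __init__(self, name, indexes, stale=False):
        self.name = name
        self.indexes = set(indexes)
        self.stale = stale

    def drop_index(self, index):
        if self.stale:
            raise StaleConfig(self.name)
        if index not in self.indexes:
            raise IndexNotFound(index)
        self.indexes.remove(index)


def naive_drop_index(routing_per_attempt, index):
    """Scatter the drop to each attempt's targets, keeping no state between attempts."""
    for shards in routing_per_attempt:
        stale, not_found = False, 0
        for shard in shards:
            try:
                shard.drop_index(index)
            except StaleConfig:
                stale = True
            except IndexNotFound:
                not_found += 1
        if stale:
            continue  # refresh routing and retry the whole command
        # IndexNotFound is ignorable unless every targeted shard reports it.
        if not_found == len(shards):
            raise IndexNotFound(index)
        return "ok"
    raise StaleConfig("retries exhausted")


def run_scenario():
    shard0 = Shard("shard0", {"a_1"})
    shard1 = Shard("shard1", {"a_1"}, stale=True)
    try:
        # Attempt 1 targets both shards; attempt 2 (assumed) targets only shard0.
        return naive_drop_index([[shard0, shard1], [shard0]], "a_1")
    except IndexNotFound:
        return "IndexNotFound"
```

In this model `run_scenario()` ends with the error even though shard0 dropped the index on the first attempt, because the retry starts from scratch.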



 Comments   
Comment by Githook User [ 13/Apr/20 ]

Author: Blake Oler <blake.oler@mongodb.com> (BlakeIsBlake)

Message: SERVER-46040 Maintain retry state across stale config retries for sharded drop indexes

(cherry picked from commit 5403488f656db357ce123f78cf25aa63a9e5aff8)
Branch: v4.4
https://github.com/mongodb/mongo/commit/f7f5c9868178b8b8b8a930446beca446d2b3407d

Comment by Githook User [ 02/Apr/20 ]

Author: Blake Oler <blake.oler@mongodb.com> (BlakeIsBlake)

Message: SERVER-46040 Maintain retry state across stale config retries for sharded drop indexes
Branch: master
https://github.com/mongodb/mongo/commit/5403488f656db357ce123f78cf25aa63a9e5aff8

Comment by Jack Mulrow [ 10/Mar/20 ]

dropIndexes on mongos already considers IndexNotFound an ignorable error when aggregating shard responses. The problem here is that the helper dropIndexes uses will not ignore an ignorable error if every shard reports it, as happens in this case after the StaleConfig retry.

One way to fix this could be to make mongos dropIndexes somehow "remember" if a shard successfully dropped the index across StaleConfig retries, so it knows it's safe to ignore IndexNotFound even if every shard returns it on a later attempt.
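
A minimal sketch of that "remember" idea, under stated assumptions: `make_shard`, `drop_index_remembering`, the exception classes, and the index name `a_1` are hypothetical stand-ins, and the per-attempt target lists are supplied by hand rather than derived from real routing. It is not the code from the actual commit.

```python
class IndexNotFound(Exception):
    """Simulated shard error: the index does not exist."""


class StaleConfig(Exception):
    """Simulated shard error: the router's routing info is out of date."""


def make_shard(indexes, stale=False):
    """Return a hypothetical drop function backed by a mutable set of index names."""
    def drop(index):
        if stale:
            raise StaleConfig()
        if index not in indexes:
            raise IndexNotFound(index)
        indexes.remove(index)
    return drop


def drop_index_remembering(routing_per_attempt, index):
    """Retry on StaleConfig, remembering whether any shard already dropped the index."""
    any_shard_dropped = False  # retry state that survives StaleConfig retries
    for shards in routing_per_attempt:
        stale, not_found = False, 0
        for drop in shards:
            try:
                drop(index)
                any_shard_dropped = True
            except StaleConfig:
                stale = True
            except IndexNotFound:
                not_found += 1
        if stale:
            continue  # refresh routing and retry, keeping any_shard_dropped
        # Fail only if every shard reported IndexNotFound and no earlier
        # attempt succeeded anywhere.
        if not_found == len(shards) and not any_shard_dropped:
            raise IndexNotFound(index)
        return "ok"
    raise StaleConfig("retries exhausted")


shard0 = make_shard({"a_1"})
shard1 = make_shard({"a_1"}, stale=True)
# Attempt 1 targets both shards; attempt 2 (assumed) targets only shard0.
result = drop_index_remembering([[shard0, shard1], [shard0]], "a_1")
```

The flag lives outside the retry loop, so an IndexNotFound from every shard on a later attempt is ignored only when an earlier attempt is known to have dropped the index. A drop of an index that never existed still fails.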

Generated at Thu Feb 08 05:10:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.