[SERVER-20361] Improve the behaviour of multi-update/delete against a sharded collection
Created: 10/Sep/15
Updated: 12/Dec/23

Status: Backlog
Project: Core Server
Component/s: Sharding, Write Ops
Affects Version/s: 2.6.11, 3.0.6, 3.1.7
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Spencer Brody (Inactive)
Assignee: Backlog - Cluster Scalability
Resolution: Unresolved
Votes: 0
Labels: sharding-causes-bfs-hard
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-22771 multiUpdate and multiRemove ops from ... Closed
Related
related to SERVER-22203 remove shardVersion information and u... Closed
Assigned Teams:
Cluster Scalability
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 9 (09/18/15), Sharding 10 (02/19/16)
Participants:
Case:
Linked BF Score: 19

 Description   

Multi-updates/deletes against a sharded collection are sent with shard version IGNORED to every shard that the router believes owns chunks from that collection. Because of this, the following anomalies can happen:

  • The update/delete is not applied against all documents in the collection (when the router doesn't know that a chunk was donated to another shard that is not currently included in its cache)
  • The update/delete is applied multiple times on different shards for the same document (because the lack of versioning prevents them from synchronizing with chunk migration, in which case they can be applied both on the donor and the recipient)
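
The first anomaly can be illustrated with a toy model. This is a minimal sketch, not MongoDB code: the `Shard` class, `route_multi_update`, and the shard names are all hypothetical, and the only behaviour modelled is that a write sent with IGNORED skips the version check, so a stale router cache goes unnoticed.

```python
# Toy model only (hypothetical names throughout, not the server's API).
IGNORED = "IGNORED"


class Shard:
    def __init__(self, name, docs):
        self.name = name
        self.docs = docs

    def multi_update(self, shard_version, field, value):
        # Assumption: with IGNORED the shard performs no version check, so it
        # never tells the router that its routing table is out of date.
        for doc in self.docs:
            doc[field] = value
        return len(self.docs)


def route_multi_update(cached_owners, shards_by_name, field, value):
    """Send the write to every shard listed in the (possibly stale) cache."""
    return sum(
        shards_by_name[name].multi_update(IGNORED, field, value)
        for name in cached_owners
    )


def demo_missed_documents():
    shard_a = Shard("A", [{"_id": 1, "flag": False}])
    # A chunk holding _id 2 was donated to shard C after the last cache refresh.
    shard_c = Shard("C", [{"_id": 2, "flag": False}])
    shards = {"A": shard_a, "C": shard_c}
    stale_cache = ["A"]  # the router has not yet learned about shard C
    updated = route_multi_update(stale_cache, shards, "flag", True)
    return updated, shard_a.docs, shard_c.docs
```

In this sketch `demo_missed_documents()` reports one updated document, and the document on shard C keeps `flag` set to `False` even though it matches the multi-update.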

This ticket is intended to serve as the generic catch-all for these issues.



 Comments   
Comment by Esha Maharishi (Inactive) [ 08/Mar/16 ]

After the commit in SERVER-22203, on multiUpdates and multiRemoves only the OperationShardVersion is used in checkShardVersion, and the OperationShardVersion is no longer set to IGNORED by the mongod.

Instead, the mongos sends an IGNORED shard version if targeting multiple shards for a multiUpdate or multiRemove without an equality match on the shard key.
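
The rule above can be written as a small decision function. This is a sketch under assumptions: `shard_version_to_send` and its parameters are hypothetical names, and the versioned branch (sending the cached version in every other case) is inferred from the rest of this comment rather than taken from the mongos source.

```python
# Hypothetical helper illustrating the targeting rule; not mongos code.
IGNORED = "IGNORED"


def shard_version_to_send(targeted_shards, has_shard_key_equality, cached_versions):
    """Return a {shard: version} map for a multiUpdate or multiRemove."""
    if len(targeted_shards) > 1 and not has_shard_key_equality:
        # Multiple shards and no equality match on the shard key: unversioned.
        return {shard: IGNORED for shard in targeted_shards}
    # Assumption: otherwise the router sends the versions from its cache.
    return {shard: cached_versions[shard] for shard in targeted_shards}
```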

Therefore, this issue remains for non-idempotent multiUpdates with an equality match on the shard key:

  • mongos targets multiple shards for a non-idempotent multiUpdate with equality match on shard key
  • some chunks have moved, so StaleShardVersion is returned to mongos by some shard
  • mongos re-targets and re-sends the request, causing the update to be applied again on the re-targeted shards
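
The three steps above can be sketched as a toy simulation. Everything here is hypothetical (the `Shard` class, integer versions, and the retry loop in `router_multi_inc`); it only models the sequence described in this comment, with a counter increment standing in for a non-idempotent update.

```python
# Toy model of the retry sequence; not MongoDB code.
class StaleShardVersion(Exception):
    pass


class Shard:
    def __init__(self, name, version, docs):
        self.name = name
        self.version = version
        self.docs = docs

    def multi_inc(self, sent_version, field):
        # Versioned write: reject it if the router's view is out of date.
        if sent_version != self.version:
            raise StaleShardVersion(self.name)
        for doc in self.docs:
            doc[field] += 1


def router_multi_inc(shards, cached_versions, field, max_attempts=2):
    """Send to all shards; on any stale error, refresh and re-send to all."""
    for _ in range(max_attempts):
        saw_stale = False
        for shard in shards:
            try:
                shard.multi_inc(cached_versions[shard.name], field)
            except StaleShardVersion:
                saw_stale = True
        if not saw_stale:
            return
        # Refresh the cache, then retry the whole batch, including shards
        # that already applied the increment on the previous attempt.
        for shard in shards:
            cached_versions[shard.name] = shard.version


def demo_double_apply():
    shard_a = Shard("A", 1, [{"_id": 1, "n": 0}])
    shard_b = Shard("B", 2, [{"_id": 2, "n": 0}])  # chunks moved, version bumped
    cache = {"A": 1, "B": 1}  # stale for shard B only
    router_multi_inc([shard_a, shard_b], cache, "n")
    return shard_a.docs[0]["n"], shard_b.docs[0]["n"]
```

In this model shard A accepts the first attempt and then receives the retry as well, so its document ends with `n` equal to 2, while the document on shard B ends with `n` equal to 1.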

As an aside, any time mongos targets multiple shards, it actually targets all shards. However, only shards containing the relevant data are affected by the re-applied write.

Comment by Spencer Brody (Inactive) [ 10/Sep/15 ]

I believe the fix would be to simply get the version for checkShardVersion from the OperationShardVersion/ShardedConnectionInfo rather than from the parsed BatchedUpdateRequest.

Generated at Thu Feb 08 03:53:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.