[SERVER-34971] Improve mongoS targeting of replacement-style updates for collections whose shard key includes _id Created: 13/May/18  Updated: 29/Oct/23  Resolved: 11/Jun/18

Status: Closed
Project: Core Server
Component/s: Querying, Sharding
Affects Version/s: None
Fix Version/s: 4.1.1

Type: Improvement Priority: Major - P3
Reporter: Bernard Gorman Assignee: Bernard Gorman
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-13010 Sharded upsert incorrectly errors if ... Closed
Related
is related to SERVER-30970 Don't allow single-updates that aren'... Backlog
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.6
Sprint: Query 2018-05-21, Query 2018-06-04, Query 2018-06-18
Participants:
Case:
Linked BF Score: 3

 Description   

Replacement-style updates on a sharded cluster are targeted by mongoS on the basis of the replacement document's shard key values rather than the query component, with some relaxed constraints for updates whose queries contain an exact match on _id. Consider a collection sharded on _id, or _id plus any number of additional fields. Then, replacement-style updates of the form shown below are legal, despite the fact that the replacement document does not contain the entire shard key:

db.collection.update({_id: x}, {a: y, b: z}) 

This is because mongoD always automatically propagates the _id field of the existing document into the replacement document. The operation above will therefore succeed, assuming that the replacement document contains all other fields in the shard key and their values match those in the existing document.

Similarly, it should also be legal to perform the above operation with upsert:true, since mongoD will extract the _id from the query component when generating the new document to upsert.
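
The _id handling described above can be illustrated with a small toy model. This is a sketch only, not MongoDB source code: applyReplacement is a hypothetical helper, and it assumes that the replacement document omits _id and that the query's _id predicate is a plain value rather than an operator expression.

```javascript
// Toy model (hypothetical, not MongoDB source) of where the stored document's
// _id comes from during a replacement-style update.
function applyReplacement(existingDoc, query, replacement, upsert) {
  if (existingDoc !== null) {
    // Replacing an existing document: _id is carried over from that document.
    return { _id: existingDoc._id, ...replacement };
  }
  if (upsert && Object.prototype.hasOwnProperty.call(query, "_id")) {
    // Upsert with no matching document: _id is taken from the query's exact match.
    return { _id: query._id, ...replacement };
  }
  // Nothing matched and no upsert was requested, so nothing is written.
  return null;
}

// Plain replacement: _id comes from the existing document.
console.log(applyReplacement({ _id: 1, a: 0, b: 0 }, { _id: 1 }, { a: 5, b: 6 }, false));
// Upsert with no existing document: _id comes from the query.
console.log(applyReplacement(null, { _id: 7 }, { a: 1, b: 2 }, true));
```

In both cases the resulting document carries an _id even though the replacement document did not specify one, which is why the shard key can in principle be completed without it.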

However, there are a few shortcomings with the current targeting logic:

  • The non-upsert replacement will succeed, but because the replacement document does not contain the entire shard key, the operation will be scattered to all shards that own chunks for the collection. This is unnecessary; the update should target a single shard.
  • Because updates which target multiple endpoints are dispatched with ChunkVersion::IGNORED(), scattering will update any orphaned documents present in the cluster, as well as the cloned documents that are temporarily present on both the source and destination shards while a chunk migration is in flight. This leads to the unintuitive situation where a multi:false update with an exact match on _id returns nMatched and nModified greater than 1.
  • Attempting this operation with upsert:true will fail, because the targeting logic requires an exact shard key match but only considers the replacement document. Since the _id is available in the request, this upsert should be permitted.
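
The second shortcoming can be reproduced in a toy simulation. The sketch below is illustrative only: the cluster layout, makeCluster, replaceOnShard, scatterUpdate and targetedUpdate are all hypothetical names, and each shard is modelled as a plain array with no version checks of any kind.

```javascript
// Toy cluster (hypothetical): shardA owns the chunk containing _id: 1, while
// shardB still holds an orphaned copy of that document.
function makeCluster() {
  return {
    shardA: [{ _id: 1, a: 0 }],
    shardB: [{ _id: 1, a: 0 }, { _id: 2, a: 0 }],
  };
}

// multi:false semantics on one shard: replace at most one matching document.
function replaceOnShard(docs, id, replacement) {
  const i = docs.findIndex((d) => d._id === id);
  if (i === -1) return 0;
  docs[i] = { _id: id, ...replacement };
  return 1;
}

// Scatter: apply the update on every shard and sum the per-shard match counts.
function scatterUpdate(cluster, id, replacement) {
  let nMatched = 0;
  for (const docs of Object.values(cluster)) {
    nMatched += replaceOnShard(docs, id, replacement);
  }
  return nMatched;
}

// Targeted: apply the update only on the shard that owns the document.
function targetedUpdate(cluster, ownerShard, id, replacement) {
  return replaceOnShard(cluster[ownerShard], id, replacement);
}

console.log(scatterUpdate(makeCluster(), 1, { a: 9 }));            // 2
console.log(targetedUpdate(makeCluster(), "shardA", 1, { a: 9 })); // 1
```

The scattered update reports two matches because the orphan on shardB is updated as well, whereas the targeted update touches only the owning shard.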

Finally, multi:true operations of the form shown above, or those which target a range of _id values, will also succeed but must again scatter to all shards. These should target only the relevant subset of shards.
Note: after further discussion, it was determined that targeting only a subset of shards for these multi:true operations is not feasible at present, because an update that targets more than one shard must be sent unversioned to all shards.
 
To address these shortcomings, we should merge the replacement document into the query component as a set of additional constraints, and target on the basis of the resulting composite query.
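
One way to picture the proposed composite query is sketched below. This is an assumption-laden illustration rather than the mongoS implementation: buildTargetingQuery is a hypothetical helper, it only handles top-level shard key fields, and it restricts the added constraints to shard key fields on the assumption that no other fields matter for targeting.

```javascript
// Hypothetical sketch: combine the original query with equality constraints
// taken from the replacement document's shard key fields.
function buildTargetingQuery(shardKeyFields, query, replacement) {
  const constraints = {};
  for (const field of shardKeyFields) {
    if (Object.prototype.hasOwnProperty.call(replacement, field)) {
      constraints[field] = replacement[field];
    }
  }
  // $and keeps the original predicates intact instead of overwriting them.
  return { $and: [query, constraints] };
}

// Collection sharded on {_id: 1, a: 1}; the replacement omits _id.
console.log(JSON.stringify(buildTargetingQuery(["_id", "a"], { _id: 5 }, { a: 1, b: 2 })));
// {"$and":[{"_id":5},{"a":1}]}
```

The composite query contains an exact match on every shard key field, so it could be routed to a single shard even though neither the query nor the replacement document holds the full shard key on its own.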



 Comments   
Comment by Bernard Gorman [ 11/Jun/18 ]

To summarise the changes introduced by this patch:

  • Single-doc updates with an inexact match on the shard key are now permitted to run iff all chunks referenced by the update reside on a single shard.
  • For replacement-style updates, mongoS will validate that the replacement doc contains all shard key fields before dispatching the operation.
    • The replacement document need not contain the _id field, even when _id is part of the shard key, since mongoD automatically propagates _id from the document being replaced.
    • If the shard key contains _id and the only missing shard key field in the replacement doc is _id, then mongoS will attempt to extract its value from an exact match in the query component of the update, if such a predicate exists. This will allow the operation to be targeted directly by shard key.
  • All other behaviours, including the ability to perform single updates by exact _id match with the collection default collation, remain unchanged.
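
The validation and _id extraction rules summarised above might look roughly like the following. This is a simplified sketch and not the code from the patch: isExactMatch and shardKeyForReplacement are hypothetical names, only top-level shard key fields are handled, and a null result simply stands for "no exact shard key could be built from the request".

```javascript
// Hypothetical check: a predicate is an exact match unless it is an operator
// expression such as {$gt: 5}.
function isExactMatch(value) {
  if (value === null || typeof value !== "object") return true;
  return !Object.keys(value).some((k) => k.startsWith("$"));
}

// Hypothetical sketch of the shard key validation and extraction rules.
function shardKeyForReplacement(shardKeyFields, query, replacement) {
  const has = (obj, f) => Object.prototype.hasOwnProperty.call(obj, f);
  // Every shard key field except _id must be present in the replacement doc.
  const missing = shardKeyFields.filter((f) => f !== "_id" && !has(replacement, f));
  if (missing.length > 0) {
    throw new Error("replacement document is missing shard key field(s): " + missing.join(", "));
  }
  const key = {};
  for (const f of shardKeyFields) {
    if (has(replacement, f)) {
      key[f] = replacement[f];
    } else if (has(query, "_id") && isExactMatch(query._id)) {
      // Only _id can reach this branch: take it from the query's exact match.
      key._id = query._id;
    } else {
      return null; // no exact _id in the query, so no full shard key
    }
  }
  return key;
}

// Collection sharded on {_id: 1, a: 1}.
console.log(JSON.stringify(shardKeyForReplacement(["_id", "a"], { _id: 5 }, { a: 1, b: 2 })));
// {"_id":5,"a":1}
console.log(shardKeyForReplacement(["_id", "a"], { _id: { $gt: 5 } }, { a: 1, b: 2 }));
// null
```

In the first call the full shard key is recovered and the update can be targeted directly; in the second, the inexact _id predicate means the operation would have to rely on the single-shard rule from the first bullet instead.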
Comment by Githook User [ 11/Jun/18 ]

Author: Bernard Gorman (gormanb) <bernard.gorman@gmail.com>

Message: SERVER-34971 Improve mongoS targeting for single-shard updates, and for replacement-style updates when the shard key includes _id
Branch: master
https://github.com/mongodb/mongo/commit/0dd1fc7ddde2a489558f5328dce5125bddfb9e4d
