- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Sharding NYC
- Fully Compatible
- Sharding NYC 2023-02-06, Sharding NYC 2023-02-20
- 3
Currently, in the write phase of an updateOne/deleteOne/findAndModify without a shard key, mongos replaces the client's original filter with the _id of the document that it found in the find phase. Because the target shard never receives the original filter, query sampling and shard key analysis operate on the wrong filter.
Consider a collection whose documents have two fields, "x" and "y". The user's workload consists only of updateOne operations that filter by "y". The user was not aware of this and sharded the collection on {x: 1}, so every write is now an updateOne without a shard key. Because the filter is overwritten, every write that a shard receives has the filter {_id: ...}, and that is what gets written to the config.sampledQueries collection. When the user then runs analyzeShardKey on the candidate shard key {y: 1}, they would see every write reported as both a scatter-gather and a single write without a shard key, and would conclude that {y: 1} is a bad shard key. More generally, in this scenario every shard key except {_id: 1} would have equally bad metrics.
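To make the scenario concrete, here is a simplified Python sketch of why the overwritten filter skews the analysis. The `is_targeted` helper is hypothetical and is not the real analyzeShardKey logic: it assumes a write counts as targeted only when the filter has a top-level equality on every field of the candidate shard key. The values 5 and 42 are made up for illustration.

```python
def is_targeted(filter_doc, shard_key_fields):
    """Hypothetical, simplified check: treat a write as single-shard targeted
    only if the filter has a top-level equality on every shard key field."""
    return all(
        field in filter_doc and not isinstance(filter_doc[field], dict)
        for field in shard_key_fields
    )

# Filter the client actually sent (the workload filters by "y" only).
client_filter = {"y": 5}
# Filter the shard receives today, after mongos swaps in the _id.
sampled_filter = {"_id": 42}

candidate_key = ["y"]
print(is_targeted(client_filter, candidate_key))   # True
print(is_targeted(sampled_filter, candidate_key))  # False
```

Under this assumption, the sampled filter makes {y: 1} look untargeted even though the client's real filter would have targeted a single shard.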
Given this, the write phase should set the filter to the _id plus the original filter.
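A minimal sketch of the proposed write-phase filter, written in Python for illustration. The ticket only says "the _id plus the original filter"; combining the two with `$and` is an assumption here, and `build_write_phase_filter` is a hypothetical name, not a function in the server code.

```python
def build_write_phase_filter(doc_id, original_filter):
    """Combine the _id found in the find phase with the client's original filter.

    Assumption: the two parts are joined with $and. The ticket does not
    specify the exact shape of the combined filter.
    """
    return {"$and": [{"_id": doc_id}, original_filter]}

# Example values are made up: _id 42 from the find phase, client filter on "y".
write_filter = build_write_phase_filter(42, {"y": 5})
print(write_filter)  # {'$and': [{'_id': 42}, {'y': 5}]}
```

With a filter of this shape, the write still matches exactly the document chosen in the find phase, while the shard also sees the predicate on "y" that the client sent.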
- is duplicated by: SERVER-73044 Preserve the original filter in the write phase for updateOne/deleteOne/findAndModify without shard key (Closed)
- is related to: SERVER-72084 Handle properly assigning sample ids (Closed)