[SERVER-69219] Always broadcast queries using read concern level "available" to avoid missing unowned documents when filter includes equality on shard key fields Created: 28/Aug/22 Updated: 20/Apr/23 Resolved: 20/Apr/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Distributed Query Planning |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Nicholas Zolnierz |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Query Optimization
|
||||||||
| Sprint: | QE 2023-02-06, QE 2023-02-20, QE 2023-03-06, QE 2023-03-20, QE 2023-04-03 | ||||||||
| Participants: | |||||||||
| Description |
|
Read concern level "available" is used to find unowned documents in a sharded cluster. Allowing these queries to be targeted to a subset of the shards is counterproductive because it means unowned documents within a particular chunk range cannot easily be searched for. One workaround is to run an aggregation pipeline with {$replaceWith: "$$ROOT"}, as it prevents the shard targeting optimization. |
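A minimal sketch of the workaround described above, written as a plain Python helper that only builds the `aggregate` command document and does not contact a cluster. The helper name `build_unowned_docs_aggregate`, the collection name `orders`, and the shard key field `customerId` are hypothetical. Placing `$replaceWith` ahead of `$match` is an assumption about how the workaround is meant to be applied; the ticket only says the stage prevents the shard targeting optimization.

```python
from typing import Any, Dict


def build_unowned_docs_aggregate(
    collection: str, shard_key_filter: Dict[str, Any]
) -> Dict[str, Any]:
    """Build an `aggregate` command that reads at read concern "available".

    Assumption: the no-op {$replaceWith: "$$ROOT"} stage is placed before
    the $match so that the equality on the shard key is not the leading
    stage of the pipeline.
    """
    return {
        "aggregate": collection,
        "pipeline": [
            # No-op stage: replaces each document with itself.
            {"$replaceWith": "$$ROOT"},
            # Filter with equality on the (hypothetical) shard key field.
            {"$match": shard_key_filter},
        ],
        "cursor": {},
        "readConcern": {"level": "available"},
    }


if __name__ == "__main__":
    import json

    # Hypothetical collection and shard key, for illustration only.
    cmd = build_unowned_docs_aggregate("orders", {"customerId": 42})
    print(json.dumps(cmd, indent=2))
```

With a driver such as PyMongo, a document of this shape could be passed to `Database.command` against a mongos; whether the query is actually broadcast can be checked with `explain` on the target version.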
| Comments |
| Comment by Nicholas Zolnierz [ 06/Sep/22 ] |
|
Thanks Max and Andy, I've pinged christopher.harris@mongodb.com to see if this change would be useful for TS debugging as it seems like there's no internal reason for us to do this. Will throw it back into the Needs Triage and discuss with the team. |
| Comment by Max Hirschhorn [ 06/Sep/22 ] |
|
That's a good point: the documentation only says "may", so I was overzealous in calling this a bug. I'd be happy to change the title and issue type so it reads as an improvement request. Looking through Jira for tickets related to counting unowned documents, I found this comment mentioning a pattern to "Count orphans in a given chunk or shard key range". I wonder how common it is for the Support team and/or Cluster Operator to run that. I don't believe the $expr equivalent of the min/max fields in the find command has the same targeting behavior of broadcasting to all shards. As I had brought up in |
| Comment by Andy Schwerin [ 05/Sep/22 ] |
|
I don't think this is particularly a priority, but there is also no real use case for today's "available" read concern. Because the targeting ignores shard versioning errors, you don't even know that the query will return all non-orphan documents. For example, a chunk of a collection that migrates to a shard that did not previously own documents for that collection might not be seen in a query at read concern "available". We used to have "available" read concern by default on secondaries, but now that we do not, I do not expect any users are utilizing it. |
| Comment by Nicholas Zolnierz [ 02/Sep/22 ] |
|
max.hirschhorn@mongodb.com, can you clarify the priority for this behavior? Based on the docs, we "may" return orphan docs, but it doesn't appear to be a strict requirement for RC available. We could certainly avoid any shard targeting from mongos with readConcern "available", but wouldn't that hurt performance for users who don't really care about getting all orphans? |
| Comment by Max Hirschhorn [ 30/Aug/22 ] |
|
Andy noted that we would only want to consider addressing this in MongoDB 5.0+, where the default read concern level is now "local", so that secondary reads outside of causally consistent sessions on older versions do not change to broadcasting. |