[SERVER-56763] Validate collection epoch when not holding a DB lock for $merge Created: 07/May/21 Updated: 29/Oct/23 Resolved: 15/Jul/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.3, 5.1.0-rc0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Eric Cox (Inactive) | Assignee: | Nicholas Zolnierz |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v5.0 |
| Sprint: | Query Optimization 2021-06-14, Query Optimization 2021-06-28, Query Optimization 2021-07-12, Query Optimization 2021-07-26 |
| Participants: |
| Linked BF Score: | 43 |
| Description |
|
When fixing
not under a DB lock right before execution of the query on the leaf nodes of the merge topology. Why would this be better? The short answer:
Which avoids the case where the leaf nodes that do the data reads know about a dropped collection and the mongos doesn't at the time it sends the targetCollectionVersion to the mongod acting as the router. Note that there could still be a pathological case where the merge topology has two leaf nodes, one is reached much earlier than the other, and the first one processes petabytes of data when the collection is dropped on the second leaf. In theory, the only way around this is to open cursors on all shards that will participate in the merge plan, but that may be infeasible. |
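As a rough illustration of the idea described above (validating the target collection's epoch against a refreshed cache, without holding a DB lock, right before the query is dispatched), here is a minimal Python sketch. It is not the server's implementation: `ToyCatalog`, `ToyCatalogCache`, `validate_target_epoch`, and `StaleEpochError` are hypothetical names invented for this example, and a random hex string stands in for the collection epoch.

```python
import uuid
from dataclasses import dataclass, field


class StaleEpochError(Exception):
    """Raised when the expected epoch does not match the refreshed epoch."""


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for the authoritative collection metadata."""
    epochs: dict = field(default_factory=dict)

    def create(self, ns):
        # Every (re)creation of a namespace gets a fresh epoch.
        self.epochs[ns] = uuid.uuid4().hex
        return self.epochs[ns]

    def drop_and_recreate(self, ns):
        # A drop followed by a recreate yields a new epoch for the same name.
        return self.create(ns)


@dataclass
class ToyCatalogCache:
    """Hypothetical router-side cache; consulting it takes no DB lock."""
    catalog: ToyCatalog
    cached: dict = field(default_factory=dict)

    def refresh(self, ns):
        self.cached[ns] = self.catalog.epochs.get(ns)
        return self.cached[ns]


def validate_target_epoch(cache, ns, target_epoch):
    """Refresh the cache and fail if the target collection's epoch changed."""
    current = cache.refresh(ns)
    if current != target_epoch:
        raise StaleEpochError(f"{ns}: expected {target_epoch}, found {current}")


catalog = ToyCatalog()
cache = ToyCatalogCache(catalog)
planned_epoch = catalog.create("db.out")  # epoch captured at planning time

validate_target_epoch(cache, "db.out", planned_epoch)  # passes: nothing changed

catalog.drop_and_recreate("db.out")  # concurrent drop before execution
try:
    validate_target_epoch(cache, "db.out", planned_epoch)
    detected = False
except StaleEpochError:
    detected = True  # the stale target is caught before any leaf is contacted
```

In this toy model the check only narrows the window: a drop that lands after the validation but before (or during) execution on a leaf is still possible, which is the pathological case the description mentions.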
| Comments |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] |
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! |
| Comment by Githook User [ 04/Aug/21 ] |
|
Author: {'name': 'Nick Zolnierz', 'email': 'nicholas.zolnierz@mongodb.com', 'username': 'nzolnierzmdb'} Message: This allows us to consult and refresh the catalog cache without holding locks (cherry picked from commit 23ecc48f89f4ec03d7b42e637c5969802efdb261) |
| Comment by Kaloian Manassiev [ 17/Jul/21 ] |
|
I am not familiar with the purpose of the collection epoch change, but I wanted to clarify that any version/epoch changes without a lock performed by a node that's acting as a shard (i.e., returning some data) are incorrect, because nothing guarantees that this epoch will not change immediately afterwards. Furthermore, the check in this change is done from the cache, which assumes it is done while acting as a router. What is the role of the code that executes this check (i.e., will it return any data locally, or is it just passing it on to another node and prepared to receive a StaleShardVersion)? |
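The concern raised in this comment is a check-then-act race. The following single-threaded Python simulation is only a sketch of that concern, under stated assumptions: `ToyShardCollection` is a hypothetical class, an integer stands in for the epoch, and a plain `threading.Lock` stands in for the DB/collection lock. In a real system a concurrent drop would block on the lock rather than fail immediately; the non-blocking acquire is used here only to keep the example deterministic.

```python
import threading


class ToyShardCollection:
    """Hypothetical collection state on a node acting as a shard."""

    def __init__(self):
        self.lock = threading.Lock()  # stands in for the DB/collection lock
        self.epoch = 1

    def try_drop_and_recreate(self):
        # A drop needs the lock; it is refused while a reader holds it.
        if not self.lock.acquire(blocking=False):
            return False
        try:
            self.epoch += 1
            return True
        finally:
            self.lock.release()


def unlocked_check_then_read(coll, expected_epoch, interleave):
    ok = coll.epoch == expected_epoch  # check performed without the lock
    interleave()                       # a concurrent drop can land here
    return ok, coll.epoch              # epoch actually seen by the read


def locked_check_and_read(coll, expected_epoch, interleave):
    with coll.lock:
        ok = coll.epoch == expected_epoch
        interleave()                   # the drop is refused while we hold the lock
        return ok, coll.epoch


racy = ToyShardCollection()
racy_result = unlocked_check_then_read(racy, 1, racy.try_drop_and_recreate)

safe = ToyShardCollection()
safe_result = locked_check_and_read(safe, 1, safe.try_drop_and_recreate)
```

Here `racy_result` reports a passing check even though the read then observes a different epoch, while `safe_result` shows the check and the read agreeing. A router that only forwards the request does not need the lock in this sense, because the shard that eventually serves the data can still reject it as stale.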
| Comment by Githook User [ 15/Jul/21 ] |
|
Author: {'name': 'Nick Zolnierz', 'email': 'nicholas.zolnierz@mongodb.com', 'username': 'nzolnierzmdb'} Message: This allows us to consult and refresh the catalog cache without holding locks |