[SERVER-62457] Lock-free reads causes query subsystem to treat unsharded collection as sharded when collection is dropped and re-created (ABA problem) Created: 08/Jan/22 Updated: 29/Oct/23 Resolved: 10/Mar/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Catalog |
| Affects Version/s: | 5.1.1, 5.2.0-rc4 |
| Fix Version/s: | 6.0.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LFR-BUG |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Execution Team 2022-01-24, Execution Team 2022-02-07, Execution Team 2022-02-21, Execution Team 2022-03-07, Execution Team 2022-03-21 |
| Participants: |
| Linked BF Score: | 19 |
| Description |
|
This can at least lead to a server crash with slot-based execution. The query subsystem uses CollectionPtr::isSharded() to decide whether to add a plan stage that does ownership filtering. As part of constructing the sbe::FilterStage that does the ownership filtering, the ShardFiltererImpl invariants that the CollectionShardingState actually has a shard key pattern associated with it. Constructing the special-purpose ShardFilterStage used by the classic executor doesn't have this invariant and simply forwards all documents through via kUnshardedCollection. This is why I believe only slot-based execution is affected. The problematic sequence involves an unsharded collection being sharded, then dropped, and then re-created as an unsharded collection:
|
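A minimal sketch of the hazard described above, not the server's code. `Incarnation`, `NamespaceModel`, `looksUnchanged`, `sameIncarnation`, and `requireShardKey` are hypothetical names invented for illustration, and the integer `id` stands in for whatever uniquely identifies one incarnation of a collection. The sketch assumes a reader that compares only the sharded/unsharded state before and after taking a snapshot, which is one way an ABA problem of this kind can arise.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical model of one incarnation of a namespace. Not a server type.
struct Incarnation {
    std::uint64_t id;                     // unique per create(), stand-in for a UUID
    std::optional<std::string> shardKey;  // empty while the collection is unsharded
};

// Hypothetical model of a namespace that can be created, sharded, and dropped.
class NamespaceModel {
public:
    void create() { current_ = Incarnation{++nextId_, std::nullopt}; }
    void shard(std::string key) { current_.value().shardKey = std::move(key); }
    void drop() { current_.reset(); }
    const Incarnation& current() const { return current_.value(); }

private:
    std::uint64_t nextId_ = 0;
    std::optional<Incarnation> current_;
};

// ABA-prone check: compares only the observable state (sharded or not).
bool looksUnchanged(const Incarnation& before, const Incarnation& after) {
    return before.shardKey.has_value() == after.shardKey.has_value();
}

// Identity check: a drop and re-create always produces a new id.
bool sameIncarnation(const Incarnation& before, const Incarnation& after) {
    return before.id == after.id;
}

// Models the invariant from the description: a filter needs a shard key.
const std::string& requireShardKey(const Incarnation& live) {
    if (!live.shardKey) {
        throw std::logic_error("invariant failed: no shard key pattern");
    }
    return *live.shardKey;
}
```

In this model, creating the namespace, sharding it, dropping it, and creating it again leaves the first and last observations both unsharded, so `looksUnchanged` returns true even though `sameIncarnation` returns false. A snapshot copied while the namespace was sharded still reports a shard key, while `requireShardKey` on the live, re-created incarnation throws.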
| Comments |
| Comment by Githook User [ 09/Mar/22 ] |
|
Author: {'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': 'DiannaHohensee'} Message: |
| Comment by Dianna Hohensee (Inactive) [ 13/Jan/22 ] |
|
On the other hand, I still think this is a bug, where we could have the ABA problem that Max describes. AutoGet* helpers do the following:
0) request comes in for an unsharded collection
Musing: I suppose if an unsharded collection is queried, and it happens to be sharded and unsharded (dropped & recreated) before the mongod is locked, that can also run fine. Bit of a mystery read, but in a different way. Doesn't break the sharding protocol like lock-free does, though. |
| Comment by Dianna Hohensee (Inactive) [ 12/Jan/22 ] |
|
|
| Comment by Dianna Hohensee (Inactive) [ 12/Jan/22 ] |
|
Max pointed out that there’s an implicit shardVersion check in the access that the query makes to the CollectionShardingState. So if the first shardVersion check is correct, then subsequent accesses with a version check should also be safe. |
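The idea in this comment can be sketched as an accessor that re-validates the version on every read. This is an illustration only: `VersionedShardingState` and `StaleVersionError` are hypothetical names, and a plain integer stands in for a real shardVersion, which carries more information than this.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical error type raised when the caller's version is out of date.
struct StaleVersionError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for per-collection sharding state. Not a server type.
class VersionedShardingState {
public:
    // Replace the state, for example after the collection is sharded or re-created.
    void install(std::uint64_t version, std::optional<std::string> shardKey) {
        version_ = version;
        shardKey_ = std::move(shardKey);
    }

    // Every read presents the version the request carried and is rejected if stale.
    const std::optional<std::string>& shardKey(std::uint64_t expectedVersion) const {
        if (expectedVersion != version_) {
            throw StaleVersionError("request version does not match current version");
        }
        return shardKey_;
    }

private:
    std::uint64_t version_ = 0;
    std::optional<std::string> shardKey_;
};
```

A check of this shape only distinguishes two states if their versions differ. If the state before and after a drop and re-create is represented by the same version value, the comparison passes for both.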
| Comment by Eric Cox (Inactive) [ 11/Jan/22 ] |
|
FYI geert.bosch max.hirschhorn I'm still looking into our usage of the shard key pattern and the interaction of SBE and forwarding all documents through via kUnshardedCollection. |