[SERVER-73032] SBE plan cache prevents range deletion from running for rooted $or queries Created: 19/Jan/23  Updated: 29/Oct/23  Resolved: 26/Jan/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Jordi Serra Torrens
Assignee: David Storch
Resolution: Fixed
Votes: 0
Labels: pm2697-63, pm2697-m3
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File repro-server-73032.patch    
Issue Links:
Depends
is depended on by SERVER-43099 Reenable random chunk migration failp... Closed
Related
related to SERVER-61835 Fix how SBE plan cache deals with Sha... Closed
Assigned Teams:
Query Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

repro-server-73032.patch

Run jstests/sharding/sbe_plan_cache_does_not_block_range_deletion.js on the sharding suite with 'featureFlagSbeFull' enabled.

Sprint: QE 2023-02-06
Participants:

 Description   

SERVER-61835 fixed a situation where the SBE plan cache improperly cached a ShardFilter, which prevented range deletions from running. This still seems to happen for some query shapes. The particular query shape I found is:

{$or: [{a: {$lte: 10}}, {b: {$lte: 10}}]}
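
For readers unfamiliar with the term, the shape above is a "rooted $or": the top-level predicate is a single $or, and each branch can be planned on its own. The Python sketch below only illustrates that shape. The helper names is_rooted_or and or_branches are hypothetical and are not part of the server, whose real check lives in the C++ query planner and may differ in detail.

```python
def is_rooted_or(query: dict) -> bool:
    """Return True when the top-level predicate is a single, non-empty $or.

    Illustrative simplification, not the server's actual check.
    """
    if list(query.keys()) != ["$or"]:
        return False
    branches = query["$or"]
    return isinstance(branches, list) and len(branches) > 0


def or_branches(query: dict) -> list:
    """Split a rooted $or into the branches that would be planned separately."""
    if not is_rooted_or(query):
        raise ValueError("not a rooted $or query")
    return list(query["$or"])


# The query shape from this ticket, written as a Python dict.
query = {"$or": [{"a": {"$lte": 10}}, {"b": {"$lte": 10}}]}
```

Under this simplified definition, a query such as {a: 1, $or: [...]} is not rooted, because the $or is only one of several top-level predicates.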

 



 Comments   
Comment by Githook User [ 26/Jan/23 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-73032 Fix SBE subplanner to avoid caching ShardFilterer
Branch: master
https://github.com/mongodb/mongo/commit/3c1f515464cc63ceeb73946cfd61997676697925

Comment by David Storch [ 24/Jan/23 ]

This WIP branch has a working fix: https://github.com/dstorch/mongo/tree/SERVER-73032. I hope to add a test case and put it up for code review tomorrow.

Comment by David Storch [ 24/Jan/23 ]

Thanks for reporting this issue jordi.serra-torrens@mongodb.com! I can reproduce the bug as described. The intended design is that any plan stored in the cache consists of clones of the sbe::PlanStage tree and the PlanStageData, made before the query is prepared or executed. This ensures that the SBE plan and its associated RuntimeEnvironment are in a clean state. In this case in particular, it means that the clone of the RuntimeEnvironment is made before it is populated with the shard filterer, which avoids accidentally caching the shard filter and preventing range deletion.
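
The clone-before-prepare design described above can be sketched roughly as follows. This is a minimal Python stand-in, not the server's C++ code: Plan, RuntimeEnv, PlanCache, prepare and plan_and_cache are hypothetical names chosen for illustration, and the "shardFilterer" slot name is an assumption. The point it shows is only the ordering: the cache takes its clone before any per-execution state is written into the runtime environment.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class RuntimeEnv:
    # Hypothetical stand-in for the SBE RuntimeEnvironment: named slots.
    slots: dict = field(default_factory=dict)


@dataclass
class Plan:
    # Hypothetical stand-in for the (PlanStage tree, PlanStageData) pair.
    stages: tuple
    env: RuntimeEnv = field(default_factory=RuntimeEnv)

    def clone(self) -> "Plan":
        return copy.deepcopy(self)


def prepare(plan: Plan, shard_filterer: object) -> None:
    """Populate per-execution state; should only run on a non-cached copy."""
    plan.env.slots["shardFilterer"] = shard_filterer  # assumed slot name


class PlanCache:
    def __init__(self) -> None:
        self._entries = {}

    def put(self, key: str, plan: Plan) -> None:
        # Store a clone so later mutation of `plan` cannot leak into the cache.
        self._entries[key] = plan.clone()

    def get(self, key: str) -> Plan:
        # Hand out a fresh clone so the cached entry itself stays clean.
        return self._entries[key].clone()


def plan_and_cache(cache: PlanCache, key: str, stages: tuple,
                   shard_filterer: object) -> Plan:
    plan = Plan(stages=stages)
    cache.put(key, plan)           # clone is taken BEFORE prepare
    prepare(plan, shard_filterer)  # only the executing copy gets the filterer
    return plan
```

If put were called after prepare instead, the cached clone would carry the shard filterer with it, which is the kind of accidental caching this ticket describes.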

Since the problematic query is a rooted $or, it seems that the subplanning path does not conform to the design described above. As a next step, I'm going to re-read the code for SBE subplanning to see if I can spot the issue.
