[SERVER-57325] Fix bugs due to race conditions in SBE oplog plans Created: 01/Jun/21  Updated: 01/May/23  Resolved: 01/May/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Drew Paroski
Assignee: Kevin Cherkauer
Resolution: Won't Do
Votes: 0
Labels: sbe-post-v1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-57365 Disable oplog scans in SBE Closed
related to SERVER-54350 Investigate potential race conditions... Closed
Assigned Teams:
Query Execution
Operating System: ALL
Participants:

 Comments   
Comment by Kevin Cherkauer [ 01/May/23 ]

Support for oplog scans was determined to be out of scope for PM-3161. In consultation with andrew.paroski@mongodb.com, I deleted the existing buggy SBE oplog scan stage builder code from sbe_stage_builder_coll_scan.cpp as part of SERVER-74521. That code had already been disabled for a long time by the nss.isOplog() check in isQuerySbeCompatible() (query_utils.h). Oplog scans remain unsupported in SBE. If they are to be supported in the future, the implementation should be redone from scratch, as the original implementation had fundamental problems.

Comment by Kevin Cherkauer [ 14/Mar/23 ]

It looks like I will need to fix this as part of enabling clustered collection scans in SBE (PM-3161). Last week I independently discovered that the build-time seek embeds the minRecord and maxRecord bounds into the cached plan, which causes scans to return wrong results if the same query form is submitted again with different bounds. I also concluded that the correct fix is to pass minRecord and maxRecord into ScanStage and do the seek at run time instead of build time (as opposed to encoding minRecord and maxRecord into the plan cache key to force a mismatch).
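To make the failure mode concrete, here is a minimal, self-contained sketch of the two approaches. It is not the SBE code: RecordId as a plain integer, Bounds, BuildTimePlan, RunTimePlan, scanRange() and the string-keyed cache are all hypothetical stand-ins invented for illustration, and the real ScanStage interface may look quite different.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are real MongoDB types.
using RecordId = std::int64_t;

struct Bounds {
    std::optional<RecordId> minRecord;
    std::optional<RecordId> maxRecord;
};

// Returns the record ids within [minRecord, maxRecord] from a sorted collection.
std::vector<RecordId> scanRange(const std::vector<RecordId>& coll, const Bounds& b) {
    assert(std::is_sorted(coll.begin(), coll.end()));
    std::vector<RecordId> out;
    for (RecordId id : coll) {
        if (b.minRecord && id < *b.minRecord) continue;
        if (b.maxRecord && id > *b.maxRecord) break;
        out.push_back(id);
    }
    return out;
}

// Problematic shape: the bounds are captured when the plan is built,
// so they become part of whatever gets cached.
struct BuildTimePlan {
    Bounds baked;
    std::vector<RecordId> run(const std::vector<RecordId>& coll) const {
        return scanRange(coll, baked);
    }
};

// Suggested shape: the plan holds no bounds; they are supplied on each run.
struct RunTimePlan {
    std::vector<RecordId> run(const std::vector<RecordId>& coll, const Bounds& b) const {
        return scanRange(coll, b);
    }
};

// The cache key is the query form only (no bounds), as described above.
std::vector<RecordId> runWithBuildTimeSeek(std::map<std::string, BuildTimePlan>& cache,
                                           const std::string& queryForm,
                                           const Bounds& b,
                                           const std::vector<RecordId>& coll) {
    auto it = cache.find(queryForm);
    if (it == cache.end()) {
        it = cache.emplace(queryForm, BuildTimePlan{b}).first;
    }
    return it->second.run(coll);  // on a cache hit, 'b' is silently ignored
}

std::vector<RecordId> runWithRunTimeSeek(std::map<std::string, RunTimePlan>& cache,
                                         const std::string& queryForm,
                                         const Bounds& b,
                                         const std::vector<RecordId>& coll) {
    auto it = cache.find(queryForm);
    if (it == cache.end()) {
        it = cache.emplace(queryForm, RunTimePlan{}).first;
    }
    return it->second.run(coll, b);  // bounds always come from the current query
}
```

In this toy model, a second call to runWithBuildTimeSeek() with the same query form but different bounds returns the first call's range, while runWithRunTimeSeek() returns the range that was actually requested.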

I found this problem when I patterned my new code implementing regular clustered collection scans on the existing oplog clustered collection scan code, not knowing at the time that the oplog scan code was known to be buggy.

I also suspect that the condition in generateCollScan() (sbe_stage_builder_coll_scan.cpp) for taking the oplog scan path, generateOptimizedOplogScan(), is not correct: it omits || csn->resumeAfterRecordId, so this flag will not reach generateOptimizedOplogScan() in some (or maybe all) cases when it should, even though that method has code to handle it:

    if (csn->minRecord || csn->maxRecord || csn->stopApplyingFilterAfterFirstMatch) {
        return generateOptimizedOplogScan(

This problem would not have manifested so far either, as oplog scans have been disabled in SBE (SERVER-57365) and have not yet been re-enabled.
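As a rough illustration of the suspected gap, the sketch below compares the quoted condition with one that also checks resumeAfterRecordId. CollectionScanNodeSketch and both predicate names are hypothetical; the field types are simplified to std::optional<long long>, which is an assumption and not necessarily how the real node declares them.

```cpp
#include <optional>

// Hypothetical, simplified stand-in for the collection scan node.
struct CollectionScanNodeSketch {
    std::optional<long long> minRecord;
    std::optional<long long> maxRecord;
    std::optional<long long> resumeAfterRecordId;
    bool stopApplyingFilterAfterFirstMatch = false;
};

// Mirrors the condition quoted above.
bool takesOplogPathAsQuoted(const CollectionScanNodeSketch& csn) {
    return csn.minRecord || csn.maxRecord || csn.stopApplyingFilterAfterFirstMatch;
}

// Same condition with the suspected missing term added.
bool takesOplogPathWithResume(const CollectionScanNodeSketch& csn) {
    return csn.minRecord || csn.maxRecord || csn.stopApplyingFilterAfterFirstMatch ||
        csn.resumeAfterRecordId;
}
```

With only resumeAfterRecordId set, the first predicate is false and the second is true, which is the case where the flag would never reach the oplog scan builder.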

Generated at Thu Feb 08 05:41:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.