[SERVER-43412] Create a wrapper around getCursor calls for the oplog collection that ensures oplog visibility rules are enforced Created: 23/Sep/19  Updated: 06/Dec/22  Resolved: 02/Apr/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
is caused by SERVER-30771 investigate oplog stones initialization Closed
Assigned Teams:
Storage Execution
Participants:

 Description   

Motivation: enforce oplog visibility rules. An oplog collection read should in most cases be the first operation in a WT transaction, so that the snapshot is selected using _oplogVisibleTs and does not expose any oplog holes. Currently, getCursor() calls setIsOplogReader() for oplog reads, so that _oplogVisibleTs is used when opening a WT txn.
-----------------------------------

SERVER-30771 discovered two difficult issues with adding an invariant (!inActiveTxn() || _isOplogReader) to the setIsOplogReader() function. First, getCursor() has a MODE_X lock bypass in which the (!inActiveTxn() || _isOplogReader) rule is thrown out the window. Second, particularly around oplog truncation, there are multiple use cases where we should see the oplog in its entirety, without visibility rules. For example, _truncateOplogTo fetches a reverse oplog cursor without visibility restrictions, which starts a txn, and then calls cappedTruncateAfter, which opens a forward oplog cursor with visibility restrictions.

Therefore, we propose creating a wrapper function around getCursor() calls against the oplog collection, say getOplogCursor(). This new function could take a boolean for visibility rules: the default is to obey the visibility rules, but callers that should not follow the restrictions can explicitly override it.
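
A minimal sketch of the proposed wrapper, using toy types rather than the real storage classes. ToyRecoveryUnit, ToyCursor, beginTxn() and the exception thrown in place of a server invariant are all assumptions made for illustration; only the names getOplogCursor(), setIsOplogReader(), inActiveTxn() and the ignoreOplogVisibilityRule flag come from this ticket.

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-in for a recovery unit (hypothetical, not WiredTigerRecoveryUnit).
class ToyRecoveryUnit {
public:
    bool inActiveTxn() const { return _active; }
    bool isOplogReader() const { return _isOplogReader; }

    void setIsOplogReader() {
        // Models the invariant (!inActiveTxn() || _isOplogReader) from
        // SERVER-30771, with an exception standing in for the invariant.
        if (_active && !_isOplogReader) {
            throw std::logic_error("oplog read must be first in the txn");
        }
        _isOplogReader = true;
    }

    // Hypothetical helper: marks that a storage transaction has started.
    void beginTxn() { _active = true; }

private:
    bool _active = false;
    bool _isOplogReader = false;
};

// Toy cursor that only records which rule it was opened under.
struct ToyCursor {
    bool obeysVisibility;
};

// Sketch of the proposed wrapper: visibility rules are obeyed by default,
// and a caller has to opt out explicitly.
ToyCursor getOplogCursor(ToyRecoveryUnit& ru, bool ignoreOplogVisibilityRule = false) {
    if (!ignoreOplogVisibilityRule) {
        ru.setIsOplogReader();  // must happen before the txn is opened
    }
    ru.beginTxn();  // opening a cursor starts the transaction in this toy model
    assert(ru.inActiveTxn());
    return ToyCursor{!ignoreOplogVisibilityRule};
}
```

In this toy model, opening an unrestricted cursor first and a restricted one afterwards on the same recovery unit trips the check, which is the shape of the _truncateOplogTo case described above.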

Given how many callers there are of getCursor(), it may be simpler to do something like this:

getCursor() {
    if (_nss == oplog) {
        // Route oplog reads through the wrapper so the visibility rules apply by default.
        return getOplogCursor();
    }
    .....
}

Then swap the oplog-only code paths that use getCursor() over to getOplogCursor(/*ignoreOplogVisibilityRule*/ true) if they ignore the visibility rules. However, there are also callers, such as the createCollection code path, where we would need to add special handling:

{
    if (nss != oplog) {
        getCursor();
    } else {
        getOplogCursor(/*ignoreOplogVisibilityRule*/ true);
    }
}

So maybe getCursor() should have a visibility boolean, too.
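
As a rough sketch of that alternative, getCursor() could accept the visibility flag and forward it, so callers that may or may not be handed the oplog need no if/else of their own. The CursorKind enum and the free-function signatures are hypothetical and exist only to make the dispatch observable; "local.oplog.rs" is the replica set oplog namespace.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type so the dispatch can be observed.
enum class CursorKind { kPlain, kOplogVisible, kOplogUnrestricted };

const std::string kOplogNs = "local.oplog.rs";

// Sketch of the wrapper: obeys the visibility rules unless told otherwise.
CursorKind getOplogCursor(bool ignoreOplogVisibilityRule = false) {
    return ignoreOplogVisibilityRule ? CursorKind::kOplogUnrestricted
                                     : CursorKind::kOplogVisible;
}

// Sketch of getCursor() carrying the same flag. The flag only matters for
// the oplog; every other namespace gets a plain cursor.
CursorKind getCursor(const std::string& nss, bool ignoreOplogVisibilityRule = false) {
    assert(!nss.empty());
    if (nss == kOplogNs) {
        return getOplogCursor(ignoreOplogVisibilityRule);
    }
    return CursorKind::kPlain;
}
```

With this shape, a caller such as the createCollection path could make a single call like getCursor(nss, /*ignoreOplogVisibilityRule*/ true) regardless of the namespace.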

We may need to elevate the setIsOplogReader() and getIsOplogReader() functions from the WiredTigerRecoveryUnit to the RecoveryUnit interface, so that higher-level callers external to WT can call them. If this implementation is desired, a new unsetIsOplogReader() function would be needed, too. SERVER-30771 has a WIP that reflects this. SERVER-30771's solution, adding the invariant (!inActiveTxn() || _isOplogReader), was abandoned because there appear to be too many special cases where we do not obey the visibility rules, and we would have to litter special calls to setIsOplogReader() across many code paths prior to any WT txn starting on a different collection.
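
One way the elevated interface might look, sketched with toy classes. The three virtual functions follow the names in this ticket; ToyWiredTigerRecoveryUnit and the ScopedOplogReader guard are assumptions added for illustration and do not reflect the WIP attached to SERVER-30771.

```cpp
#include <cassert>

// Sketch of the storage-engine-agnostic interface with the oplog reader
// accessors elevated into it.
class RecoveryUnit {
public:
    virtual ~RecoveryUnit() = default;
    virtual void setIsOplogReader() = 0;
    virtual void unsetIsOplogReader() = 0;
    virtual bool getIsOplogReader() const = 0;
};

// Toy engine-specific implementation (hypothetical).
class ToyWiredTigerRecoveryUnit final : public RecoveryUnit {
public:
    void setIsOplogReader() override { _isOplogReader = true; }
    void unsetIsOplogReader() override { _isOplogReader = false; }
    bool getIsOplogReader() const override { return _isOplogReader; }

private:
    bool _isOplogReader = false;
};

// Hypothetical RAII guard: sets the flag for a scope and restores the
// previous state on exit, so callers cannot forget the unset call.
class ScopedOplogReader {
public:
    explicit ScopedOplogReader(RecoveryUnit& ru)
        : _ru(ru), _wasSet(ru.getIsOplogReader()) {
        _ru.setIsOplogReader();
    }
    ~ScopedOplogReader() {
        assert(_ru.getIsOplogReader());
        if (!_wasSet) {
            _ru.unsetIsOplogReader();
        }
    }
    ScopedOplogReader(const ScopedOplogReader&) = delete;
    ScopedOplogReader& operator=(const ScopedOplogReader&) = delete;

private:
    RecoveryUnit& _ru;
    bool _wasSet;
};
```

A guard like this would keep the set and unset calls paired, though it does not by itself reduce the number of code paths that would need one.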

