[SERVER-58965] Provide an interface to access replication information Created: 30/Jul/21  Updated: 27/Oct/23  Resolved: 14/Sep/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Mindaugas Malinauskas Assignee: Matthew Russotto
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Sprint: Repl 2021-09-06, Repl 2021-09-20
Participants:

 Description   

In the User-facing Point-In-Time pre- and post-images project (PM-1944), the OpObserverImpl class needs access to replication-related information in order to maintain pre-image documents correctly:

  1. whether the documents are in a consistent state.

Please provide an interface that gives access to the information listed above.



 Comments   
Comment by Matthew Russotto [ 14/Sep/21 ]

Spoke to mindaugas.malinauskas; we currently believe this ticket is not necessary.

Comment by Daniel Gottlieb (Inactive) [ 26/Aug/21 ]

(edit) While what I said was true, I was misinterpreting Matthew's comment. I'm also of the understanding that having change streams (with preImages or otherwise) and tenant migrations play nicely together isn't in scope.

Tenant migration sets "enforcing constraints" to false and also passes false for "isDataConsistent" when calling into oplog application. I don't think OpObserverImpl has any more context than oplog application does.
https://github.com/mongodb/mongo/blob/ea63938df2478ad111421b847818fc64b00dedca/src/mongo/db/repl/tenant_oplog_applier.cpp#L1013-L1027

Comment by Matthew Russotto [ 26/Aug/21 ]

Tenant migration works like initial sync, in that we do oplog application on an inconsistent view of the data to bring it to a consistent view. The difference is that it does so on the primary. The OpObserverImpl can know when it's doing oplog application in tenant migration (there's a decorator on the OpContext). However, on the secondary that decorator won't be available.

Comment by Daniel Gottlieb (Inactive) [ 26/Aug/21 ]

Quoting Eric Milkie: "I think checking isEnforcingConstraints() could work well for us here. Would that work on both primaries and secondaries? In which scenarios does it return false?"

isEnforcingConstraints returns false during oplog application. It doesn't distinguish between the states of interest:

  • Applying oplog entries on a consistent set of data (steady state oplog application) – record preimages
  • Applying oplog entries during initial sync (or the now officially obsolete rollback via refetch?) – do not record preimages
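
The point about isEnforcingConstraints can be sketched as a small model. This is a hedged illustration only: `ReplState`, `modelIsEnforcingConstraints`, and `shouldRecordPreImage` are hypothetical names, not MongoDB types or functions, and the model assumes constraints are enforced for ordinary primary writes and not during oplog application, as described above.

```cpp
#include <cassert>

// Hypothetical model of the replication states discussed in this comment.
// These names are illustrative and are not part of the MongoDB code base.
enum class ReplState { kPrimaryWrite, kSteadyStateApply, kInitialSyncApply };

// Models the described behavior: the flag is false during any kind of
// oplog application (assumption: it is true for ordinary primary writes).
bool modelIsEnforcingConstraints(ReplState s) {
    return s == ReplState::kPrimaryWrite;
}

// What the pre-image project wants to know: record pre-images except
// when applying oplog entries during initial sync.
bool shouldRecordPreImage(ReplState s) {
    return s != ReplState::kInitialSyncApply;
}
```

In this model the flag has the same value for steady-state application and for initial sync, while the desired pre-image behavior differs between them, so the flag alone cannot drive the decision.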

Quoting Mindaugas Malinauskas: "Eric Milkie suggested an alternative in which only OpObserverImpl is responsible for writing pre-images depending on the context it is run."

What Eric suggested is a perfectly valid choice and I can see the appeal. That said, with the retryable find and modify project to implicitly replicate/record preimages recently done, I think saving the preImages during oplog application explicitly inside oplog.cpp has the following advantages:

  • there's less typing to do overall
  • the code logic is better self-contained (based off of my guess of how things shake out, the thing in oplog.cpp that says "have this update save the preImage" is also the code that consumes the returned preImage and writes it to the change streams preImage collection).

But, if we stick with using OpObserverImpl, two ideas:

  • Thread the isDataConsistent from oplog.cpp through the UpdateRequest through OplogUpdateEntryArgs. This is maybe as simple as making sure UpdateRequest::setReturnDocs is left unset?
  • Write incorrect preimages to the collection anyway. When servicing change stream results, use something like the "initial data timestamp" that initial sync sets to determine whether a preImage should be returned (a preImage whose timestamp is <= the initial data timestamp is invalid). This won't be sufficient if features like tenant migrations need to be dealt with (I'm unsure how it clones data, I just know it works around some forms of inconsistency inside oplog application).
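
A minimal sketch of the second idea, filtering at read time. `StoredPreImage` and `preImageForChangeStream` are hypothetical names introduced for illustration, and timestamps are reduced to plain integers; the only rule taken from the comment is that a preImage with timestamp <= the initial data timestamp is treated as invalid.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for an entry in the change streams preImage collection.
struct StoredPreImage {
    std::uint64_t timestamp;  // simplified: a real timestamp has more structure
    std::string document;
};

// Returns the pre-image only if it was recorded after the point at which
// initial sync declared the data consistent; otherwise returns nothing.
std::optional<std::string> preImageForChangeStream(const StoredPreImage& preImage,
                                                   std::uint64_t initialDataTimestamp) {
    if (preImage.timestamp <= initialDataTimestamp) {
        return std::nullopt;  // possibly recorded on inconsistent data
    }
    return preImage.document;
}
```

As noted above, this check would not cover tenant migrations, which apply oplog entries on inconsistent data outside of initial sync.
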
Comment by Eric Milkie [ 25/Aug/21 ]

I think checking isEnforcingConstraints() could work well for us here. Would that work on both primaries and secondaries? In which scenarios does it return false?

Comment by Mindaugas Malinauskas [ 25/Aug/21 ]

Maybe my reference to "isDataConsistent" introduced some confusion. I'm not 100% sure only this variable encodes the state we are interested in.

In PM-1944 we would like to be able to determine whether a write operation (update or delete) operates on a document version that is the same as it was when the same operation was executed on the primary node. We need this to avoid preserving an incorrect pre-image of a document when we cannot ensure correctness. One scenario in which this can happen is the initial sync process.

I initially worked on a design that was similar to what has been done for PM-2213 and similar to your proposal in the comment above. milkie suggested an alternative in which only OpObserverImpl is responsible for writing pre-images depending on the context it is run. To determine the context OpObserverImpl would need some means to learn about "data consistency" status.

CC: milkie, matthew.russotto

Comment by Daniel Gottlieb (Inactive) [ 02/Aug/21 ]

"isDataConsistent" means a specific thing during oplog application that (if the same words were used) wouldn't be true at the OpObserverImpl layer. Specifically, oplog application starting with consistent data at T and applying changes up to T + 10 will apply them out of order. For example, T + 9 may be applied before any of T -> T + 8. If code in OpObserverImpl asked "is data consistent" in this replication state, there are two valid answers:

  • Yes, the existing data constrained exactly to the documents being touched are correct.
  • No, causal dependencies of this write may not currently be visible.

Inside oplog.cpp, the second interpretation isn't any more true, but it is assumed anyone touching that code knows about the relaxed definition of consistency. I imagine for what PM-1944 wants, only the first criterion matters (the document being updated has all previous updates applied), which would be true for this case.
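
The first interpretation can be illustrated with a toy model. This sketch assumes that operations touching the same document keep their relative order even when the batch as a whole is applied out of order; `Op` and `applyAndCollectPreImages` are hypothetical names, and documents are reduced to integers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified oplog entry: set document `docId` to `newValue`.
struct Op {
    int ts;
    std::string docId;
    int newValue;
};

// Applies `ops` in the given order to a copy of `data` and returns, for each
// operation timestamp, the document value observed just before the write.
std::map<int, int> applyAndCollectPreImages(std::map<std::string, int> data,
                                            const std::vector<Op>& ops) {
    std::map<int, int> preImages;
    for (const Op& op : ops) {
        preImages[op.ts] = data[op.docId];  // pre-image of the touched document
        data[op.docId] = op.newValue;
    }
    return preImages;
}
```

Applying `{1, "a", 10}, {2, "b", 20}, {3, "a", 11}` in timestamp order, or with the write to "b" moved first, yields the same pre-image for every operation, even though the second ordering passes through states that never existed on the primary.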

With PM-2213, we had to solve a similar problem (undoubtedly how you came across that variable inside oplog application). We instead leveraged the fact that a primary accepting writes via a findAndModify must be operating on "consistent" data and may persist a pre/post image.

Oplog application has its own codepath for writing these pre/post images.

All said, I'm a bit hesitant to expose this state. I'd prefer for PM-1944 to explore a solution that is similar to PM-2213. Specifically:

  • A primary maintaining pre-image documents would annotate incoming commands with some "save the preimage" state on CollectionUpdateArgs/OplogDeleteEntryArgs
    • If that's too much surface area – perhaps OpObserverImpl could check OperationContext::isEnforcingConstraints instead (as constraints can only be enforced if both definitions of consistency are true).
  • A secondary maintaining pre-image documents would explicitly make its writes as part of oplog application. That codepath would get into OpObserverImpl as well, but wouldn't see the state set on incoming commands (or would see isEnforcingConstraints() == false, if that was how things were done).
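
A rough sketch of the split proposed above, under stated assumptions: `UpdateArgs` is a hypothetical stand-in for CollectionUpdateArgs, `savePreImage` is an invented field name for the "save the preimage" state, and `PreImageStore`, `observerOnUpdate`, and `applyOplogUpdate` are illustrative only. It does not reflect the actual server code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for CollectionUpdateArgs with the proposed annotation.
struct UpdateArgs {
    bool savePreImage = false;  // set only by a primary handling a user write
};

// Hypothetical stand-in for the change streams preImage collection.
struct PreImageStore {
    std::vector<std::string> docs;
};

// Models the observer: it writes a pre-image only when the command was annotated.
void observerOnUpdate(const UpdateArgs& args, const std::string& preImage,
                      PreImageStore& store) {
    if (args.savePreImage) {
        store.docs.push_back(preImage);
    }
}

// Models oplog application on a secondary: the observer still runs but sees no
// annotation, and the applier itself writes the pre-image when data is consistent.
void applyOplogUpdate(bool isDataConsistent, const std::string& preImage,
                      PreImageStore& store) {
    UpdateArgs args;  // annotation deliberately left unset
    observerOnUpdate(args, preImage, store);
    if (isDataConsistent) {
        store.docs.push_back(preImage);  // explicit write by oplog application
    }
}
```

With this shape the consistency state never has to be exposed to the observer: the primary path carries the decision on the arguments, and the oplog application path makes the decision where isDataConsistent is already known.
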