[SERVER-80791] Potential data consistency issue with implicitly replicated collections Created: 06/Sep/23  Updated: 10/Jan/24  Resolved: 10/Jan/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Jordi Olivares Provencio
Assignee: Wei Hu
Resolution: Gone away
Votes: 0
Labels: repl-shortlist
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
  related to SERVER-84259: Cleaner interface for internal collec... (Backlog)
  is related to SERVER-67962: Applying config.image_collection dele... (Closed)
Assigned Teams:
Replication
Operating System: ALL
Sprint: Execution Team 2023-12-11, Execution Team 2023-12-25, Execution Team 2024-01-08, Execution Team 2024-01-22
Participants:
Linked BF Score: 5

 Description   

Implicitly replicated collections are a set of internal collections that participate in oplog replication only partially: only their deletes are replicated. Inserts into these collections happen implicitly, as a side effect of other writes, potentially to different collections.

This behaviour, however, can result in inconsistency during secondary oplog application. Consider the following scenario:

  • T1 writes to collA, which implicitly inserts a document into impA (an implicitly replicated collection)
  • T2 deletes the document that T1 inserted into impA

In this case, T1 and T2 may be applied in parallel during oplog application on the secondary if they are part of the same batch. The delete might therefore be applied before the insert, in which case it finds no document to remove. As a result, the document will not exist on the primary but will survive on the secondary.
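
The scenario above can be reproduced with a small toy model. This is only a sketch and not server code: collections are modelled as Python sets, the function names are hypothetical, and it assumes that deleting a missing document is a silent no-op. Parallel application within a batch is represented by simply trying the reversed order.

```python
def apply_write_to_collA(state, doc_id):
    """T1: a write to collA that implicitly inserts into impA."""
    state["collA"].add(doc_id)
    # Implicit insert: it has no oplog entry of its own (modelling assumption).
    state["impA"].add(doc_id)


def apply_delete_from_impA(state, doc_id):
    """T2: a replicated delete on impA. Deleting a missing document is a no-op."""
    state["impA"].discard(doc_id)


def run(ops):
    """Apply the operations in the given order to an empty node and return its state."""
    state = {"collA": set(), "impA": set()}
    for op in ops:
        op(state, 1)
    return state


t1, t2 = apply_write_to_collA, apply_delete_from_impA

primary = run([t1, t2])              # primary executes T1, then T2
secondary_reordered = run([t2, t1])  # one possible interleaving within a batch

print(primary["impA"])               # set()
print(secondary_reordered["impA"])   # {1}
```

In the reordered run the delete finds nothing to remove, so the implicitly inserted document remains on the secondary only.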



 Comments   
Comment by Judah Schvimer [ 11/Sep/23 ]

This appears to be a generalization of the problem in SERVER-67962.

Comment by Jordi Olivares Provencio [ 06/Sep/23 ]

One potential solution could be to detect that an oplog entry targets an implicitly replicated collection and process it individually in its own batch. We already seem to be doing something like this for the namespaces for which NamespaceString::mustBeAppliedInOwnOplogBatch returns true.
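
A rough sketch of that batching idea, assuming batches are applied one after another and only entries within a batch run in parallel. Everything here is hypothetical and for illustration only: the `internal.impA` namespace, the `must_be_applied_in_own_batch` helper (loosely modelled on the C++ method named above), the entry layout, and the batch size limit.

```python
# Hypothetical set of implicitly replicated namespaces.
IMPLICITLY_REPLICATED = {"internal.impA"}


def must_be_applied_in_own_batch(ns):
    """Hypothetical predicate: True if entries on this namespace must be isolated."""
    return ns in IMPLICITLY_REPLICATED


def split_into_batches(entries, max_batch_size=3):
    """Group oplog entries into batches, isolating entries that must run alone."""
    batches, current = [], []
    for entry in entries:
        if must_be_applied_in_own_batch(entry["ns"]):
            # Close the batch in progress so earlier writes finish first.
            if current:
                batches.append(current)
                current = []
            batches.append([entry])
        else:
            current.append(entry)
            if len(current) == max_batch_size:
                batches.append(current)
                current = []
    if current:
        batches.append(current)
    return batches


oplog = [
    {"ts": 1, "op": "i", "ns": "test.collA"},     # T1: implicitly inserts into impA
    {"ts": 2, "op": "d", "ns": "internal.impA"},  # T2: replicated delete
    {"ts": 3, "op": "i", "ns": "test.collB"},
]

batches = split_into_batches(oplog)
print([[e["ts"] for e in b] for b in batches])  # [[1], [2], [3]]
```

Because the delete sits alone in its batch, T1 has been fully applied before it runs. The cost is reduced parallelism whenever such entries are frequent.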

Generated at Thu Feb 08 06:44:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.