[SERVER-72623] Add a server comment in tenant oplog applier for shard merge oplog application mode & isDataConsistent true value. Created: 09/Jan/23  Updated: 29/Oct/23  Resolved: 08/Mar/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Suganthi Mani
Resolution: Fixed Votes: 0
Labels: shard-merge-milestone-3
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-72622 Resuming tenant oplog applier due to ... Closed
Assigned Teams:
Serverless
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Server Serverless 2023-03-20
Participants:

 Description   

When a recipient failover happens in the middle of tenant oplog application, we might resume applying donor oplog entries from a point that was already applied. This happens because, for a given batch, we first apply the donor oplog entries and only then write the no-op donor oplog entries, and we calculate the resume point by finding the latest no-op oplog entry. With the stricter mode 'kSecondary', re-applying those donor oplog entries can fail and cause the shard merge to fail unnecessarily. So, change the tenant oplog application mode to kInitialSync (lenient mode).
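The failure sequence described above can be sketched with a toy model. This is not the server's C++ implementation: every name here (`apply_insert`, `apply_batch`, `resume_point`, `run_with_failover`, the `lenient` flag) is hypothetical, and the strict/lenient behavior only roughly stands in for 'kSecondary' and kInitialSync, assuming the strict mode rejects an insert of a document that already exists while the lenient mode tolerates it.

```python
class DuplicateKeyError(Exception):
    """Raised by the toy strict mode when an insert is applied twice."""


def apply_insert(collection, entry, lenient):
    """Apply one donor insert to the toy recipient collection."""
    key = entry["_id"]
    if key in collection and not lenient:
        # Strict mode (assumed stand-in for 'kSecondary'): re-applying fails.
        raise DuplicateKeyError(key)
    # Lenient mode (assumed stand-in for kInitialSync): overwrite silently.
    collection[key] = entry["doc"]


def apply_batch(collection, noop_log, batch, lenient, crash_before_noops=False):
    # Step 1: apply the donor oplog entries of the batch.
    for entry in batch:
        apply_insert(collection, entry, lenient)
    if crash_before_noops:
        return  # simulated recipient failover between step 1 and step 2
    # Step 2: write the no-op entries that record progress for the batch.
    for entry in batch:
        noop_log.append(entry["ts"])


def resume_point(noop_log):
    """The resume point is derived from the latest no-op entry."""
    return max(noop_log, default=0)


def run_with_failover(lenient):
    donor_oplog = [{"ts": t, "_id": t, "doc": {"v": t}} for t in (1, 2, 3, 4)]
    collection, noop_log = {}, []
    apply_batch(collection, noop_log, donor_oplog[:2], lenient)
    # The second batch is applied, but the failover hits before its no-ops.
    apply_batch(collection, noop_log, donor_oplog[2:], lenient,
                crash_before_noops=True)
    # On resume, entries 3 and 4 are applied a second time.
    start = resume_point(noop_log)
    remaining = [e for e in donor_oplog if e["ts"] > start]
    apply_batch(collection, noop_log, remaining, lenient)
    return collection, noop_log
```

In this model, `run_with_failover(lenient=False)` raises `DuplicateKeyError` on the re-applied batch, while `run_with_failover(lenient=True)` completes with all four documents present and all four no-ops recorded.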



 Comments   
Comment by Githook User [ 08/Mar/23 ]

Author: Suganthi Mani <suganthi.mani@mongodb.com> (smani87)

Message: SERVER-72623 Update the comment in tenant oplog applier about the data consistency.
Branch: master
https://github.com/mongodb/mongo/commit/b10e48f30d9e5833c4103cfa368dd1b907c389dd

Comment by Suganthi Mani [ 03/Mar/23 ]

Shard Merge is not robust to donor/recipient failovers, restarts and rollbacks (see the design doc).

No fix is required. To give clarity to future readers, just add a comment here on why it's ok to set isDataConsistent to true.

Comment by Suganthi Mani [ 01/Feb/23 ]

Note: As part of the ticket, `isDataConsistent` should be set to false, as data can be inconsistent on recipient failover during the oplog catchup phase, where we apply oplog entries in parallel. The side effect of setting `isDataConsistent` to true is that we might record pre/post images for change streams and rFAM incorrectly (see SERVER-69001).

Generated at Thu Feb 08 06:22:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.