[SERVER-72622] Resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries. Created: 09/Jan/23 Updated: 15/Jun/23 Resolved: 15/Jun/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | Christopher Caplinger |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||
| Sprint: | Server Serverless 2023-02-06, Server Serverless 2023-02-20, Server Serverless 2023-03-06, Server Serverless 2023-03-20, Server Serverless 2023-04-03, Server Serverless 2023-04-17, Server Serverless 2023-05-01 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Linked BF Score: | 173 | ||||||||||||||||||||
| Description |
|
Tenant oplog applier first apply writes and then write no-ops for each donor oplog entries in a given oplog batch and then these no-ops are written in parallel using the writerpool threads. And to calculate the resume point on recipient failover, we traverse backwards through the oplog collection and find the most recent no-op oplog entry from the current migration. Due to this code logic, resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries. The consequence of this would be |
| Comments |
| Comment by Didier Nadeau [ 17/Apr/23 ] |
|
Due to the complexity of the fix and the side-effects faced while working on it, we decided to abort Tenant Migration in case of recipient failovers |
| Comment by Githook User [ 10/Apr/23 ] |
|
Author: {'name': 'Christopher Caplinger', 'email': 'christopher.caplinger@mongodb.com', 'username': 'UnicodeSnowman'}Message: Revert " This reverts commit 636ad08bdb6fb5c9ad98ee4cdde8f52929c29830. |
| Comment by Githook User [ 08/Apr/23 ] |
|
Author: {'name': 'Christopher Caplinger', 'email': 'christopher.caplinger@mongodb.com', 'username': 'UnicodeSnowman'}Message: |
| Comment by Githook User [ 28/Mar/23 ] |
|
Author: {'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com', 'username': 'smani87'}Message: Revert " This reverts commit 3c130a69eaddc7cb44895f57af4da6e39556dbb4. |
| Comment by Suganthi Mani [ 28/Mar/23 ] |
|
Reverting the commit due to build failure. Further inspecting this commit changes, it can lead to potential deadlocks and perf regression. So, will revert the commit and rework on the changes to address those problems. Writing down the problems in detail with the commit 1) The root cause of the BF is that with the new change, it is possible to reapply retryable writes and transaction donor oplog entries on recipient failovers with a transaction number older than the current active transaction number for a given session. We handled it for statementIds but we didn't handle it for transaction numbers already executed case. As a result, the test was throwing ErrorCodes::TransactionTooOld. 2) Potential places where we can get into 2-way deadlock between stepdown and TenantOplogApplier startup are, this [1] and this [2] line in the tenant oplog applier code (see 3) Now with the changes, we are checking out the session which reads oplog from disk to load retryable chain and this session gets invalidated at the end of each batch. This means, every time a batch should read from disk to load the retryable write oplog chain while checking out session for a given session. This would lead to potential perf regression in tenant migration. Additionally, I noted a minor change that on FCV 6.3 and Mongodb binary, we scan oplog backwards to find the resume opTime even without recipient failovers. This is a change in behavior from the original where we try to find resume opTime only when the resume phase is `ResumePhase::kOplogCatchup` (Update: Uploaded a diff patch to address problem 1 & 3 with base hash commit as 00e9ca719a7ef980e756ba27ba21dec44d69cf0f) |
| Comment by Githook User [ 21/Mar/23 ] |
|
Author: {'name': 'Christopher Caplinger', 'email': 'christopher.caplinger@mongodb.com', 'username': 'UnicodeSnowman'}Message: |