[SERVER-61666] Tenant migration fails to fetch all txn oplog entries for a txn with commit opTime equal to startFetchingDonorOpTime Created: 19/Nov/21  Updated: 29/Oct/23  Resolved: 18/Dec/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.3.0-rc0

Type: Task Priority: Major - P3
Reporter: Janna Golden Assignee: A. Jesse Jiryu Davis
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File repro_tenant_migrations_no_such_key.js    
Issue Links:
Backports
Depends
Problem/Incident
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.2, v5.1
Participants:
Linked BF Score: 183

 Description   

When the recipient gets the start fetch time from a donor during a tenant migration, if the last oplog entry on the donor is a commit transaction entry and that transaction spanned multiple oplog entries, the recipient will not fetch the transaction's previous oplog entries. Currently, if a transaction commits before the start fetch op time, we walk its oplog chain to fetch all oplog entries in the transaction. For any transactions that haven't yet committed, we change the start fetch op time to be equal to the transaction's startOpTime, to make sure we fetch all oplog entries for any open transactions. However, if a transaction commits at exactly the start fetch op time, we skip fetching its chain entirely. When applying oplog entries, the recipient will attempt to apply the commit transaction entry, but won't have fetched any previous oplog entries for the transaction and will fail with NoSuchKey, aborting the transaction.
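
The three cases above can be illustrated with a toy model. This is only a sketch of the behavior as described in this ticket, not the server's code: opTimes are plain integers, and the names Txn, plan_fetch and missing_chains are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Txn:
    name: str
    start_op_time: int             # opTime of the transaction's first oplog entry
    commit_op_time: Optional[int]  # None while the transaction is still open


def plan_fetch(txns: List[Txn], start_fetching: int) -> Tuple[int, List[str]]:
    """Return (adjusted start fetch opTime, names of txns whose chain is walked)."""
    adjusted = start_fetching
    walked = []
    for txn in txns:
        if txn.commit_op_time is None:
            # Open transaction: move the start fetch time back to its startOpTime.
            adjusted = min(adjusted, txn.start_op_time)
        elif txn.commit_op_time < start_fetching:
            # Committed strictly before the start fetch time: walk its oplog chain.
            walked.append(txn.name)
        # commit_op_time == start_fetching matches neither branch (the bug).
    return adjusted, walked


def missing_chains(txns: List[Txn], start_fetching: int) -> List[str]:
    """Committed txns whose earlier entries are neither walked nor in the fetch range."""
    adjusted, walked = plan_fetch(txns, start_fetching)
    return [
        t.name
        for t in txns
        if t.commit_op_time is not None
        and t.name not in walked
        and t.start_op_time < adjusted
    ]


# Txn B spans opTimes 4..6 and commits at exactly the start fetch opTime (6).
txns = [Txn("A", 1, 3), Txn("B", 4, 6)]
print(missing_chains(txns, 6))  # prints ['B']
```

In this model, transaction B's commit entry is fetched but its earlier entries are not, which is the state that leads to NoSuchKey when the commit entry is applied.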

A repro is attached.



 Comments   
Comment by Githook User [ 23/Dec/21 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-61666 Exclude migration test from multiversion
Branch: v5.2
https://github.com/mongodb/mongo/commit/4492e86c8bbff4098c929b613a53f1029e3bfda0

Comment by Githook User [ 18/Dec/21 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-61666 Fix migration for txns before startApplying
Branch: master
https://github.com/mongodb/mongo/commit/d886f5d13232ab479dc73f24f5c80f2422326f3e

Comment by A. Jesse Jiryu Davis [ 17/Dec/21 ]

Previous attempt was reverted because I forgot to add the new test to backports_required_for_multiversion_tests.yml.

Comment by Githook User [ 15/Dec/21 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: Revert "SERVER-61666 Fix migration for txns before startApplying"

This reverts commit ea6a59377c01ed48157557aaaae0bd8191b7fa4e.
Branch: master
https://github.com/mongodb/mongo/commit/57db2b83401812f2899a4c02bd896e1c8b6ab046

Comment by Githook User [ 06/Dec/21 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-61666 Fix migration for txns before startApplying
Branch: master
https://github.com/mongodb/mongo/commit/ea6a59377c01ed48157557aaaae0bd8191b7fa4e

Comment by Esha Maharishi (Inactive) [ 30/Nov/21 ]

jesse, I believe we said that if the fix is involved, merge would not have this problem, and the bug only causes a tenant migration to abort (it does not cause data corruption), then we could do a quick fix to prevent the BF. However, I'm not sure how we could do a quick fix, since the issue can happen in any test in tenant_migration_multi_stmt_txn_jscore_passthrough. Therefore, I think we need to implement an actual fix.

Comment by A. Jesse Jiryu Davis [ 29/Nov/21 ]

suganthi.mani could you please attach your repro for initial sync? It would be useful to test that we don't break initial sync while fixing tenant migrations.

esha.maharishi can you please remind me what we decided about this ticket? We considered waiting to see if PM-2353 would fix it without effort, but I think we decided we must fix this now, because .... ?

Comment by Suganthi Mani [ 29/Nov/21 ]

Adding some more insight for future reference: we use the same logic for initial sync to calculate the begin (start) fetching and begin (start) applying timestamps, so this should be a problem for initial sync as well. I wrote a quick repro simulating the scenario described here for initial sync, but initial sync didn't hit the problem. Looking closely at the initial sync code revealed that this check prevented initial sync from hitting the NoSuchKey error.

The rule for both initial sync and tenant migration is that we should start replaying oplog entries from Timestamp > begin(start) applying Timestamp.

In initial sync, the oplog applier expands unprepared commit transaction oplog entries after the apply timestamp check. In tenant migration, however, the apply timestamp check is in the tenant oplog applier, but the expansion happens much earlier, during the oplog batching stage, and that is not correct. (Note that we can end up with a scenario like this: Txn1 start opTime < startFetchingOpTime < Txn1 (unprepared) commit opTime < startApplyingOpTime, and we would hit the same problem.) So I think the tenant oplog batcher should only expand (unprepared commit) oplog entries with opTime > StartApplyingDonorOpTime.
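
A rough sketch of the proposed rule, under simplifying assumptions: opTimes are integers, the fetched oplog is a dict keyed by opTime, and NoSuchKey, expand_commit and batch are hypothetical names, not the server's tenant oplog batcher API.

```python
class NoSuchKey(Exception):
    """Stand-in for the server's NoSuchKey error."""


def expand_commit(commit, fetched):
    """Walk the commit entry's chain backwards and return its ops in order."""
    ops = []
    prev = commit["prev"]
    while prev is not None:
        if prev not in fetched:
            raise NoSuchKey(f"oplog entry at opTime {prev} was never fetched")
        ops.append(fetched[prev]["op"])
        prev = fetched[prev]["prev"]
    return ops[::-1]


def batch(fetched, start_applying, guard):
    """Expand commit entries; with guard=True, skip those at or before start_applying."""
    out = []
    for op_time in sorted(fetched):
        entry = fetched[op_time]
        if entry["kind"] != "commit":
            continue  # partial txn entries are emitted when their commit is expanded
        if guard and op_time <= start_applying:
            continue  # would not be applied anyway, so do not expand it
        out.extend(expand_commit(entry, fetched))
    return out


# startFetchingOpTime = 5, startApplyingOpTime = 10.
# Txn1 started at opTime 3 (not fetched) and committed at 7.
# Txn2 started at opTime 11 and committed at 12.
fetched = {
    7: {"kind": "commit", "prev": 3, "op": None},
    11: {"kind": "partial", "prev": None, "op": "insert y"},
    12: {"kind": "commit", "prev": 11, "op": None},
}

print(batch(fetched, start_applying=10, guard=True))  # prints ['insert y']
try:
    batch(fetched, start_applying=10, guard=False)
except NoSuchKey as exc:
    print("without the guard:", exc)
```

With the guard, Txn1's commit at opTime 7 is never expanded, so its unfetched entry at opTime 3 is never looked up; without it, the expansion fails in the batching stage.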

Generated at Thu Feb 08 05:53:01 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.