[SERVER-55353] Tenant migration recipient can fail to find pre-image or post-image for oplog entry in its tenant migration oplog buffer
Created: 19/Mar/21  Updated: 29/Oct/23  Resolved: 28/Apr/21

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.9.0-rc1, 5.0.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Cheahuychou Mao
Assignee: Lingzhi Deng
Resolution: Fixed
Votes: 0
Labels: pm-1791_non-cloud-blocking, post-rc0
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-52713 [testing] Add stepdown/kill/terminate... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.9
Sprint: Repl 2021-04-19, Repl 2021-05-03
Participants:

 Description   

Based on the Evergreen (EVG) patch builds for SERVER-52713, the tenant migration recipient can fail to find the pre-image or post-image for an oplog entry in its oplog buffer collection (config.repl.migration.oplog_<migrationId>), causing the migration to abort with a NoSuchKey error:

[j1:rs1:n0] | 2021-03-19T01:36:30.452+00:00 D1 TENANT_M 5351004 [TenantMigrationRecipientService-3] "Tenant Oplog Batcher reordering pre- or post- image for oplog entry","attr":{"opTime":{"ts":{"$timestamp":{"t":1616117790,"i":28}},"t":2},"imageOpTime":{"ts":{"$timestamp":{"t":1616117790,"i":27}},"t":2}}
[j1:rs1:n0] | 2021-03-19T01:36:30.453+00:00 I  REPL     4878501 [TenantMigrationRecipientService-3] "Tenant migration recipient instance: Data sync completed.","attr":{"tenantId":"tenantMigrationTenantId","migrationId":{"uuid":{"$uuid":"81ad6fb6-afa2-4567-a156-dbe6428605fe"}},"error":{"code":4,"codeName":"NoSuchKey","errmsg":"No document found with _id: _id: { ts: Timestamp(1616117790, 27) } in namespace config.repl.migration.oplog_81ad6fb6-afa2-4567-a156-dbe6428605fe"}}
buildscripts.resmokelib.errors.ServerFailure: Tenant migration with donor replica set 'rs0' aborted due to an error: {'state': 'aborted', 'abortReason': {'code': 4, 'codeName': 'NoSuchKey', 'errmsg': 'Tenant migration recipient command failed :: caused by :: No document found with _id: _id: { ts: Timestamp(1616117790, 27) } in namespace config.repl.migration.oplog_81ad6fb6-afa2-4567-a156-dbe6428605fe'}, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1616117790, 48), 'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 'keyId': 0}}, 'operationTime': Timestamp(1616117790, 48)}
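
To illustrate the failure mode in the logs above and the behavior named in the fix's commit message ("Ignore pre/post image when tenant oplog batcher can't find one"), here is a minimal Python sketch. It is not the server's C++ implementation: the function `attach_images`, the `imageOpTime` key on an entry, and the dict-based buffer are all hypothetical stand-ins chosen for illustration. It only shows the idea of skipping a missing image instead of failing the whole batch.

```python
from typing import Dict, List, Tuple

# (seconds, increment), mirroring {"t": ..., "i": ...} in the log lines.
Timestamp = Tuple[int, int]


def attach_images(entries: List[dict], buffer: Dict[Timestamp, dict]) -> List[dict]:
    """Build a batch in which each pre/post image precedes the entry that needs it.

    `entries` stands in for the oplog entries being batched and `buffer` for the
    oplog buffer collection keyed by timestamp. Both are assumptions for this
    sketch, not the real data structures.
    """
    batch = []
    for entry in entries:
        image_ts = entry.get("imageOpTime")
        if image_ts is not None:
            image = buffer.get(image_ts)
            if image is not None:
                # Image found: reorder it directly before the entry.
                batch.append(image)
            # Image not found: ignore it rather than raising a
            # NoSuchKey-style error that would abort the migration.
        batch.append(entry)
    return batch
```

With the timestamps from the log, an entry at (1616117790, 28) whose image at (1616117790, 27) is absent from the buffer is still batched, just without its image.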



 Comments   
Comment by Githook User [ 28/Apr/21 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-55353: Ignore pre/post image when tenant oplog batcher can't find one

(cherry picked from commit 0464f0b7bcb2eee8dafbf108447013b4d586a16a)
Branch: v4.9
https://github.com/mongodb/mongo/commit/7bc3b87610f5abc2ee3e9900c955d1d19b3b46fd

Comment by Githook User [ 28/Apr/21 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-55353: Ignore pre/post image when tenant oplog batcher can't find one
Branch: master
https://github.com/mongodb/mongo/commit/0464f0b7bcb2eee8dafbf108447013b4d586a16a
