Consider the following scenario:
- We start migrating tenant X
- The migration sets a start timestamp of TS(100)
- When the tenant cloners complete, the last write on the donor for tenant X is at TS(90), while the last write for tenant Y is at TS(150)
- TS(150) is the read concern majority optime on the donor, and thus is the 'lastVisibleOpTime' that the recipient receives. The recipient therefore sets its 'stopTimestamp' to TS(150)
- The last oplog entry fetched on the recipient is at TS(90)
As a result, the recipient will never apply an oplog entry with a timestamp greater than or equal to TS(150), and thus will never consider itself consistent.
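The stall in the scenario above can be sketched as follows. This is an illustrative model, not actual server code; the function name and timestamps are hypothetical stand-ins for the recipient's consistency check.

```python
# Hypothetical model of the recipient's consistency check: the recipient
# declares itself consistent only once it has applied an oplog entry with
# a timestamp at or beyond its stopTimestamp.
def recipient_is_consistent(applied_through_ts, stop_ts):
    return applied_through_ts >= stop_ts

stop_ts = 150          # stopTimestamp, taken from the donor's lastVisibleOpTime
last_tenant_x_ts = 90  # highest tenant-X oplog entry the recipient ever fetches

# Without noop entries, the applier's applied-through timestamp is capped
# at TS(90), so the recipient stalls forever short of consistency:
assert not recipient_is_consistent(last_tenant_x_ts, stop_ts)
```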
To fix this, we make sure that the tenant oplog applier writes a noop oplog entry into its oplog buffer whenever it receives a batch. We must be careful, however, that this noop's timestamp is not too high. If the recipient wrote the 'lastVisibleOpTime' as a noop, then a lagged recipient's noop could make it appear more up to date than it actually is. The correct value is the latest oplog timestamp the donor sees when running its oplog query. This is exactly what the TRACK_LATEST_OPLOG_TS query parameter includes in the query response, via the postBatchResumeToken.
We write these noops for empty batches as well: ignoring duplicate timestamps in the oplog buffer should be simple, and doing so ensures that on recovery the recipient does not need to rescan oplog entries it previously filtered out.
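The batch handling described above can be sketched like this. Again, this is a hedged illustration under assumed names (`push_batch`, a list-of-dicts buffer), not the server's actual oplog buffer API: each fetched batch is followed by a noop carrying the donor's postBatchResumeToken timestamp, and duplicate timestamps from empty batches are simply skipped.

```python
# Illustrative sketch: append a fetched batch to the oplog buffer, then
# write a noop at the donor's postBatchResumeToken timestamp, skipping
# the noop only when that timestamp is already present (e.g. after an
# empty batch that returned the same resume token).
def push_batch(buffer, batch, post_batch_resume_ts):
    buffer.extend(batch)
    if not buffer or buffer[-1]["ts"] < post_batch_resume_ts:
        buffer.append({"op": "n", "ts": post_batch_resume_ts})

buf = []
push_batch(buf, [{"op": "i", "ts": 90}], 150)  # real batch, noop written at TS(150)
push_batch(buf, [], 150)                       # empty batch, duplicate noop skipped
```

With the noop in the buffer, the applier's applied-through timestamp can advance to TS(150), so the recipient reaches its stopTimestamp and becomes consistent even though tenant X's last real write was at TS(90).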
Resharding faces an analogous problem, but solves it in aggregation, since it uses aggregation rather than find commands. We must correctly expose this resume token for find commands in SERVER-51227, and then write and process the noops in this ticket.