[SERVER-49941] Handle writes (updates and removes) during tenant migrations that do not generate oplog entries because they were a no-op Created: 28/Jul/20  Updated: 29/Oct/23  Resolved: 17/Mar/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Task Priority: Major - P3
Reporter: Cheahuychou Mao Assignee: Jack Mulrow
Resolution: Fixed Votes: 0
Labels: pm-1791_milestone-A
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: Sharding 2021-01-25, Sharding 2021-02-22, Sharding 2021-03-08, Sharding 2021-03-22
Participants:

 Description   

If a write does not generate an oplog entry, it must be because every earlier write that this write depends on is already reflected in the data (lingzhi.deng, could you confirm?).

This means any writes this write depends on must have been assigned an OpTime earlier than blockTimestamp, at least on this primary's branch of history.

If the write came with w:majority, it will wait for the system's last OpTime to be majority committed, which must include blockTimestamp. So, if the migration commits, the write is guaranteed to be reflected on the recipient.

If the write came with w < majority, the write is not guaranteed to be reflected in the database anyway, so it is ok if it is not reflected on the recipient.

So, there should be no need to do anything special for such writes, though jack.mulrow, it may be good to confirm that causal consistency will be respected for such writes.
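The write-concern argument above can be sketched as a small decision function. This is only a toy model of the reasoning, not server code: the `NoopWrite` type, its fields, the integer OpTimes, and `guaranteed_on_recipient` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoopWrite:
    # Hypothetical field: latest OpTime of any earlier write this no-op depends on.
    dependency_optime: int
    # True when the client asked for w:"majority".
    majority: bool


def guaranteed_on_recipient(write: NoopWrite, block_timestamp: int,
                            migration_committed: bool) -> bool:
    """Return True when the no-op's effect must be visible on the recipient."""
    # Premise from the description: the dependencies precede blockTimestamp.
    if write.dependency_optime >= block_timestamp:
        raise ValueError("dependencies of a no-op are assumed to precede blockTimestamp")
    # A w:majority write waits for an OpTime that includes blockTimestamp to be
    # majority committed, so a committed migration has copied the dependencies.
    # Below majority there is no durability guarantee to preserve.
    return write.majority and migration_committed
```

For example, `guaranteed_on_recipient(NoopWrite(5, True), 10, True)` is true, while the same write with `majority=False` carries no guarantee in this model.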



 Comments   
Comment by Githook User [ 17/Mar/21 ]

Author: Jack Mulrow <jack.mulrow@mongodb.com> (jsmulrow)

Message: SERVER-49941 Add core test for no-op writes
Branch: master
https://github.com/mongodb/mongo/commit/6cc9d99708e73f736c9af13ab4f04f0ef5594aff

Comment by Lingzhi Deng [ 11/Mar/21 ]

Just want to point out something very subtle here: it is not always correct to return lastApplied as the operationTime for no-op writes, because of the issue described in SERVER-39364. If a write becomes a no-op because of a concurrent write, lastApplied could be stale, because lastApplied is updated in an on-commit hook that runs after the data changes are made. In that case, we would instead hit this, which should return the client's lastOp, and that should be set to the timestamp of the last entry in the oplog. But yes, the operationTime should be inclusive of the earlier operation that performed the same modification.
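The race described in this comment can be illustrated with a toy model. It is a sketch under stated assumptions, not the server's implementation: `ToyPrimary`, its methods, and the integer timestamps are hypothetical, and the only behavior modeled is that lastApplied advances in a hook that runs after the data and oplog changes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyPrimary:
    oplog: List[int] = field(default_factory=list)  # timestamps of oplog entries
    last_applied: int = 0

    def write_data_and_oplog(self, ts: int) -> None:
        # The data change and its oplog entry become visible first.
        self.oplog.append(ts)

    def run_on_commit_hook(self) -> None:
        # lastApplied only catches up when the on-commit hook runs.
        if self.oplog:
            self.last_applied = self.oplog[-1]

    def last_applied_time(self) -> int:
        # Candidate operationTime taken from lastApplied; may be stale.
        return self.last_applied

    def top_of_oplog_time(self) -> int:
        # Candidate operationTime taken from the last oplog entry.
        return self.oplog[-1] if self.oplog else 0


primary = ToyPrimary()
primary.write_data_and_oplog(5)
primary.run_on_commit_hook()
primary.write_data_and_oplog(7)           # concurrent write; its hook has not run yet
stale = primary.last_applied_time()       # misses the concurrent write at 7
inclusive = primary.top_of_oplog_time()   # covers the concurrent write at 7
```

In this model, a no-op caused by the write at timestamp 7 would get a stale operationTime from lastApplied until the hook runs, whereas the top of the oplog already includes that write.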

Comment by Jack Mulrow [ 11/Mar/21 ]

As for respecting causal consistency, the operationTime in the response for a noop write will always be inclusive of the earlier operation that already performed its modification to the data (it would be set to the ReplicationCoordinator's last applied opTime here after executing), so any subsequent afterClusterTime reads should correctly interact with the migration like any other read with afterClusterTime.
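A minimal sketch of why an inclusive operationTime is enough for causal consistency, assuming integer timestamps and a hypothetical `read_after_cluster_time` helper (not a server or driver API): a node serves the read only once it has applied everything up to `after_cluster_time`, so the earlier modification is visible.

```python
from typing import Dict, Optional, Tuple


def read_after_cluster_time(oplog: Dict[int, Tuple[str, int]], last_applied: int,
                            after_cluster_time: int) -> Optional[Dict[str, int]]:
    """Return the visible documents, or None if the node has not caught up."""
    if last_applied < after_cluster_time:
        return None  # a real node would wait here instead of returning
    docs: Dict[str, int] = {}
    # Replay every write the node has applied, in timestamp order.
    for ts in sorted(oplog):
        if ts <= last_applied:
            key, value = oplog[ts]
            docs[key] = value
    return docs


# An earlier operation set x=1 at timestamp 3; a later identical write is a no-op
# whose operationTime is assumed to be inclusive of that earlier operation.
oplog = {3: ("x", 1)}
noop_operation_time = 3
caught_up = read_after_cluster_time(oplog, last_applied=3,
                                    after_cluster_time=noop_operation_time)
lagging = read_after_cluster_time(oplog, last_applied=2,
                                  after_cluster_time=noop_operation_time)
```

Here the caught-up node returns the document written by the earlier operation, and the lagging node cannot answer until it has applied timestamp 3.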

Comment by Lingzhi Deng [ 30/Nov/20 ]

I think so. If a user write becomes a no-op, it means the same modification to the data was already performed by an earlier operation.

Generated at Thu Feb 08 05:21:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.