[SERVER-78813] Commit point propagation fails indefinitely with exhaust cursors with null lastCommitted optime Created: 10/Jul/23  Updated: 19/Jan/24  Resolved: 07/Aug/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0, 7.0.1, 5.0.20, 6.0.9, 4.4.25

Type: Bug Priority: Major - P3
Reporter: Lingzhi Deng Assignee: Lingzhi Deng
Resolution: Fixed Votes: 0
Labels: repl-shortlist
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
causes SERVER-79885 Oplog fetching getMore should not set... Closed
Related
related to SERVER-58721 processReplSetInitiate does not set a... Closed
related to SERVER-53813 Avoid serving stale majority reads on... Open
related to SERVER-68514 Delay announcement of new primary unt... Backlog
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0, v6.0, v5.0, v4.4
Sprint: Repl 2023-07-24
Participants:
Case:
Linked BF Score: 70

 Description   

If a (syncing) node has a null lastCommitted optime when it issues its exhaust getMore request to its sync source, it will never receive empty oplog batches for commit point propagation. This is because we don't take into account or update the last known lastCommitted optime of an exhaust cursor if it's originally null (see here and here).

Normally, a syncing node would get its sync source's lastCommitted optime from the first oplog find request. But if the sync source node also had a null lastCommitted optime when the syncing node started to sync from it, then I think we could end up in the situation I mentioned above. The elected primary normally relies on the JournalFlusher to trigger the first lastCommitted calculation/update, and this could be delayed because the JournalFlusher runs asynchronously. This could happen after replSetInitiate on 4.4 (see SERVER-58721) or after node restart on all versions.
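
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's C++ code: optimes are modelled as (term, timestamp) tuples, null is modelled as None, and the names wants_empty_batch_before_fix and simulate_before_fix are hypothetical.

```python
from typing import List, Optional, Tuple

# Toy stand-in for an optime: (term, timestamp). None models a null optime.
OpTime = Tuple[int, int]


def wants_empty_batch_before_fix(
    last_known: Optional[OpTime], source_committed: Optional[OpTime]
) -> bool:
    """Simplified pre-fix check, as described in this ticket."""
    if last_known is None:
        # A null last-known commit point is neither compared nor updated,
        # so this cursor never qualifies for an empty batch.
        return False
    return source_committed is not None and source_committed > last_known


def simulate_before_fix(
    initial_last_known: Optional[OpTime],
    source_commit_points: List[Optional[OpTime]],
) -> int:
    """Count the empty batches a syncing node would receive in this model."""
    last_known = initial_last_known
    empty_batches = 0
    for committed in source_commit_points:
        if wants_empty_batch_before_fix(last_known, committed):
            empty_batches += 1
            last_known = committed
    return empty_batches
```

In this model, a cursor that starts with None receives no empty batches however far the sync source's commit point advances, while a cursor that starts with a non-null value is notified on every advance.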



 Comments   
Comment by Githook User [ 22/Aug/23 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-78813: Fix commit point propagation for exhaust oplog cursors

(cherry picked from commit 12f70e5fdff32cd733d1fde7a651bdfbae389e8b)
Branch: v7.0
https://github.com/mongodb/mongo/commit/9db2af46617eedca6bcf1d8c0851cfad04061f5c

Comment by Githook User [ 22/Aug/23 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-78813: Fix commit point propagation for exhaust oplog cursors

(cherry picked from commit 12f70e5fdff32cd733d1fde7a651bdfbae389e8b)
Branch: v4.4
https://github.com/mongodb/mongo/commit/203815eadec5c50407abc1edf6db35ad4c00dbef

Comment by Githook User [ 15/Aug/23 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-78813: Fix commit point propagation for exhaust oplog cursors

(cherry picked from commit 12f70e5fdff32cd733d1fde7a651bdfbae389e8b)
Branch: v5.0
https://github.com/mongodb/mongo/commit/54b53e8be245bd81ecbef6f69d6839d8a3b2908d

Comment by Githook User [ 07/Aug/23 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: Revert "SERVER-78813: Fix commit point propagation for exhaust oplog cursors"

This reverts commit 2f36b8afb61df115001b5c7b201d98a4a227fca4.
Branch: v5.0
https://github.com/mongodb/mongo/commit/9f727bde6c0ae42265f7d4e5e281df52acef1b2a

Comment by Githook User [ 31/Jul/23 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-78813: Fix commit point propagation for exhaust oplog cursors

(cherry picked from commit 12f70e5fdff32cd733d1fde7a651bdfbae389e8b)
Branch: v5.0
https://github.com/mongodb/mongo/commit/2f36b8afb61df115001b5c7b201d98a4a227fca4

Comment by Githook User [ 31/Jul/23 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-78813: Fix commit point propagation for exhaust oplog cursors

(cherry picked from commit 12f70e5fdff32cd733d1fde7a651bdfbae389e8b)
Branch: v6.0
https://github.com/mongodb/mongo/commit/fba5855b0147493c50806e92142ac559b923e3a8

Comment by Githook User [ 19/Jul/23 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-78813: Fix commit point propagation for exhaust oplog cursors
Branch: master
https://github.com/mongodb/mongo/commit/12f70e5fdff32cd733d1fde7a651bdfbae389e8b

Comment by Lingzhi Deng [ 13/Jul/23 ]

After a second look at this code, I found that it is actually possible for the sync source node to reset the exhaust cursor's lastKnownCommittedOpTime back to null if the sync source's lastCommittedOpTime is null. This means that even if the syncing node sends a non-null lastKnownCommittedOpTime to begin with, there is still a chance for the sync source node to mistakenly set it to null, after which the commit point propagation (via empty batches) between the two nodes is terminated.
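
A minimal sketch of this reset, assuming a toy cursor state and an update step that copies the sync source's current commit point without checking it for null. ExhaustCursorState and record_batch_before_fix are hypothetical names used for illustration only; the real logic lives in the server's C++ getMore path and may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Toy stand-in for an optime: (term, timestamp). None models a null optime.
OpTime = Tuple[int, int]


@dataclass
class ExhaustCursorState:
    last_known_committed: Optional[OpTime]


def record_batch_before_fix(
    cursor: ExhaustCursorState, source_committed: Optional[OpTime]
) -> None:
    """Assumed pre-fix update: only non-null cursors are updated, but the
    value written is not checked, so it may itself be null."""
    if cursor.last_known_committed is not None:
        cursor.last_known_committed = source_committed


cursor = ExhaustCursorState(last_known_committed=(1, 5))
record_batch_before_fix(cursor, None)    # source's commit point is null
record_batch_before_fix(cursor, (1, 6))  # too late: state is already null
```

Once the state has been overwritten with None, the null check prevents any later update, which matches the "terminated" propagation described in the comment.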

Comment by Lingzhi Deng [ 10/Jul/23 ]

One easy solution is to consider a null optime as the smallest lastCommitted optime and always update the last known committed optime for oplog exhaust cursors to be the commit point returned in the last batch. I think we checked for null initially only to differentiate external oplog queries from internal oplog fetching queries, so that we don't opt into commit propagation unnecessarily for external oplog queries in the absence of the last known committed optime. But I think we can make that differentiation based on the presence of the metadata "$replData" / "$oplogQueryData".
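
The proposal above can be sketched as follows. This is a toy model under stated assumptions rather than the committed fix: optimes are (term, timestamp) tuples, None models null, request metadata is a plain dict, and all function names are hypothetical. Only the metadata keys "$replData" and "$oplogQueryData" come from this ticket.

```python
from typing import Dict, List, Optional, Tuple

# Toy stand-in for an optime: (term, timestamp). None models a null optime.
OpTime = Tuple[int, int]

# Sorts below every real optime, so null acts as the smallest commit point.
NULL_KEY: OpTime = (-1, -1)


def sort_key(optime: Optional[OpTime]) -> OpTime:
    return NULL_KEY if optime is None else optime


def opts_into_propagation(metadata: Dict[str, object]) -> bool:
    # Distinguish internal oplog fetching from external oplog queries by
    # metadata presence instead of by a non-null last-known commit point.
    return "$replData" in metadata or "$oplogQueryData" in metadata


def wants_empty_batch(
    metadata: Dict[str, object],
    last_known: Optional[OpTime],
    source_committed: Optional[OpTime],
) -> bool:
    if not opts_into_propagation(metadata):
        return False
    return sort_key(source_committed) > sort_key(last_known)


def simulate_after_fix(
    metadata: Dict[str, object],
    initial_last_known: Optional[OpTime],
    commit_points_per_batch: List[Optional[OpTime]],
) -> int:
    """Each list element is the source's commit point when a batch returns."""
    last_known = initial_last_known
    empty_batches = 0
    for committed in commit_points_per_batch:
        if wants_empty_batch(metadata, last_known, committed):
            empty_batches += 1
        # Always adopt the commit point returned in the last batch.
        last_known = committed
    return empty_batches
```

In this model a cursor that starts with, or is reset to, a null commit point resumes receiving empty batches as soon as the sync source's commit point becomes non-null, while a query without the replication metadata never opts in.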

Generated at Thu Feb 08 06:39:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.