[SERVER-42219] Oplog buffer not always empty when primary exits drain mode Created: 12/Jul/19 Updated: 29/Oct/23 Resolved: 23/Aug/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.2.0-rc2 |
| Fix Version/s: | 4.2.1, 4.3.1, 4.0.17 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Samyukta Lanka | Assignee: | Siyuan Zhou |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.2, v4.0, v3.6 |
| Steps To Reproduce: | |
| Sprint: | Repl 2019-08-12, Repl 2019-08-26, Repl 2019-09-09 |
| Participants: | |
| Linked BF Score: | 68 |
| Description |
|
If a new primary is in drain mode and the thread getting the next batch from the oplog buffer is slow to run, it can exit drain mode prematurely here because it did not receive a new batch within 1 second. This is problematic because the oplog buffer could still contain oplog entries for the node to apply. Once the node exits drain mode, it writes an oplog entry in the new term. Since we don't stop the thread running oplog application when we exit drain mode, it could then get a new batch of oplog entries that precede the new-term oplog entry. When it tries to apply them, it will hit this fassert, because we cannot apply oplog entries that are before our lastApplied. |
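For illustration only, a minimal C++ sketch of the race (hypothetical names such as `BlockingQueue` and `popAllWithin`; this is not the server's actual oplog buffer or batching code): the oplog buffer already holds entries, but the batching thread is scheduled late, so the applier's 1-second wait for the next batch times out and it wrongly concludes the buffer is drained.

```cpp
// Hypothetical sketch of the race described above, not the actual server code.
// The oplog buffer already holds entries, but the batching thread is scheduled
// late, so the applier's 1-second wait for the next batch returns nothing.
#include <chrono>
#include <condition_variable>
#include <deque>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

using namespace std::chrono_literals;

// Simple blocking queue standing in for both the oplog buffer and the batch queue.
template <typename T>
class BlockingQueue {
public:
    void push(T v) {
        std::lock_guard<std::mutex> lk(_m);
        _q.push_back(std::move(v));
        _cv.notify_one();
    }
    // Returns everything currently queued, or an empty vector if nothing
    // arrives within `timeout`.
    std::vector<T> popAllWithin(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(_m);
        _cv.wait_for(lk, timeout, [&] { return !_q.empty(); });
        std::vector<T> out(_q.begin(), _q.end());
        _q.clear();
        return out;
    }
    bool empty() const {
        std::lock_guard<std::mutex> lk(_m);
        return _q.empty();
    }
private:
    mutable std::mutex _m;
    std::condition_variable _cv;
    std::deque<T> _q;
};

int main() {
    BlockingQueue<int> oplogBuffer;  // entries fetched before the election
    BlockingQueue<int> batchQueue;   // batches handed to the applier

    oplogBuffer.push(1);
    oplogBuffer.push(2);

    // Batching thread: moves buffer entries into the batch queue, but runs late.
    std::thread batcher([&] {
        std::this_thread::sleep_for(2s);  // simulate being scheduled late
        for (int e : oplogBuffer.popAllWithin(0ms))
            batchQueue.push(e);
    });

    // Applier in drain mode: waits up to 1 second for the next batch.
    auto batch = batchQueue.popAllWithin(1s);
    if (batch.empty()) {
        // Buggy conclusion: "no batch within 1s" is taken to mean "drained",
        // even though the oplog buffer still holds entries 1 and 2.
        std::cout << "exiting drain mode; oplog buffer empty? " << std::boolalpha
                  << oplogBuffer.empty() << "\n";  // prints: false
    }
    batcher.join();
}
```

Under these assumptions the sketch prints `false`: the node would leave drain mode while entries 1 and 2 are still waiting to be applied.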
| Comments |
| Comment by Githook User [ 26/Feb/20 ] |
|
Author: Siyuan Zhou (visualzhou) <siyuan.zhou@mongodb.com>. Message: (cherry picked from commit 883b10b38ddd7aa5b9a197688141ebf387292a07) |
| Comment by Githook User [ 26/Feb/20 ] |
|
Author: Siyuan Zhou (visualzhou) <siyuan.zhou@mongodb.com>. Message: Reverted a5cbd93aea (cherry picked from commit 89a6d7bc3a0126cf8bfd177ad65b233181641175) |
| Comment by Githook User [ 05/Sep/19 ] |
|
Author: Siyuan Zhou (visualzhou) <siyuan.zhou@mongodb.com>. Message: (cherry picked from commit 883b10b38ddd7aa5b9a197688141ebf387292a07) |
| Comment by Githook User [ 05/Sep/19 ] |
|
Author: Siyuan Zhou (visualzhou) <siyuan.zhou@mongodb.com>. Message: Reverted a5cbd93aea (cherry picked from commit 89a6d7bc3a0126cf8bfd177ad65b233181641175) |
| Comment by Githook User [ 23/Aug/19 ] |
|
Author: Siyuan Zhou (visualzhou) <siyuan.zhou@mongodb.com>. Message: |
| Comment by Githook User [ 23/Aug/19 ] |
|
Author: Siyuan Zhou (visualzhou) <siyuan.zhou@mongodb.com>. Message: Reverted a5cbd93aea |
| Comment by Siyuan Zhou [ 16/Jul/19 ] |
|
This was found in the "Flow Control" test suite, which slows down the node. The current logic assumes that if any data is available in the buffer, getNextBatch() will see it within a second. This isn't always true if the threads are scheduled late. |
| Comment by Samyukta Lanka [ 15/Jul/19 ] |
|
judah.schvimer, I think this has existed for a while, but recent test coverage has uncovered it. |
| Comment by Judah Schvimer [ 15/Jul/19 ] |
|
samy.lanka, is this a regression or has this existed for a while? |
| Comment by Samyukta Lanka [ 15/Jul/19 ] |
|
siyuan.zhou, sorry, the links should be fixed now. |
| Comment by Siyuan Zhou [ 13/Jul/19 ] |
|
samy.lanka, the links in "Steps To Reproduce" are broken. Could you please post or attach the patch to reproduce the bug? william.schultz, good point. I think we can address the problem by coming up with a better interface for getNextBatch(), which also solves |
| Comment by William Schultz (Inactive) [ 12/Jul/19 ] |
|
Interesting. This sounds related to the issue hypothesized in |
| Comment by Samyukta Lanka [ 12/Jul/19 ] |
|
One possible solution that siyuan.zhou and I discussed is to check both that the next batch is empty and that the oplog buffer is empty before exiting drain mode (a rough sketch follows below). This should be safe during drain mode because the replication producer thread is stopped when we enter drain mode, and once it is stopped, we ensure that no new oplog entries are added to the oplog buffer. |
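For illustration only, a rough sketch of that check, reusing the hypothetical `BlockingQueue` type from the sketch under the description (the function name and loop structure are assumptions, not the actual server code): an empty batch alone no longer ends drain mode; the oplog buffer must also be empty.

```cpp
// Hypothetical sketch of the proposed check, building on the BlockingQueue
// type from the earlier sketch. An empty batch by itself does not end drain
// mode; the oplog buffer must be empty as well. This is safe because the
// producer thread is stopped before drain mode begins, so the buffer can
// only shrink while we wait.
void applyUntilDrained(BlockingQueue<int>& batchQueue,
                       BlockingQueue<int>& oplogBuffer) {
    while (true) {
        auto batch = batchQueue.popAllWithin(std::chrono::seconds(1));
        if (batch.empty()) {
            if (oplogBuffer.empty())
                return;   // truly drained: safe to exit drain mode
            continue;     // entries remain in the buffer; keep waiting
        }
        for (int entry : batch) {
            (void)entry;  // placeholder for applying the oplog entry
        }
    }
}
```

The key point is that combining the two observations is only safe because the buffer cannot grow during drain mode; once the producer is stopped, an empty buffer plus an empty batch really does mean there is nothing left to apply.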