[SERVER-46516] Majority write concern is blocked by dropping snapshot on reconfig Created: 01/Mar/20 Updated: 06/Dec/22 Resolved: 08/Feb/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Siyuan Zhou | Assignee: | Backlog - Replication Team |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | safe-reconfig-related | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Assigned Teams: |
Replication
|
||||||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||||||
| Steps To Reproduce: | Revert the change of write concern for setFCV and the check of config content change before dropping snapshot in |
||||||||||||||||||||||||||||||||||||
| Sprint: | Repl 2020-03-23, Repl 2020-04-06 | ||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Linked BF Score: | 60 | ||||||||||||||||||||||||||||||||||||
| Description |
|
On reconfig, the primary drops all snapshots used to serve majority reads since the quorum may change by reconfig. The _currentCommittedSnapshot should be recalculated very soon, after a round of heartbeats in the worst case. However it's not the case. Majority writes are blocked until they are interrupted. We need to confirm that the _currentCommittedSnapshot is updated correctly after dropping snapshots and the waiters are signaled correctly. |
| Comments |
| Comment by Steven Vannelli [ 08/Feb/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Since we are doing | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Pavithra Vetriselvan [ 05/Feb/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Re-opening this ticket because We should revisit our contract when dropping snapshots and the assumptions we can make based off of that. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 30/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Closing in favor of | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 26/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Sure, I can move the noop write to doReplSetReconfig in order to address the TODO in _dropAllSnapshots_inlock is also called after rollback and when we drop replicated databases in initial sync. The call after rollback is no longer needed, but it hasn't caused a problem that I'm aware of. I think there's a larger problem that our language about snapshots is confusing. I proposed that in the pain point spreadsheet for cleanup. That's correct that we don't update _currentCommittedSnapshot on journal flush when running with journaling disabled, but we will still update it due to replSetUpdatePosition due to the noop write. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Siyuan Zhou [ 25/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thanks tess.avitabile for the detailed investigation! I'd propose to move the no-op write to doReplSetReconfig to make the behavior consistent. I'm wondering if this is a more general problem since _dropAllSnapshots_inlock is also called in other places. Another question is when _rsConfig.getWriteConcernMajorityShouldJournal() is false, we don't update _currentCommittedSnapshot on journal flush. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Judah Schvimer [ 25/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
There will be a third caller soon for removing 'newlyAdded' fields for ISS in | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 25/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I found the root cause for this issue, and it only affects internal callers of ReplicationCoordinator::doReplSetReconfig(). _currentCommittedSnapshot gets updated in two ways:
It is not updated by heartbeats. When the replSetReconfig command is run, after we drop snapshots, the command writes a noop oplog entry. This triggers a round of replSetUpdatePosition commands, which restores the _currentCommittedSnapshot. However, when doReplSetReconfig() is called internally, no oplog entry is written, so _currentCommittedSnapshot is unset until the next write. I don't believe this leads to any bugs. doReplSetReconfig() only has two internal callers:
Since there is no bug, I'm inclined to close this issue as Gone Away. Now that we understand the issue, we could file a ticket for code clean-up to change how we do the workarounds. siyuan.zhou, how does that sound to you? | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 25/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I can reproduce this issue with the following diff. The reconfig in the test is to ensure that the config term gets initialized, so that downgrading FCV needs to remove the config term. (See
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 25/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Ah, my guess is that it's because we don't need to do a reconfig on downgrade FCV if the term is uninitialized, and | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 25/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The issue was resolved by | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 25/Mar/20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Oddly, making the change in the Steps to Reproduce on the tip of master does not reproduce the bug, but if I reset back to this commit then make the change, the problem does reproduce. siyuan.zhou, do you know of any commits in the meantime that might have affected this behavior? |