[SERVER-37340] Make FCV upgrade use command oplog entry instead of observing writes to the FCV document Created: 27/Sep/18 Updated: 21/Feb/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Upgrade/Downgrade |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | pm-2821-quick-wins | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Replication |
| Description |
|
In the current FCV upgrade logic, the primary node in a replica set executes a command and has the opportunity to acquire strong locks for consistency. The secondaries, however, only get to observe writes to the FCV document. Those writes are executed while already holding a hierarchy of locks, so the secondaries are not allowed to take any further locks of their own because doing so risks introducing deadlocks. During 4.0 development we uncovered cases ( |
| Comments |
| Comment by Kaloian Manassiev [ 30/Jun/22 ] |
|
Since FCV is a macro operation that potentially impacts the entire instance, it should not be using OpObservers on documents, at the very least because of the locking problem. Because of this, I still believe there is merit in making it use 'c' oplog entries. Passing this ticket to the replication team, since they own the FCV infrastructure, to possibly be done under PM-2821. |
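For readers unfamiliar with the distinction, the following is a minimal Python sketch of the two oplog entry shapes being contrasted, not server code. The `plan_application` helper, the `can_take_strong_locks` flag, and the exact payload of the 'c' entry are assumptions made for illustration; the ticket does not specify what the command entry would look like.

```python
# Toy model only: the real logic lives in the server's C++ oplog application code.
FCV_NS = "admin.system.version"
FCV_ID = "featureCompatibilityVersion"


def is_fcv_document_write(entry):
    """True if the entry is an update ('u') targeting the FCV document."""
    return (
        entry.get("op") == "u"
        and entry.get("ns") == FCV_NS
        and entry.get("o2", {}).get("_id") == FCV_ID
    )


def plan_application(entry):
    """Classify how a secondary would react to an oplog entry (hypothetical helper)."""
    if entry.get("op") == "c":
        # A command entry is applied as an operation in its own right, so the
        # applier can choose which locks to take before doing any work.
        return {"path": "command", "can_take_strong_locks": True}
    if is_fcv_document_write(entry):
        # The observer fires inside a write that already holds locks, so it
        # cannot safely acquire stronger ones.
        return {"path": "op_observer", "can_take_strong_locks": False}
    return {"path": "ordinary_write", "can_take_strong_locks": False}


# Shape of the document write that secondaries observe today.
document_write = {
    "op": "u",
    "ns": FCV_NS,
    "o2": {"_id": FCV_ID},
    "o": {"$set": {"version": "4.0"}},
}

# Hypothetical 'c' entry; the command name and payload are assumed.
command_entry = {
    "op": "c",
    "ns": "admin.$cmd",
    "o": {"setFeatureCompatibilityVersion": "4.0"},
}

print(plan_application(document_write))
print(plan_application(command_entry))
```

The point of the sketch is only that the entry type determines where the secondary's FCV logic runs: inside an already-locked write, or as a standalone step that controls its own locking.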
| Comment by Max Hirschhorn [ 12/Nov/21 ] |
The changes from ff982a6 as part of I'm not sure this entirely solves the lock upgrade problem mentioned in the description, because there can still be readers on secondaries. Assigning this to Sharding EMEA because they did work on setFCV in MongoDB 5.0 and I'd like more input from Kal. |