[SERVER-48467] Handle quiesce mode in mixed version replica sets Created: 28/May/20 Updated: 29/Oct/23 Resolved: 26/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.7.0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Tess Avitabile (Inactive) | Assignee: | Pavithra Vetriselvan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Repl 2020-06-15, Repl 2020-06-29 |
| Participants: |
| Description |
|
Due to the findings in |
| Comments |
| Comment by Githook User [ 26/Jun/20 ] |
|
Author: Pavi Vetriselvan (pvselvan) <pvselvan@umich.edu> Message: |
| Comment by Evin Roesle [ 25/Jun/20 ] |
|
Makes sense to me. More uniform behavior makes it easier for users to understand when they should expect a certain behavior, so I am also all for that idea. |
| Comment by Pavithra Vetriselvan [ 25/Jun/20 ] |
|
Ah, got it. Thank you for explaining! I would also prefer to ignore the parameter. |
| Comment by Tess Avitabile (Inactive) [ 25/Jun/20 ] |
|
Yes, this is what I meant by "ignoring" the parameter: we'll just skip quiesce mode. Similar to the option of banning vs. ignoring the timeoutSecs parameter for the shutdown command, there's a question of whether to ban or ignore the shutdownTimeoutMillis parameter for mongos, but in this case "banning" would mean requiring that this parameter be 0. I prefer to ignore the parameter when FCV < 4.6. Does that make sense? |
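To make the "ban" vs. "ignore" distinction concrete, here is a minimal Python sketch. It is not the server's code: the function names and the `QUIESCE_MIN_FCV` constant are hypothetical, FCV is modeled as a (major, minor) pair, and it assumes the mongos shutdownTimeoutMillis value is what determines how long quiesce mode lasts.

```python
from typing import Tuple

# Hypothetical constant: quiesce mode is gated on FCV 4.6, per this ticket.
QUIESCE_MIN_FCV = (4, 6)


def quiesce_millis_ignoring(fcv: Tuple[int, int], shutdown_timeout_millis: int) -> int:
    """'Ignore' option: below FCV 4.6 the parameter is accepted but unused,
    so quiesce mode is simply skipped (0 ms)."""
    if fcv < QUIESCE_MIN_FCV:
        return 0
    return shutdown_timeout_millis


def quiesce_millis_banning(fcv: Tuple[int, int], shutdown_timeout_millis: int) -> int:
    """'Ban' option: below FCV 4.6 any nonzero value is rejected outright."""
    if fcv < QUIESCE_MIN_FCV and shutdown_timeout_millis != 0:
        raise ValueError("shutdownTimeoutMillis must be 0 when FCV < 4.6")
    return shutdown_timeout_millis
```

With the "ignore" option, a nonzero value under FCV 4.4 yields no quiesce period and no error, while the "ban" option would force the user to change the parameter before shutting down.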
| Comment by Pavithra Vetriselvan [ 24/Jun/20 ] |
|
That makes sense to me! I agree that skipping quiesce mode on mongos and mongod if FCV < 4.6 provides more uniform behavior. It looks like timeoutSecs and the server parameters will be unused by quiesce mode if we check FCV before entering quiesce mode on the server. Just double-checking that this is what you meant by "ignoring" the parameter. As you said, they would only be used for the stepdown timeout. I'm a little confused about what you mean by requiring/not requiring that the shutdownTimeoutMillis server parameters are 0, though. |
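As a rough illustration of checking FCV before entering quiesce mode, here is a minimal Python sketch. It is not the server's code: `shutdown_phases`, `parse_fcv`, and `QUIESCE_MIN_FCV` are hypothetical names, and it assumes an FCV string such as "4.4" can be compared as a (major, minor) pair. How the timeout is divided between phases when FCV >= 4.6 is deliberately not modeled.

```python
from typing import List, Tuple

# Hypothetical constant: quiesce mode is gated on FCV 4.6, per this ticket.
QUIESCE_MIN_FCV = (4, 6)


def parse_fcv(fcv: str) -> Tuple[int, int]:
    """Turn a string such as "4.4" into a comparable (major, minor) pair."""
    major, minor = fcv.split(".")
    return int(major), int(minor)


def shutdown_phases(fcv: str) -> List[str]:
    """Return the shutdown phases that run for a given FCV.

    Below FCV 4.6 quiesce mode is skipped, so timeoutSecs would only
    matter for the stepdown phase.
    """
    phases = ["stepdown"]
    if parse_fcv(fcv) >= QUIESCE_MIN_FCV:
        phases.append("quiesce")
    return phases
```

For example, `shutdown_phases("4.4")` contains only the stepdown phase, while `shutdown_phases("4.6")` also includes quiesce mode.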
| Comment by Tess Avitabile (Inactive) [ 24/Jun/20 ] |
|
Yes, FCV-gating the feature sounds good to me! It's not quite the case that we'll block users from using quiesce mode if they have FCV < 4.6, since quiesce mode happens by default. Instead, I would say the feature is turned off if FCV < 4.6. There are a few choices I think we need to make:
evin.roesle, I want to let you know about the above design choices for FCV-gating quiesce mode. Yes, those sound like the correct places to check FCV. Though there may be nothing to do for the cases of attaching topologyVersion, since inQuiesceMode() will return false. |
| Comment by Pavithra Vetriselvan [ 23/Jun/20 ] |
|
tess.avitabile Based on our conversation with Cloud, it seems like the simplest solution for everyone would be to FCV-gate quiesce mode to 4.6. Atlas doesn't allow mixed-version sets, and Cluster Manager/Ops Manager can block users from using quiesce mode if they have FCV < 4.6. After a quick run-through of the code we added for this project, the following places seem to be where we should check FCV:
Is there anything else that I'm missing? |