[SERVER-32729] Bump the replica set heartbeat version so a latest-FCV primary does not attempt to talk to a last-stable binary secondary Created: 16/Jan/18  Updated: 25/Jan/18  Resolved: 25/Jan/18

Status: Closed
Project: Core Server
Component/s: Upgrade/Downgrade
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Maria van Keulen Assignee: Dianna Hohensee (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done after SERVER-32412 Add featureCompatibilityVersion 4.0 t... Closed
Related
related to SERVER-32636 Close outgoing connections to servers... Closed
Sprint: Storage 2018-01-29
Participants:

 Description   

For the 3.6 release cycle, we bumped the replica set heartbeat version so a latest-FCV primary wouldn't think a last-stable binary version secondary was healthy (see SERVER-31631). The heartbeat version should be similarly bumped for 3.8.



 Comments   
Comment by Dianna Hohensee (Inactive) [ 25/Jan/18 ]

The temporary addition of a heartbeat version for v3.6 works because v3.4 validates that no unexpected fields are sent, so it rejects heartbeats that carry the new field. v3.6 nodes therefore do no heartbeat version checking; they simply send it. This won't work for v3.8, because v3.6 expects the field. Simply bumping the version will therefore not be sufficient: a check would need to be added.
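To illustrate the reasoning above, here is a minimal Python sketch, not the server's actual code. The field names (`replSetHeartbeat`, `configVersion`, `heartbeatVersion`), the known-field sets, and the `validate_heartbeat` helper are all hypothetical; the only behavior taken from this ticket is that v3.4 rejects unexpected fields while v3.6 knows the field but does not check its value.

```python
class UnexpectedFieldError(Exception):
    """Raised when a heartbeat carries a field the receiver does not know."""


# Hypothetical field sets, for illustration only.
V34_KNOWN_FIELDS = frozenset({"replSetHeartbeat", "configVersion"})
V36_KNOWN_FIELDS = V34_KNOWN_FIELDS | {"heartbeatVersion"}


def validate_heartbeat(request: dict, known_fields: frozenset) -> None:
    """Reject unknown fields; field values are deliberately not inspected."""
    unexpected = sorted(set(request) - known_fields)
    if unexpected:
        raise UnexpectedFieldError(f"unexpected field(s): {unexpected}")


def accepts(request: dict, known_fields: frozenset) -> bool:
    """Return True if a receiver with these known fields accepts the request."""
    try:
        validate_heartbeat(request, known_fields)
    except UnexpectedFieldError:
        return False
    return True


# Heartbeat as a v3.6 node might send it (hypothetical shape).
hb_from_v36 = {"replSetHeartbeat": "rs0", "configVersion": 1, "heartbeatVersion": 1}
# The same heartbeat with the version bumped, as a later release might send it.
hb_bumped = dict(hb_from_v36, heartbeatVersion=2)

v34_accepts_v36 = accepts(hb_from_v36, V34_KNOWN_FIELDS)
v36_accepts_bumped = accepts(hb_bumped, V36_KNOWN_FIELDS)
```

Under these assumptions `v34_accepts_v36` is false, because the extra field alone is enough for the older receiver to refuse the heartbeat, while `v36_accepts_bumped` is true, because nothing on the receiving side compares the value. That is why a version bump without an added check would have no effect.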

In light of this, we are not going to bump the version and add checks. We'll simply wait for SERVER-32636 to resolve this. I believe multiVersion/feature_compatibility_version_lagging_secondary.js is the only test for this. It still passes because it explicitly uses a v3.4 binary node with a latest-binary replica set. I am adding a note to SERVER-32636 to update the test so that it becomes relevant again, and closing this ticket.

Comment by Dianna Hohensee (Inactive) [ 22/Jan/18 ]

Must be done after SERVER-32412, so that we have FCV versions greater than 3.6 on which to use the new heartbeat version.

Comment by Dianna Hohensee (Inactive) [ 22/Jan/18 ]

Latest and last-stable binary nodes are unable to communicate due to internal wireVersion requirements. An FCV-upgraded latest binary will have a range of {LATEST_WIRE_VERSION, LATEST_WIRE_VERSION}, whereas a last-stable binary will have {LATEST_WIRE_VERSION - 1, LATEST_WIRE_VERSION - 1}. Because the two ranges do not overlap, the connection fails with an IncompatibleServerVersion error.
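The range comparison described here can be sketched in Python as follows. This is an illustration under stated assumptions rather than the server's implementation: the numeric value of `LATEST_WIRE_VERSION` is a placeholder, and `WireVersionRange`, `ranges_overlap`, and `check_compatibility` are hypothetical names. Only the two ranges and the `IncompatibleServerVersion` error name come from this ticket.

```python
from typing import NamedTuple

LATEST_WIRE_VERSION = 7  # placeholder value, chosen only for illustration


class IncompatibleServerVersion(Exception):
    """Stand-in for the server error of the same name."""


class WireVersionRange(NamedTuple):
    min_version: int
    max_version: int


def ranges_overlap(a: WireVersionRange, b: WireVersionRange) -> bool:
    """Two closed intervals overlap when each starts no later than the other ends."""
    return a.min_version <= b.max_version and b.min_version <= a.max_version


def check_compatibility(local: WireVersionRange, remote: WireVersionRange) -> None:
    """Raise if the two nodes share no wire version."""
    if not ranges_overlap(local, remote):
        raise IncompatibleServerVersion(
            f"no common wire version between {tuple(local)} and {tuple(remote)}"
        )


# FCV-upgraded latest binary: accepts only the latest wire version.
latest_upgraded = WireVersionRange(LATEST_WIRE_VERSION, LATEST_WIRE_VERSION)
# Last-stable binary: accepts only the previous wire version.
last_stable = WireVersionRange(LATEST_WIRE_VERSION - 1, LATEST_WIRE_VERSION - 1)

try:
    check_compatibility(latest_upgraded, last_stable)
    handshake_failed = False
except IncompatibleServerVersion:
    handshake_failed = True
```

With these two ranges `handshake_failed` ends up true, since neither interval contains a version the other accepts.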

The only situation where mixed version replica set members can continue to communicate is on connections established before FCV was upgraded on the latest binary nodes. We already track and close incoming connections from last-stable binaries on upgrade. SERVER-32636 will eventually close outgoing connections.

This ticket will be done to hold us over until SERVER-32636 is unblocked and completed.

Generated at Thu Feb 08 04:31:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.