[SERVER-45077] Add a new “term” field to the config document
Created: 12/Dec/19  Updated: 29/Oct/23  Resolved: 08/Jan/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.3.3

Type: Task
Priority: Major - P3
Reporter: Siyuan Zhou
Assignee: William Schultz (Inactive)
Resolution: Fixed
Votes: 0
Labels: safe-reconfig-consensus
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-45089 Make sure a missing “term” field is t... Closed
Related
related to SERVER-45092 Remove “term” field of config documen... Closed
is related to SERVER-45408 Enable serialization of config "term"... Closed
Backwards Compatibility: Fully Compatible
Sprint: Repl 2019-12-30, Repl 2020-01-13
Participants:

Comments
Comment by Githook User [ 07/Jan/20 ]

Author: William Schultz (will62794) <william.schultz@mongodb.com>

Message: SERVER-45077 Add a "term" field to ReplSetConfig
Branch: master
https://github.com/mongodb/mongo/commit/1a4fb8445254a73adabad98eafbb5a8fc0e2ae05

Comment by Siyuan Zhou [ 19/Dec/19 ]

"To fix this, 4.4 nodes in FCV=4.2 could omit the 'term' field in the configs they send in heartbeat responses."

FCV upgrade only needs to commit on a majority of nodes, so even if FCV=4.4, there might still be nodes with FCV 4.2. Maybe that's fine, since at that moment binVersion 4.2 nodes will fassert on learning FCV 4.4, so we could make 4.4 nodes with FCV 4.2 compatible with config terms.

Comment by William Schultz (Inactive) [ 19/Dec/19 ]

To solve the downgrade issue before SERVER-45092 is done, we could add a temporary failpoint that causes the server to always omit the term in configs. We could enable this failpoint in the few tests that exercise downgrade. Then, once SERVER-45092 is done, we would remove this failpoint and let the tests run normally again.
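
A minimal sketch of the failpoint idea above, written in Python for brevity rather than as server code. The failpoint name `omitConfigTerm`, the `failpoint` helper, and `serialize_config` are all hypothetical and only illustrate the shape of the workaround: while the failpoint is enabled, the serialized config never contains 'term'.

```python
from contextlib import contextmanager

# Names of currently enabled failpoints (illustrative registry, not the server's).
_enabled_failpoints = set()

OMIT_CONFIG_TERM = "omitConfigTerm"  # hypothetical failpoint name


@contextmanager
def failpoint(name):
    """Enable a failpoint for the duration of a test, then disable it again."""
    _enabled_failpoints.add(name)
    try:
        yield
    finally:
        _enabled_failpoints.discard(name)


def serialize_config(config):
    """Return the config document, dropping 'term' while the failpoint is on."""
    doc = dict(config)
    if OMIT_CONFIG_TERM in _enabled_failpoints:
        doc.pop("term", None)
    return doc
```

A downgrade test would wrap its body in `with failpoint(OMIT_CONFIG_TERM):` so that every config it produces stays parseable by a 4.2 binary. Removing the failpoint later only requires deleting the conditional.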

Comment by William Schultz (Inactive) [ 19/Dec/19 ]

While testing these initial changes I ran into a multiversion issue where 4.2 nodes reject configs with a 'term' field in them because they parse config objects strictly. The set of allowed top-level fields is explicitly defined here. We will fail to install a config sent to us in a heartbeat response when we attempt to parse it here. Note that this strict parsing behavior also causes problems for downgrade, but we already have plans to explicitly deal with that issue (SERVER-45092).
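
To illustrate why strict parsing rejects the new field, here is a small Python sketch. It is not the server's parser: the allowed-field set below is an assumed, abbreviated list chosen for illustration, and `parse_config_strict` and `ConfigParseError` are hypothetical names.

```python
# Assumed, abbreviated whitelist of top-level config fields known to an
# older binary. The real list lives in the server source.
ALLOWED_TOP_LEVEL_FIELDS = {"_id", "version", "members", "settings", "protocolVersion"}


class ConfigParseError(ValueError):
    """Raised when a config document contains an unrecognized top-level field."""


def parse_config_strict(doc):
    """Accept the document only if every top-level field is whitelisted."""
    unknown = sorted(set(doc) - ALLOWED_TOP_LEVEL_FIELDS)
    if unknown:
        raise ConfigParseError(f"unknown top-level config field(s): {unknown}")
    return dict(doc)
```

Under these assumptions, a config carrying 'term' fails to parse on the older node, so the node never installs it, even though every other field is valid.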

To fix this, 4.4 nodes in FCV=4.2 could omit the 'term' field in the configs they send in heartbeat responses. That way a binVersion=4.2 node could safely receive configs via heartbeats from a 4.4 node. Alternatively, we could consider increasing the _heartbeatVersion field in 4.4 and having 4.4 nodes omit the config 'term' field in their heartbeat responses if the heartbeat request is from a lower heartbeat version. This seems more invasive, though, and feels more or less equivalent to the FCV-based solution.
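
The FCV-based option could look roughly like the following Python sketch. The function name `config_for_heartbeat_response` and the representation of FCV as a plain string are assumptions made for illustration, not the server's actual interface.

```python
def config_for_heartbeat_response(config, fcv):
    """Build the config document to embed in a heartbeat response.

    While the responding node is in FCV 4.2, 'term' is dropped so that a
    binVersion=4.2 receiver with strict parsing can still install the config.
    """
    if fcv == "4.2":
        return {key: value for key, value in config.items() if key != "term"}
    return dict(config)
```

The heartbeat-version alternative would have the same shape, with the condition keyed on the requester's heartbeat version instead of the responder's FCV.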
