[SERVER-36562] Investigate log messages when running election handoff in multiversion cluster Created: 09/Aug/18  Updated: 21/Sep/18  Resolved: 21/Sep/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Gregory McKeon (Inactive) Assignee: Vesselina Ratcheva (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Sprint: Repl 2018-08-27, Repl 2018-09-10, Repl 2018-09-24, Repl 2018-10-08
Participants:

 Description   

Look into and advise support about expected log messages when running election handoff in a multi-version cluster.



 Comments   
Comment by Vesselina Ratcheva (Inactive) [ 21/Sep/18 ]

Election handoff was built on top of the existing replSetStepUp functionality, so it only needs to be available on the primary for it to work. Skipping the dry run is the only part that also affects the secondary (it lands in later minor versions), but it is only an optional optimization (I investigated it nevertheless). Because of this and the generous backports, you can use this feature with a relatively broad range of minor version pairs. I ended up with 11 such pairs just for the core functionality. Thankfully, I didn't observe anything surprising in the logs. In terms of log messages, an election handoff thus always takes one of the following two forms:

  • on the primary:

    [...] Attempting to step down in response to replSetStepDown command
    [...] transition to SECONDARY from PRIMARY
    [...] Handing off election to <secondary host:port>
    

  • on the secondary (without skipping the dry run):

    [...] Received replSetStepUp request
    [...] Starting an election due to step up request
    [...] conducting a dry run election to see if we could be elected. current term: <N>
    [...] dry election run succeeded, running for election in term <N+1>
    [...] election succeeded, assuming primary role in term <N+1>
    

    etc

OR

  • on the primary (same messages as above):

    [...] Attempting to step down in response to replSetStepDown command
    [...] transition to SECONDARY from PRIMARY
    [...] Handing off election to <secondary host:port>
    

  • on the secondary (with skipping the dry run):

    [...] Received replSetStepUp request
    [...] Starting an election due to step up request
    [...] skipping dry run and running for election in term <N+1>
    [...] election succeeded, assuming primary role in term <N+1>
    

    etc
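For anyone who wants to check a log excerpt against the two patterns above, here is a minimal Python sketch. It only looks for the message fragments quoted in this comment, in order. The function names and return labels are made up for illustration and are not part of any MongoDB tooling. The log prefix (shown as [...] above) is ignored.

```python
# Message fragments copied from the log excerpts above; the "[...]" prefix is ignored.
PRIMARY_HANDOFF = [
    "Attempting to step down in response to replSetStepDown command",
    "transition to SECONDARY from PRIMARY",
    "Handing off election to",
]
SECONDARY_WITH_DRY_RUN = [
    "Received replSetStepUp request",
    "Starting an election due to step up request",
    "conducting a dry run election to see if we could be elected",
    "dry election run succeeded, running for election in term",
    "election succeeded, assuming primary role in term",
]
SECONDARY_SKIPPING_DRY_RUN = [
    "Received replSetStepUp request",
    "Starting an election due to step up request",
    "skipping dry run and running for election in term",
    "election succeeded, assuming primary role in term",
]


def contains_in_order(lines, fragments):
    """Return True if every fragment appears in some line, in the given order."""
    remaining = iter(lines)
    # The shared iterator is consumed as we go, which enforces the ordering.
    return all(any(fragment in line for line in remaining) for fragment in fragments)


def primary_handed_off(lines):
    """Hypothetical helper: did the primary log the handoff sequence?"""
    return contains_in_order(lines, PRIMARY_HANDOFF)


def classify_secondary(lines):
    """Hypothetical helper: return 'skipped_dry_run', 'dry_run', or None."""
    if contains_in_order(lines, SECONDARY_SKIPPING_DRY_RUN):
        return "skipped_dry_run"
    if contains_in_order(lines, SECONDARY_WITH_DRY_RUN):
        return "dry_run"
    return None
```

Unrelated log lines between the expected messages are tolerated, since real logs will interleave other output.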

The core functionality is available as of 3.6.7, 4.0.2 and 4.1.2, and the additional optimization to skip the dry run is part of 3.6.9, 4.0.3 and 4.1.4.
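The version cutoffs above can be turned into a quick lookup for a given primary/secondary pair. The Python sketch below is illustrative only and rests on two assumptions that go beyond what I stated: that release series newer than 4.1 include both features, and that the dry run is skipped only when both the primary and the secondary are on a skip-capable version. All function names are hypothetical.

```python
# Minimum patch release per release series, taken from the sentence above.
HANDOFF_MIN = {(3, 6): 7, (4, 0): 2, (4, 1): 2}
SKIP_DRY_RUN_MIN = {(3, 6): 9, (4, 0): 3, (4, 1): 4}


def parse_version(text):
    """Parse 'major.minor.patch' (any '-suffix' is ignored) into three ints."""
    major, minor, patch = (int(part) for part in text.split("-")[0].split(".")[:3])
    return major, minor, patch


def _supported(version, minimums):
    major, minor, patch = parse_version(version)
    series = (major, minor)
    if series in minimums:
        return patch >= minimums[series]
    # ASSUMPTION: series newer than the last one listed include the feature;
    # any other unlisted series is treated as lacking it.
    return series > max(minimums)


def primary_can_hand_off(version):
    return _supported(version, HANDOFF_MIN)


def supports_skip_dry_run(version):
    return _supported(version, SKIP_DRY_RUN_MIN)


def expected_secondary_pattern(primary_version, secondary_version):
    """Return 'skipped_dry_run', 'dry_run', or None if no handoff is expected."""
    if not primary_can_hand_off(primary_version):
        return None
    # ASSUMPTION: skipping the dry run needs both nodes on a skip-capable version.
    if supports_skip_dry_run(primary_version) and supports_skip_dry_run(secondary_version):
        return "skipped_dry_run"
    return "dry_run"
```

For example, under these assumptions a 4.0.2 primary paired with a 4.0.3 secondary would be expected to show the dry-run variant of the secondary messages.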

Generated at Thu Feb 08 04:43:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.