  1. Core Server
  2. SERVER-70510

Avoid considering recovering nodes as electable

Details

    • Bug
    • Status: Open
    • Major - P3
    • Resolution: Unresolved
    • None
    • None
    • Replication
    • Service Arch
    • ALL
    • Steps to reproduce:

      This is reproducible by running the “too_stale_secondary.js” test after applying the following patch:

      diff --git a/jstests/replsets/too_stale_secondary.js b/jstests/replsets/too_stale_secondary.js
      index 1ee6400ebc4..d8bc2c0f11e 100644
      --- a/jstests/replsets/too_stale_secondary.js
      +++ b/jstests/replsets/too_stale_secondary.js
      @@ -93,8 +93,8 @@ replTest.initiate({
           _id: testName,
           members: [
               {_id: 0, host: nodes[0].host},
      -        {_id: 1, host: nodes[1].host, priority: 0},
      -        {_id: 2, host: nodes[2].host, priority: 0}
      +        {_id: 1, host: nodes[1].host, priority: 1},
      +        {_id: 2, host: nodes[2].host, priority: 1}
           ]
       });
       
      @@ -139,6 +139,17 @@ assert.soon(() => myState(replTest.nodes[2]) === ReplSetTest.State.RECOVERING,
       // This waits for the state as indicated by the primary node.
       replTest.waitForState(replTest.nodes[2], ReplSetTest.State.RECOVERING);
       
      +jsTestLog("Begin test");
      +assert.commandWorked(
      +    replTest.getPrimary().adminCommand({setParameter: 1, mirrorReads: {samplingRate: 1.0}}));
      +
      +for (var i = 0; i < 100; ++i) {
      +    primaryTestDB.runCommand({find: collName, filter: {}});
      +}
      +jsTestLog("Mid test");
      +replTest.nodes[2].getDB('test').runCommand({find: 'test', filter: {}});
      +jsTestLog("End test");
      +
       jsTestLog("7: Stop and restart Node 2.");
       replTest.stop(2);
       replTest.restart(2, {
      

    • Service Arch Prioritized List

    Description

      Currently, recovering nodes with a non-zero priority are considered electable (see here and here). This implies that nodes that are neither secondary nor primary will show up in the hosts section of hello responses.
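
      To make the symptom concrete, the following is a minimal sketch of how one might cross-check a hello reply against a replSetGetStatus reply and list the advertised hosts that are neither primary nor secondary. The helper name `hostsNotPrimaryOrSecondary` and the sample host names are hypothetical; only the `hosts` field of hello and the `members[].name` / `members[].stateStr` fields of replSetGetStatus are taken from the real reply shapes.

```javascript
// Hypothetical helper, not part of the server or the shell.
// helloReply:  an object shaped like the reply to {hello: 1}
// statusReply: an object shaped like the reply to {replSetGetStatus: 1}
function hostsNotPrimaryOrSecondary(helloReply, statusReply) {
    // Index member states by their "host:port" name.
    const stateByHost = new Map(statusReply.members.map(m => [m.name, m.stateStr]));
    // Keep every advertised host whose state is not PRIMARY or SECONDARY.
    return (helloReply.hosts || []).filter(host => {
        const state = stateByHost.get(host);
        return state !== "PRIMARY" && state !== "SECONDARY";
    });
}

// Made-up sample replies modeling the patched test: node 2 is RECOVERING
// but still has priority 1, so it is listed under "hosts".
const sampleHello = {hosts: ["n0:27017", "n1:27017", "n2:27017"]};
const sampleStatus = {
    members: [
        {name: "n0:27017", stateStr: "PRIMARY"},
        {name: "n1:27017", stateStr: "SECONDARY"},
        {name: "n2:27017", stateStr: "RECOVERING"},
    ],
};
const flaggedHosts = hostsNotPrimaryOrSecondary(sampleHello, sampleStatus);
```

      In a live shell the two arguments could come from `db.adminCommand({hello: 1})` and `db.adminCommand({replSetGetStatus: 1})`; the sample objects above only stand in for those replies.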

      Mirrored reads choose their mirroring targets from the members in the hosts section (here), so the primary may mirror reads to non-secondary nodes (see SERVER-60553). As part of this ticket, we should decide which of the following holds:

      • Recovering nodes should not be considered electable and this is an issue with the implementation of hello command that we need to fix.
      • This is an issue with mirrored reads and we need to change the underlying mechanism that selects mirroring targets (e.g., using RSM).
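
      As a rough illustration of what the first option would change, the sketch below models electability as a predicate over a simplified member record. This is an assumption-laden model rather than the server's implementation: the record shape `{host, priority, state}`, both predicate names, and `advertisedHosts` are hypothetical, and arbiters and hidden members are ignored. The numeric state codes (1 = PRIMARY, 2 = SECONDARY, 3 = RECOVERING) are the standard replica set member states.

```javascript
// Standard replica set member state codes.
const MemberState = {PRIMARY: 1, SECONDARY: 2, RECOVERING: 3};

// Simplified model of the behavior described in this ticket:
// a non-zero priority alone makes a member count as electable.
function isElectableByPriorityOnly(member) {
    return member.priority > 0;
}

// One possible reading of the first option (an assumption, not a decided fix):
// additionally require the member to be PRIMARY or SECONDARY.
function isElectableWithStateCheck(member) {
    const stateOk = member.state === MemberState.PRIMARY ||
        member.state === MemberState.SECONDARY;
    return member.priority > 0 && stateOk;
}

// Hypothetical stand-in for building the "hosts" section of a hello reply.
function advertisedHosts(members, isElectable) {
    return members.filter(isElectable).map(m => m.host);
}

// Made-up members matching the patched test: all priority 1, node 2 RECOVERING.
const members = [
    {host: "n0:27017", priority: 1, state: MemberState.PRIMARY},
    {host: "n1:27017", priority: 1, state: MemberState.SECONDARY},
    {host: "n2:27017", priority: 1, state: MemberState.RECOVERING},
];
const hostsToday = advertisedHosts(members, isElectableByPriorityOnly);
const hostsWithCheck = advertisedHosts(members, isElectableWithStateCheck);
```

      Under this model the recovering node appears in `hostsToday` but not in `hostsWithCheck`. The second option would instead leave the hosts section unchanged and apply an equivalent state filter where mirroring targets are selected.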

      This ticket starts with the replication team so that they can evaluate the first option. If this is not an issue with the implementation of hello, feel free to reassign to ServiceArch.

            People

              backlog-server-servicearch Backlog - Service Architecture
              amirsaman.memaripour@mongodb.com Amirsaman Memaripour
              Votes: 0
              Watchers: 8
