  1. Core Server
  2. SERVER-49187

Make ReplSetTest.stepUp() robust to election failures.

    • Fully Compatible
    • ALL
    • v4.4, v4.2, v4.0
    • Repl 2020-07-13, Repl 2020-07-27, Repl 2020-08-10
    • 100

      ReplSetTest.stepUp() retries if the replSetStepUp command fails. Each retry calls awaitReplication() again, which requires the replica set to have a primary. We cannot guarantee that a primary is present on every retry when a test runs with a high election timeout (24 hours).

      Consider a scenario where the primary stepped down during the first failed replSetStepUp attempt. Because the test runs with a high election timeout, the replica set will not have a primary from then on. As a result, the awaitReplication() call in the retry gets stuck waiting for a primary.

      Previously, when reconstruct_prepared_transactions_initial_sync.js had a similar issue, I fixed it by making the jstest use stepUpNoAwaitReplication instead of ReplSetTest.stepUp() (see SERVER-48778). Now retryable_commit_transaction_after_failover.js has also failed because of the issue described above. Revisiting the SERVER-48778 fix made me realize that these steps do not need to be retried when the replSetStepUp command fails. Since the replSetStepUp command is already wrapped in assert.soon(), we don't need to call those steps on every retry.
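The difference between the two retry shapes can be illustrated with a small standalone sketch. Everything in it is a hypothetical stand-in written for illustration: makeFakeReplSet, soon, and the two stepUp* functions are not the real ReplSetTest helpers or the actual patch, and the fake awaitReplication() throws where the real one would hang, so that the sketch terminates. It assumes, as in the scenario above, that a failed replSetStepUp attempt leaves the set without a primary.

```javascript
// Hypothetical stand-in for a replica set under test; NOT the real ReplSetTest.
// `stepUpFailures` is the number of replSetStepUp attempts that fail first.
function makeFakeReplSet(stepUpFailures) {
    const calls = {awaitReplication: 0, replSetStepUp: 0};
    let hasPrimary = true;
    let remainingFailures = stepUpFailures;
    return {
        calls,
        // Requires a primary. Throws instead of blocking so the sketch ends.
        awaitReplication() {
            calls.awaitReplication++;
            if (!hasPrimary) {
                throw new Error("awaitReplication: no primary to wait on");
            }
        },
        // Assumption: a failed attempt also steps down the old primary, and
        // the high election timeout means no other node gets elected.
        replSetStepUp() {
            calls.replSetStepUp++;
            if (remainingFailures > 0) {
                remainingFailures--;
                hasPrimary = false;
                return {ok: 0};
            }
            hasPrimary = true;
            return {ok: 1};
        },
    };
}

// Bounded stand-in for assert.soon(): retry until fn() returns true.
function soon(fn, maxAttempts = 10) {
    for (let i = 0; i < maxAttempts; i++) {
        if (fn()) {
            return;
        }
    }
    throw new Error("soon: condition never became true");
}

// Problematic shape: the preparatory step is repeated on every retry.
function stepUpRetryingEverything(rst) {
    soon(() => {
        rst.awaitReplication();
        return rst.replSetStepUp().ok === 1;
    });
}

// Proposed shape: prepare once, then retry only the command itself.
function stepUpRetryingOnlyCommand(rst) {
    rst.awaitReplication();
    soon(() => rst.replSetStepUp().ok === 1);
}
```

With one simulated failure, stepUpRetryingEverything() fails on its second awaitReplication() call because no primary exists, whereas stepUpRetryingOnlyCommand() calls awaitReplication() once and succeeds on the second replSetStepUp attempt.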

            suganthi.mani@mongodb.com Suganthi Mani