SERVER-39623

Race in rollback_via_refetch_commit_transaction.js

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 4.1.9
    • Affects Version/s: None
    • Component/s: Replication
    • Labels:
      None
    • Fully Compatible
    • ALL
    • Repl 2019-02-25
    • 10

      The test creates a ReplSetTest with three nodes: Rollback, SyncSource, and TieBreaker. It arranges for a transaction to be prepared and committed on Rollback but not replicated. Then Rollback is stepped down, SyncSource is elected, and Rollback starts syncing from SyncSource. When Rollback tries to roll back its committed transaction, it crashes, as expected.

      The test informs the ReplSetTest that Rollback is down, then calls ReplSetTest.stop() to stop the other two nodes. Sometimes, by this point, TieBreaker has also crashed, so ReplSetTest.stop() fails, because it expects to reach all up nodes.
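
      The failure mode described above can be sketched with a toy model. This is not the real ReplSetTest: the ToyReplSet class, its methods, and runScenario are hypothetical names invented for illustration. The only assumption taken from this ticket is that stopping the set fails when a node that was not reported as down turns out to be unreachable.

```javascript
// Toy model only. ToyReplSet is hypothetical and is NOT the real ReplSetTest.
class ToyReplSet {
  constructor(names) {
    // alive: whether the process is actually running.
    // expectedDown: whether the test has told the harness the node is down.
    this.nodes = new Map(
      names.map((name) => [name, { alive: true, expectedDown: false }])
    );
  }

  crash(name) {
    this.nodes.get(name).alive = false;
  }

  markDown(name) {
    this.nodes.get(name).expectedDown = true;
  }

  // Stops every node the harness still believes is up. Throws if one of
  // those nodes has already died without the harness being told.
  stopSet() {
    for (const [name, node] of this.nodes) {
      if (node.expectedDown) continue;
      if (!node.alive) {
        throw new Error(`cannot reach ${name}, which was expected to be up`);
      }
      node.alive = false;
    }
  }
}

function runScenario(tieBreakerReplicatesCommit) {
  const set = new ToyReplSet(["Rollback", "SyncSource", "TieBreaker"]);

  // Rollback crashes as expected, and the test reports it as down.
  set.crash("Rollback");
  set.markDown("Rollback");

  // The race: TieBreaker may have replicated the commit and crashed too,
  // but nothing tells the harness about it.
  if (tieBreakerReplicatesCommit) {
    set.crash("TieBreaker");
  }

  try {
    set.stopSet();
    return "stopped cleanly";
  } catch (e) {
    return e.message;
  }
}

console.log(runScenario(false));
console.log(runScenario(true));
```

      In this sketch the first scenario stops cleanly, while the second fails at TieBreaker, mirroring the intermittent test failure.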

      Why might TieBreaker crash? It has the opportunity to replicate the commit here, while Rollback is a secondary, TieBreaker can talk to it, and Rollback hasn't crashed yet:

      rollbackNode.reconnect(tiebreakerNode);

      Disabling chaining alone won't fix this: even after Rollback has stepped down to secondary, TieBreaker could still choose it as a sync source, because TieBreaker hasn't yet noticed that Rollback is a secondary, and members with chaining disabled don't reconfirm that a sync source candidate is still primary before selecting it.
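
      The stale-view problem can be illustrated with another toy model. The chooseSyncSource function and the view objects below are hypothetical and greatly simplified; the server's actual sync source selection weighs more criteria. The sketch only assumes what this ticket states: with chaining disabled, a member picks a node it believes is primary, based on its own possibly outdated view, without reconfirming.

```javascript
// Toy model only. chooseSyncSource is hypothetical, not server code.
// `view` maps member name -> the state this member last heard about.
function chooseSyncSource(view, chainingAllowed) {
  for (const [name, believedState] of Object.entries(view)) {
    // With chaining disabled, only a node believed to be primary qualifies.
    // The belief is never rechecked against the node's actual state.
    if (chainingAllowed || believedState === "PRIMARY") {
      return name;
    }
  }
  return null;
}

// TieBreaker's view before it learns about the step-down and election.
const staleView = { Rollback: "PRIMARY", SyncSource: "SECONDARY" };

// TieBreaker's view once it has caught up with the real states.
const freshView = { Rollback: "SECONDARY", SyncSource: "PRIMARY" };

console.log(chooseSyncSource(staleView, false)); // prints "Rollback"
console.log(chooseSyncSource(freshView, false)); // prints "SyncSource"
```

      With the stale view, the toy TieBreaker still selects Rollback even though chaining is disabled, which is the window in which it could replicate the commit.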

      Consider rewriting as a RollbackTest instead of a ReplSetTest; this may provide better control of when the TieBreaker node is allowed to replicate and from whom.

            Assignee:
            jesse@mongodb.com A. Jesse Jiryu Davis
            Reporter:
            jesse@mongodb.com A. Jesse Jiryu Davis
            Votes:
            0
            Watchers:
            3