  1. Core Server
  2. SERVER-82883

Recovering TransactionCoordinator on stepup may block acquiring read/write tickets while participants are in the prepared state

    • Cluster Scalability
    • Fully Compatible
    • ALL
    • v7.0, v6.0, v5.0, v4.4
    • Cluster Scalability 2023-11-27, Cluster Scalability 2023-12-11, Cluster Scalability 2023-12-25
    • 155

      Consider a TransactionCoordinator that has sent the prepare command to the participants and then crashes. The new primary, on stepup, will resume the coordination. There are several points at which this can stall behind a read/write ticket acquisition. This is undesirable, both for performance and because it can cause deadlocks.

      Ticket acquisitions occur:
      (1) When TransactionCoordinatorService::onStepUp calls replClientInfo.setLastOpToSystemLastOpTime, which takes the GlobalLock in MODE_IX.
      (2) When TransactionCoordinatorService::onStepUp reads config.transaction_coordinators.
      (3) When waiting for durable VectorClock. This sometimes results in a write (the first time after stepup, or upon topology changes).
      (4) When (re-)persisting the participant list. Note that even though the list had already been persisted, if the coordinator had not yet persisted the decision, recovery will persist the participant list again. As a separate improvement, we should also consider not doing this write again.

      SERVER-60682 made persisting the decision skip ticket acquisition, but did not address these other situations that occur on recovery.

      In addition to not skipping ticket acquisition, (1) and (3) do not skip FlowControl either.
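
      To illustrate why stalling behind ticket acquisition can turn into a deadlock, here is a toy model in Python. It is not server code. TicketPool, recover_coordinator, simulate and the exempt flag are hypothetical names, and the scenario assumes that every ticket is held by an operation that cannot finish until the coordinator delivers its decision.

```python
import threading


class TicketPool:
    """Toy stand-in for a read/write ticket pool (hypothetical, not the server's)."""

    def __init__(self, size):
        self._sem = threading.Semaphore(size)

    def acquire(self, exempt=False, timeout=0.1):
        # An exempt caller bypasses admission control entirely.
        if exempt:
            return True
        # A non-exempt caller waits for a free ticket, giving up after `timeout`.
        return self._sem.acquire(timeout=timeout)

    def release(self, exempt=False):
        if not exempt:
            self._sem.release()


def recover_coordinator(pool, exempt):
    """Return True if recovery got past admission control, False if it stalled."""
    if not pool.acquire(exempt=exempt):
        return False  # stalled behind ticket acquisition
    try:
        # Real recovery would read the coordinator documents and resume the
        # commit here; the model only records that this point was reached.
        return True
    finally:
        pool.release(exempt=exempt)


def simulate(exempt, tickets=2):
    pool = TicketPool(tickets)
    # Assumption: every ticket is held by an operation blocked on a prepared
    # transaction, so none is released until the coordinator makes progress.
    for _ in range(tickets):
        assert pool.acquire()
    return recover_coordinator(pool, exempt)


stalled = simulate(exempt=False)   # recovery needs a ticket nobody can free
proceeds = simulate(exempt=True)   # recovery skips the ticket and moves on
```

      In this model the non-exempt recovery never obtains a ticket, because the only operations that could release one are waiting on the recovery itself. The exempt recovery avoids that cycle, which is the property this ticket asks for at points (1) through (4).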

            wenqin.ye@mongodb.com Wenqin Ye
            jordi.serra-torrens@mongodb.com Jordi Serra Torrens