  1. Core Server
  2. SERVER-77864

Make the use of the AlternativeClientRegion for replay protection implicit

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Catalog and Routing
    • 3

      In order to guarantee replay protection, many commands, such as ShardsvrParticipantBlock, run within a
      retryable write. Any local transaction or retryable write spawned by such a command
      (for example, the release of the critical section) using the original operation context
      will deadlock, because the session it needs has already been checked out. We prevent this by using a new operation context with an empty session and later issuing a dummy write to durably persist the session in the oplog.
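
      The conflict described above can be illustrated with a small stand-alone sketch. This is a toy model, not server code: ToySessionCatalog, ToyOpCtx and runNestedWrite are hypothetical names, and the toy catalog reports the conflict by throwing so that it is observable on a single thread, where the ticket describes a deadlock.

```cpp
#include <optional>
#include <set>
#include <stdexcept>
#include <string>

// Toy model, not MongoDB code. All names here are hypothetical.
struct ToySessionCatalog {
    std::set<std::string> checkedOut;

    // Throws if the session is already checked out, so the conflict
    // can be observed without blocking a thread.
    void checkOut(const std::string& sessionId) {
        if (!checkedOut.insert(sessionId).second) {
            throw std::runtime_error("session already checked out: " + sessionId);
        }
    }

    void checkIn(const std::string& sessionId) { checkedOut.erase(sessionId); }
};

struct ToyOpCtx {
    std::optional<std::string> sessionId;  // empty means "no session attached"
};

// A nested write checks out the session of the context it is given.
// Returns true if it ran, false if it hit the check-out conflict.
bool runNestedWrite(ToySessionCatalog& catalog, const ToyOpCtx& opCtx) {
    if (!opCtx.sessionId) {
        return true;  // no session, so nothing to check out
    }
    try {
        catalog.checkOut(*opCtx.sessionId);
    } catch (const std::runtime_error&) {
        return false;  // the outer command already holds this session
    }
    catalog.checkIn(*opCtx.sessionId);
    return true;
}
```

      In this model, a nested write that reuses the outer command's context fails while the outer command holds the session, whereas the same write on a fresh context with an empty session goes through.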

      This is acceptable because the only reason we use a retryable write is to guarantee replay protection, which protects against the situation in which a DDL operation is retried by a node that is no longer the primary (this can happen because of a network partition). Because session information is stored in the oplog, any request arriving with a txnNumber lower than the persisted one will result in an error, preventing the old primary from committing successfully.
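
      The txnNumber check can be sketched as follows. This is a toy model under stated assumptions, not the server's session machinery: SessionTxnTable and tryBegin are hypothetical names, and an in-memory map stands in for the session information persisted in the oplog.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy model, not MongoDB code. Tracks the highest txnNumber recorded
// for each session; the map stands in for durably persisted state.
class SessionTxnTable {
public:
    // Returns true if the request is accepted, false if its txnNumber is
    // lower than the one already recorded for the session (a stale request).
    bool tryBegin(const std::string& sessionId, std::int64_t txnNumber) {
        auto it = _lastTxn.find(sessionId);
        if (it != _lastTxn.end() && txnNumber < it->second) {
            return false;
        }
        _lastTxn[sessionId] = txnNumber;
        return true;
    }

private:
    std::map<std::string, std::int64_t> _lastTxn;
};
```

      Once a request with txnNumber 5 has been recorded for a session, a later request on the same session with txnNumber 4 is rejected, which is the property that stops a stale primary from committing.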

      At the moment there is no better solution, unless we find a way to guarantee replay protection that does not force those commands to run within a retryable write.

      The goal of this ticket is to improve the readability and reusability of the boilerplate code that prevents the deadlock. We could do this by hiding the code in an RAII helper class or by executing it from the entry point (this might require some design).
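
      One possible shape for the RAII option is sketched below. It is only an illustration under stated assumptions: ToyOpCtx, currentOpCtx, ScopedSessionlessContext and runWithoutSession are hypothetical names, a thread-local pointer stands in for the thread's active context, and a vector of strings stands in for the oplog. None of these are the server's real types.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy model, not MongoDB code. All names are hypothetical.
struct ToyOpCtx {
    std::optional<std::string> sessionId;  // empty means "no session attached"
};

// The context the current thread is operating on.
thread_local ToyOpCtx* currentOpCtx = nullptr;

// RAII guard: installs a fresh session-less context for the current thread
// and restores the previous one when the scope ends.
class ScopedSessionlessContext {
public:
    ScopedSessionlessContext() : _saved(currentOpCtx) { currentOpCtx = &_fresh; }
    ~ScopedSessionlessContext() { currentOpCtx = _saved; }

    ScopedSessionlessContext(const ScopedSessionlessContext&) = delete;
    ScopedSessionlessContext& operator=(const ScopedSessionlessContext&) = delete;

    ToyOpCtx& opCtx() { return _fresh; }

private:
    ToyOpCtx* _saved;
    ToyOpCtx _fresh;  // default-constructed, so it carries no session
};

// Runs `body` under a session-less context, then records a no-op write on
// the original context so that its session reaches the (toy) oplog.
// If `body` throws, the context is still restored but no write is recorded.
template <typename Body>
void runWithoutSession(ToyOpCtx& original, std::vector<std::string>& oplog, Body&& body) {
    {
        ScopedSessionlessContext scope;
        std::forward<Body>(body)(scope.opCtx());
    }
    oplog.push_back("noop write for session " + original.sessionId.value_or("<none>"));
}
```

      With a wrapper of this kind, a command body would call runWithoutSession once instead of repeating the context swap and the dummy write by hand. Whether the dummy write belongs in the helper or at the entry point is the design question the ticket leaves open.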

            Assignee: backlog-server-catalog-and-routing [DO NOT USE] Backlog - Catalog and Routing
            Reporter: enrico.golfieri@mongodb.com Enrico Golfieri
            Votes: 0
            Watchers: 2
