  1. Core Server
  2. SERVER-61066

Make shardsvr DDL commands check primary status after marking opCtx as interruptible

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.0.5
    • Affects Version/s: 5.0.0
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • 25

      SERVER-58246 outlines a race condition in which a command that is marked as never allowed on secondaries, and that later sets its opCtx as interruptible on stepdown, may actually run uninterrupted on a node that has since become a secondary. In SERVER-58246 it was decided that it was not feasible to prevent this at the command infrastructure layer.

      This ticket is to prevent this race from happening in legacy (pre-5.0) DDL operations. Since the legacy DDL is not network-partition tolerant, a stepped-down former primary running DDL concurrently with a new primary may cause harm. Interrupting the DDL as soon as a node realizes it is no longer primary mitigates this situation, although it does not prevent it from happening in an actual network-partition scenario.

      On FCV 5.0, since the new DDL coordinators are tolerant of split-brain scenarios, this is not required for correctness.
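
The race and the proposed fix described above can be illustrated with a small, single-threaded toy model. This is only a sketch under stated assumptions: `Node`, `OpCtx`, `run_ddl_command`, the two hook parameters, and the exception names are all hypothetical and are not the server's actual C++ API. The model assumes that a stepdown only kills operations that have already registered as interruptible, which is the window the ticket is concerned with.

```python
class NotPrimaryError(Exception):
    """Hypothetical error: the node is not primary."""


class InterruptedDueToStepDown(Exception):
    """Hypothetical error: the operation was killed by a stepdown."""


class Node:
    """Toy replica set member (not the real server classes)."""

    def __init__(self):
        self.is_primary = True
        self.interruptible_ops = []

    def step_down(self):
        # A stepdown only kills operations that have ALREADY
        # registered themselves as interruptible.
        self.is_primary = False
        for op in self.interruptible_ops:
            op.killed = True


class OpCtx:
    """Toy operation context."""

    def __init__(self, node):
        self.node = node
        self.killed = False

    def mark_interruptible_on_stepdown(self):
        self.node.interruptible_ops.append(self)

    def check_for_interrupt(self):
        if self.killed:
            raise InterruptedDueToStepDown()


def run_ddl_command(node, *, recheck_primary,
                    before_mark=lambda: None,
                    during_work=lambda: None):
    # 1. Admission check done by the command infrastructure.
    if not node.is_primary:
        raise NotPrimaryError()
    op = OpCtx(node)

    # 2. Race window: a stepdown here does not kill this operation,
    #    because it has not registered as interruptible yet.
    before_mark()

    # 3. The command marks its opCtx as interruptible on stepdown.
    op.mark_interruptible_on_stepdown()

    # 4. The fix from this ticket: re-check primary status AFTER marking.
    if recheck_primary and not node.is_primary:
        raise NotPrimaryError()

    # 5. The DDL body, with an interruption point.
    during_work()
    op.check_for_interrupt()
    return "ddl ran"
```

In this model, `run_ddl_command(node, recheck_primary=False, before_mark=node.step_down)` returns normally even though the node is no longer primary, while the same call with `recheck_primary=True` raises `NotPrimaryError`. A stepdown injected through `during_work` is caught by the ordinary interruption check in both cases, since the operation is already registered by then.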

            Assignee: Jordi Serra Torrens (jordi.serra-torrens@mongodb.com)
            Reporter: Jordi Serra Torrens (jordi.serra-torrens@mongodb.com)