Add OSI causality barrier on resharding coordinator step-up

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • Fully Compatible
    • ClusterScalability 30Mar-13Apr, ClusterScalability 13Apr-27Apr

      The resharding coordinator uses OSI replay protection for initialize commands but does not perform a no-op write on step-up, so participants may still accept in-flight initialize commands sent by an old primary.

      When a coordinator steps up and recovers a state beyond kInitializing, it should bump and persist the OSI, then send a no-op retryable write to all participants (also known as a causality barrier) and wait for majority acknowledgment. This ensures that participants reject any stale initialize or abort commands from a rogue primary.
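
      The step-up behavior described above can be sketched as a small simulation. This is an illustrative model only, not the server implementation: the names `Osi`, `Participant`, `CoordinatorState`, and `step_up` are hypothetical, the OSI is assumed to reduce to a session id plus a monotonically increasing transaction number, and a participant's acknowledgment stands in for a majority-committed write.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class CoordinatorState(IntEnum):
    # Hypothetical subset of coordinator states, ordered by progress.
    INITIALIZING = 1
    PREPARING_TO_DONATE = 2
    CLONING = 3


@dataclass
class Osi:
    # Assumed simplification: a session id plus a transaction number.
    session_id: str
    txn_number: int


class StaleCommandError(Exception):
    """Raised when a command carries an older txn_number than already seen."""


@dataclass
class Participant:
    name: str
    # Highest txn_number seen so far, keyed by session id.
    last_txn: dict = field(default_factory=dict)
    initialized: bool = False

    def _check(self, osi: Osi) -> None:
        # Replay protection: reject anything older than what was already seen.
        if osi.txn_number < self.last_txn.get(osi.session_id, -1):
            raise StaleCommandError(f"{self.name}: stale txn {osi.txn_number}")
        self.last_txn[osi.session_id] = osi.txn_number

    def noop_write(self, osi: Osi) -> bool:
        # The causality barrier: changes no data, only advances the txn number.
        self._check(osi)
        return True  # stands in for a majority-committed acknowledgment

    def initialize(self, osi: Osi) -> None:
        self._check(osi)
        self.initialized = True


def step_up(state: CoordinatorState, persisted: Osi, participants: list) -> Osi:
    """Return the OSI the new primary should use after stepping up."""
    if state <= CoordinatorState.INITIALIZING:
        # Initialize may legitimately still be in flight; no barrier needed.
        return persisted
    # Bump the OSI; a real coordinator would persist it before sending.
    bumped = Osi(persisted.session_id, persisted.txn_number + 1)
    # Send the no-op write to every participant and require all acks.
    acks = [p.noop_write(bumped) for p in participants]
    if not all(acks):
        raise RuntimeError("causality barrier not acknowledged by all participants")
    return bumped
```

      In this model, once `step_up` returns for a state beyond `INITIALIZING`, any participant that receives an initialize command carrying the previous transaction number raises `StaleCommandError` instead of applying it.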

            Assignee:
            Kruti Shah
            Reporter:
            Kruti Shah
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: