Resumable range deleter must use timestamps


    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: 4.9.0, 4.4.3
    • Component/s: Sharding

      By default, each range deletion task is scheduled on the executor at time (timestampMoveChunkFinished + 15 mins). If a stepdown happens, all pending range deletions are re-submitted at step-up at time (timeOfResubmit + 15 mins), because the only persisted information is whether a range deletion is delayed, not when it was originally due to run.
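The difference between the two scheduling rules can be sketched as follows. This is a minimal illustration in Python, not the server's implementation: `RangeDeletionTask`, its fields, and both helper functions are hypothetical names, and `run_at` stands in for a persisted timestamp that, per the description above, does not exist today.

```python
from dataclasses import dataclass
from typing import Optional

DELAY_SECS = 15 * 60  # default 15 min delay from the description


@dataclass
class RangeDeletionTask:
    """Hypothetical persisted task document (field names are assumptions)."""
    range_id: str
    pending_delay: bool              # "is delayed" flag: the only scheduling info persisted today
    run_at: Optional[float] = None   # hypothetical: absolute time the deletion was originally due


def resubmit_time_current(task: RangeDeletionTask, now: float) -> float:
    """Behavior described in the ticket: the delay restarts from the resubmit time."""
    return now + DELAY_SECS if task.pending_delay else now


def resubmit_time_with_timestamp(task: RangeDeletionTask, now: float) -> float:
    """Sketch of the proposed behavior: honor the originally planned time."""
    if task.run_at is None:
        # No timestamp persisted, so fall back to the current behavior.
        return resubmit_time_current(task, now)
    # Never schedule in the past; run immediately if the deadline already passed.
    return max(now, task.run_at)
```

For a migration that finished at t=0 s and a step-up at t=600 s, the first rule reschedules the deletion for t=1500 s, while the second keeps the original t=900 s.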

      This behavior makes it very easy to end up in problematic situations such as:

      • If a sufficient number of range deletions are rescheduled, the collection can come under pressure (because it is continuously IX-locked) and the secondaries can lag heavily, because there is no synchronous wait for majority replication of the deletions and there are no timeouts between batches from different ranges.
      • If stepdowns happen periodically (e.g. one stepdown every 10 mins), range deletions will never be performed, because each step-up pushes them another 15 mins into the future.
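The starvation case in the second bullet can be reproduced with a small simulation. This is a sketch under simplifying assumptions: the migration finishes at t=0, every stepdown is followed by an immediate step-up, and `first_execution_time` and its parameters are hypothetical names used only for illustration.

```python
from typing import Optional

DELAY_SECS = 15 * 60           # default delay before a range deletion runs
STEPDOWN_EVERY_SECS = 10 * 60  # periodic stepdown interval from the example


def first_execution_time(use_persisted_timestamp: bool,
                         horizon_secs: int = 6 * 60 * 60) -> Optional[int]:
    """Return when the deletion first runs, or None if it never runs within the horizon.

    Assumes the chunk migration finishes at t=0 and that step-up
    immediately follows each stepdown (a simplification).
    """
    run_at = DELAY_SECS   # originally planned execution time
    scheduled = run_at    # time currently registered on the executor
    next_stepdown = STEPDOWN_EVERY_SECS
    while next_stepdown <= horizon_secs:
        if scheduled < next_stepdown:
            return scheduled  # the task fires before the next stepdown
        now = next_stepdown   # stepdown, then step-up: pending tasks are resubmitted
        if use_persisted_timestamp:
            scheduled = max(now, run_at)   # keep the original deadline
        else:
            scheduled = now + DELAY_SECS   # delay restarts from the resubmit time
        next_stepdown += STEPDOWN_EVERY_SECS
    return None
```

With the delay restarting at each step-up, the call returns `None` for any horizon, since the 15 min delay always outlasts the 10 min stepdown interval. With a persisted timestamp, the deletion runs at its original deadline of 900 s.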

            Assignee:
            [DO NOT USE] Backlog - Sharding Team
            Reporter:
            Pierlauro Sciarelli