Fix deadlock with range deletion task registration


    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • 8.3.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • Fully Compatible
    • ALL
    • ClusterScalability 2Feb-16Feb
    • 200

      The deadlock arises because the registrationTime stored on a task's entry in _rangeDeletionTasks can differ from the registrationTime captured by the lambda for that same task. This happens when the task has no explicit time set: both sites fall back to the current time when none is provided, so the two values can diverge. The mismatch breaks the symmetry of this comparison, making it possible for two tasks to wait on each other.

      To make this more concrete, consider the following example:

      1. Register task A [10, 30] without an explicit timestamp.
      2. Register task B [5, 15], also without an explicit timestamp.
      3. When we decide ordering/dependencies, we compare the captured registrationTime against the queued one:
        • Task B waits on task A because the registration time captured by its lambda (102) is greater than the registrationTime on the entry for task A (101).
        • Task A waits on task B because the registration time captured by its lambda (101) equals the registrationTime on the entry for task B (101), and its taskID can be smaller than task B's.
        • (see comparison logic here)
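
The cycle in the example above can be reproduced with a small model. This is only an illustrative sketch, not the server code: the `Task` class, `waits_on`, and `has_mutual_wait` are hypothetical names, the task IDs 1 and 2 are made up, and the tie-break direction on equal times is an assumption chosen to match the example. The last case assumes a task whose captured and queued times come from one timestamp, to show that the comparison is then no longer mutual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical model of a range deletion task (not the server's type)."""
    task_id: int
    captured_time: int  # registrationTime captured by the task's lambda
    queued_time: int    # registrationTime stored on the queue entry


def waits_on(waiter: Task, other: Task) -> bool:
    """Assumed ordering rule: compare the waiter's captured time with the
    other task's queued time, breaking ties on task ID."""
    if waiter.captured_time != other.queued_time:
        return waiter.captured_time > other.queued_time
    # Tie-break direction is an assumption that matches the example.
    return waiter.task_id < other.task_id


def has_mutual_wait(a: Task, b: Task) -> bool:
    """True when each task waits on the other, i.e. a deadlock."""
    return waits_on(a, b) and waits_on(b, a)


# Times from the example: B's lambda captured 102, but B's entry holds 101.
task_a = Task(task_id=1, captured_time=101, queued_time=101)
task_b = Task(task_id=2, captured_time=102, queued_time=101)

# Same task B, but with both values taken from a single timestamp.
task_b_consistent = Task(task_id=2, captured_time=102, queued_time=102)

deadlocked = has_mutual_wait(task_a, task_b)
deadlocked_when_consistent = has_mutual_wait(task_a, task_b_consistent)
```

With the mismatched times, `deadlocked` is true because each task sees the other as something it must wait for. With a single timestamp per task, only B waits on A, so `deadlocked_when_consistent` is false.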

            Assignee:
            Wenqin Ye
            Reporter:
            Wenqin Ye