Core Server / SERVER-21315

Make sure to clean up state transition on failed dist lock acquisitions on legacy config servers

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.2.0-rc3
    • Affects Version/s: 3.2.0-rc2
    • Component/s: Sharding
    • Fully Compatible
    • ALL
    • Sharding C (11/20/15)

      This is to avoid the race described below:

      1. Alice tries to get the lock and generates a ts of TSa.
      2. Alice gets descheduled.
      3. Bob tries to get the lock and generates a ts of TSb. Note: TSb > TSa.
      4. Bob successfully grabs the lock and sets the state to 2 ("locked") on all 3 config servers.
      5. Bob is done and tries to unlock the lock.
      6. Bob finishes unlocking the first config server, setting its state to 0 ("unlocked").
      7. Alice comes back, sees that the lock state on the first config server is 0, and attempts to acquire the lock by sending the update that sets the state to 1 ("transition to lock") to all 3 config servers.
      8. Alice finds out that the update did not apply successfully on all config servers (because the update query only matches the first config server, whose state is 0).
      9. Alice goes to the tournament round and compares the ts field of the lock document on all config servers.
      10. Alice sees that TSb > TSa, so she backs out of the tournament round.
      11. Bob proceeds to unlock the lock on the 2 other config servers.
      12. The final state of the lock document ends up as:

      server1: state: 1, ts: TSa
      server2: state: 0
      server3: state: 0

      Once the lock document ends up in this state, the lock cannot be acquired again until Alice's process stops pinging.
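The interleaving above can be replayed with a small in-memory model. This is only a sketch under stated assumptions: the three config servers are plain dicts, and every function name here (`try_transition`, `unlock_one`, `cleanup_failed_transition`, `run_race`) is hypothetical and not taken from the server code. It shows the stuck state from step 12 and how resetting Alice's own "transition to lock" documents after she loses the tournament avoids it, which is the cleanup the ticket title asks for.

```python
# Hypothetical model of the legacy three-config-server dist lock (not server code).
UNLOCKED, TRANSITION, LOCKED = 0, 1, 2


def new_servers():
    """One lock document per config server, all initially unlocked."""
    return [{"state": UNLOCKED, "ts": None} for _ in range(3)]


def try_transition(servers, ts):
    """Send the 0 -> 1 update to every server; return the indexes where it matched."""
    applied = []
    for i, doc in enumerate(servers):
        if doc["state"] == UNLOCKED:  # the update query only matches state 0
            doc["state"], doc["ts"] = TRANSITION, ts
            applied.append(i)
    return applied


def lock(servers, ts):
    """Finish acquisition: set state 2 ("locked") on every server."""
    for doc in servers:
        doc["state"], doc["ts"] = LOCKED, ts


def unlock_one(doc, ts):
    """Unlock a single server's document if it still carries our ts."""
    if doc["ts"] == ts:
        doc["state"] = UNLOCKED


def cleanup_failed_transition(servers, ts):
    """Assumed fix: undo our own state-1 documents after a failed acquisition."""
    for doc in servers:
        if doc["state"] == TRANSITION and doc["ts"] == ts:
            doc["state"] = UNLOCKED


def run_race(cleanup):
    """Replay steps 1-12 and return the final state on each server."""
    servers = new_servers()
    ts_a, ts_b = 1, 2                      # steps 1-3: TSb > TSa
    try_transition(servers, ts_b)          # step 4: Bob acquires on all servers
    lock(servers, ts_b)
    unlock_one(servers[0], ts_b)           # step 6: Bob unlocks the first server
    applied = try_transition(servers, ts_a)  # step 7: Alice only matches server 1
    if len(applied) < len(servers):        # step 8: update did not apply everywhere
        highest = max(doc["ts"] for doc in servers)  # step 9: tournament round
        if highest > ts_a and cleanup:     # step 10: Alice backs out
            cleanup_failed_transition(servers, ts_a)
    for doc in servers[1:]:                # step 11: Bob unlocks the other two
        unlock_one(doc, ts_b)
    return [doc["state"] for doc in servers]


if __name__ == "__main__":
    print(run_race(cleanup=False))  # [1, 0, 0]: server1 stuck in "transition to lock"
    print(run_race(cleanup=True))   # [0, 0, 0]: lock can be acquired again
```

Without the cleanup the model ends in the same state as the listing above (server1 at state 1 with TSa); with it, all three documents return to state 0.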

            Assignee:
            randolph@mongodb.com Randolph Tan
            Reporter:
            randolph@mongodb.com Randolph Tan
            Votes:
            0
            Watchers:
            3
