Potential distributed lock forcing inconsistency


    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • 1.9.0
    • Affects Version/s: None
    • Component/s: Concurrency, Sharding
    • ALL

      When a distributed lock is forced:

      From distlock.cpp:

      log() << "dist_lock forcefully taking over from: " << o << " elapsed minutes: " << elapsed << endl;
      conn->update( _ns , _id , BSON( "$set" << BSON( "state" << 0 ) ) );

      The update takes into account only the name of the lock, not its current state or the unique ts value associated with every active lock entry. This is a problem if processes interleave as follows:

      Process 0 crashes with lock
      Process 1 detects forcing required
      Process 2 detects forcing required
      Process 1 forces Process 0's lock by name, then creates and acquires a new lock with the same name
      Process 2 forces Process 1's lock by name, which is wrong because Process 1 is still using that lock
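
The interleaving above can be replayed without a database. The following is a minimal sketch that assumes a hypothetical in-memory `LockDoc` struct in place of the real lock entry; the names `forceByName`, `tryAcquire`, `replayUnguardedForce` and the lock name `exampleLock` are illustrative and do not appear in distlock.cpp.

```cpp
#include <cassert>
#include <string>

// Hypothetical in-memory stand-in for one lock entry (not the real schema).
struct LockDoc {
    std::string id;  // lock name
    int state;       // 0 = free, nonzero = held
    long ts;         // unique value per acquisition
};

// Mirrors the update above: matches on the name only, ignoring state and ts.
void forceByName(LockDoc& doc, const std::string& id) {
    if (doc.id == id) doc.state = 0;
}

// Takes the lock if it is free, stamping it with a fresh ts.
bool tryAcquire(LockDoc& doc, long newTs) {
    if (doc.state != 0) return false;
    doc.state = 1;
    doc.ts = newTs;
    return true;
}

// Replays the interleaving; returns true if Process 2 ends up taking
// a lock that Process 1 still believes it holds.
bool replayUnguardedForce() {
    LockDoc doc{"exampleLock", 1, 100};   // Process 0 crashed while holding the lock
    // Processes 1 and 2 have both already decided that forcing is required.
    forceByName(doc, "exampleLock");      // Process 1 forces Process 0's lock
    bool p1Holds = tryAcquire(doc, 101);  // Process 1 acquires (ts = 101)
    assert(p1Holds);
    forceByName(doc, "exampleLock");      // Process 2 forces on its stale decision
    bool p2Holds = tryAcquire(doc, 102);  // Process 2 acquires as well
    return p1Holds && p2Holds;
}
```

Calling `replayUnguardedForce()` returns true in this model: both processes consider themselves the holder, because the second force never checks whose lock it is releasing.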

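      Fix: Match on the ts value and state as well as the name, so that the update only applies to the lock entry that was actually observed: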

      conn->update( _ns , BSON( "_id" << _id["_id"].String() << "state" << o["state"].numberInt() << "ts" << o["ts"] ), BSON( "$set" << BSON( "state" << 0 ) ) );
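
The effect of the conditional update can be illustrated with the same kind of in-memory model. This is a sketch under stated assumptions, not the server code: `LockDoc`, `forceIfUnchanged`, `tryAcquire`, `replayGuardedForce` and the lock name `exampleLock` are hypothetical, and the comparison inside `forceIfUnchanged` stands in for the query document matching on `_id`, `state` and `ts`.

```cpp
#include <string>

// Hypothetical in-memory stand-in for one lock entry (not the real schema).
struct LockDoc {
    std::string id;  // lock name
    int state;       // 0 = free, nonzero = held
    long ts;         // unique value per acquisition
};

// Releases the lock only if the entry still matches the snapshot the
// caller observed when it decided that forcing was required.
bool forceIfUnchanged(LockDoc& doc, const LockDoc& seen) {
    if (doc.id != seen.id || doc.state != seen.state || doc.ts != seen.ts)
        return false;  // entry changed since it was observed: do nothing
    doc.state = 0;
    return true;
}

// Takes the lock if it is free, stamping it with a fresh ts.
bool tryAcquire(LockDoc& doc, long newTs) {
    if (doc.state != 0) return false;
    doc.state = 1;
    doc.ts = newTs;
    return true;
}

// Replays the same interleaving with the guarded force; returns true if
// Process 1 keeps the lock and Process 2's stale force is rejected.
bool replayGuardedForce() {
    LockDoc doc{"exampleLock", 1, 100};  // Process 0 crashed while holding the lock
    LockDoc seenByP1 = doc;              // both processes observe the same stale entry
    LockDoc seenByP2 = doc;
    bool p1Forced = forceIfUnchanged(doc, seenByP1);  // matches ts = 100, succeeds
    bool p1Holds  = tryAcquire(doc, 101);             // Process 1 acquires (ts = 101)
    bool p2Forced = forceIfUnchanged(doc, seenByP2);  // ts is now 101, no match
    bool p2Holds  = tryAcquire(doc, 102);             // lock is held, fails
    return p1Forced && p1Holds && !p2Forced && !p2Holds && doc.ts == 101;
}
```

In this model the second force matches nothing, because the ts changed when Process 1 acquired the lock, so Process 1 remains the only holder.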

      Difficult to reproduce without new test cases that modify the timeout.

            Assignee:
            Greg Studer (Inactive)
            Reporter:
            Greg Studer (Inactive)