[SERVER-15097] distributed lock can get stuck at state 1 after bad connections Created: 29/Aug/14  Updated: 26/Sep/17  Resolved: 01/Jul/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.7.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Randolph Tan
Resolution: Done Votes: 0
Labels: DistLock
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2 04/24/15, Sharding 3 05/15/15, Sharding 4 06/05/15, Sharding 5 06/26/15, Sharding 8 08/28/15, Sharding 9 (09/18/15), Sharding 16 (06/24/16), Sharding 17 (07/15/16)

 Description   

If a thread sets the state to 1* but aborts (with socket exceptions) before it fully claims the lock, it can block all other threads from acquiring the lock for as long as it keeps updating the lock ping.

* For this issue to happen, the other threads that also successfully set the state to 1 must have a lower ts than the thread that aborted.
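
The failure mode can be sketched as a small state machine. Below is a minimal, hypothetical Python model (not the server's actual C++ implementation), assuming acquisition is two-phase (state 0 -> 1 on entry, state 2 once fully claimed) and that waiters may only force-unlock a lock whose ping has gone stale; all names and the timeout value are illustrative:

    import time

    lock_doc = {"state": 0, "ts": None, "ping": 0.0}
    PING_TIMEOUT = 15 * 60  # takeover allowed once the ping is this stale (illustrative)

    def try_acquire(my_ts, crash_before_claim=False):
        # Two-phase acquisition: move state 0 -> 1, then fully claim with state 2.
        if lock_doc["state"] != 0:
            return False
        lock_doc.update(state=1, ts=my_ts)  # phase 1: mark the lock as contended
        if crash_before_claim:
            return False                    # socket exception: phase 2 never runs
        lock_doc["state"] = 2               # phase 2: the lock is fully claimed
        return True

    def ping(now):
        # Background pinger; it keeps running even after an aborted acquisition.
        lock_doc["ping"] = now

    def can_force_unlock(now):
        # Waiters may only break the lock once the holder's ping has gone stale.
        return now - lock_doc["ping"] > PING_TIMEOUT

    # Thread A moves the state to 1, then hits a socket exception before claiming:
    ping(time.time())
    try_acquire("A", crash_before_claim=True)

    # Thread B now sees state == 1 with a foreign ts, so it cannot acquire; and as
    # long as A's pinger keeps the ping fresh, B can never force-unlock either.
    assert try_acquire("B") is False
    assert can_force_unlock(time.time()) is False  # stuck at state 1 indefinitely

Because the aborted thread's pinger keeps the ping fresh, the staleness check never fires and the lock remains at state 1.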



 Comments   
Comment by Randolph Tan [ 01/Jul/16 ]

The issue is exclusive to SCCC, which is gone as of v3.4.

Comment by Randolph Tan [ 17/Sep/15 ]

Maybe, and it would require additional manual work, as the code has undergone some refactoring in the master branch.

Comment by Andy Schwerin [ 17/Sep/15 ]

Would this warrant backport to 3.0?

Comment by Randolph Tan [ 15/Jul/15 ]

Note that this only affects the distributed lock when running against legacy SCCC config servers (three mirrored servers).
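
For context: under SCCC (three mirrored config servers written via SyncClusterConnection), the client applies each write to every config server itself, so a connection failure midway through can leave the servers disagreeing about the lock document. A hypothetical Python sketch of that partial-write behavior (the helper and the failure injection are assumptions for illustration, not the server's actual code):

    servers = [{"state": 0, "ts": None} for _ in range(3)]

    def sccc_update(new_doc, fail_at=None):
        # Apply the update server by server; raise to mimic a dropped connection.
        for i, server in enumerate(servers):
            if i == fail_at:
                raise ConnectionError("socket exception on config server %d" % i)
            server.update(new_doc)

    try:
        sccc_update({"state": 1, "ts": "A"}, fail_at=2)  # fails before the 3rd server
    except ConnectionError:
        pass

    print(servers)
    # [{'state': 1, 'ts': 'A'}, {'state': 1, 'ts': 'A'}, {'state': 0, 'ts': None}]

With replica-set config servers, the document update goes through a single primary, which is consistent with the comment above that the issue is exclusive to SCCC.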
