Priority: Minor - P4
Affects Version/s: None
Fix Version/s: WT2.7.0
In src/os_posix/os_mtx_rw.c, line 84 (in __wt_try_readlock) reads:
new = (pad << 48) + ((users + 1) << 32) + ((users + 1) << 16) + writers;
When users == 64K-1 (0xFFFF), this code is intended to wrap around, with users set to 0. Instead, users + 1 is 0x10000, which no longer fits in 16 bits, so each shifted term carries into the field above it:
new = (pad << 48) + (0x10000 << 32) + (0x10000 << 16) + writers;
l->s.pad = 1; /* wrong, should be 0 */
l->s.users = 1; /* wrong, should be 0 */
l->s.readers = 0;
l->s.writers = writers;
This would be a big problem: in the 'ticket' algorithm, the users field is the next ticket number available. If another participant enters the system at this point and gets number 1, it will never be called, because no participant holds number 0 and so nothing will ever increment writers to 1. The result is a wait forever for a lock that is actually available.
All that said, it so happens that __wt_try_readlock is currently not used in WT. __wt_try_writelock has similar code, but there the problem is benign, since only the pad field is falsely incremented. I've confirmed this by asserting that pad == 0; the assertion fails for a test case that opens a cursor 64K times.