[SERVER-41411] OplogTruncaterThread can self-deadlock Created: 30/May/19 Updated: 31/May/19 Resolved: 31/May/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Tess Avitabile (Inactive) | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Repl 2019-06-17 | ||||||||
| Participants: | |||||||||
| Description |
|
The OplogTruncaterThread acquires a global IX lock and then calls WiredTigerRecordStore::reclaimOplog(), which calls WiredTigerKVEngine::getPinnedOplog(), which calls WiredTigerKVEngine::getOplogNeededForRollback(), which calls TransactionParticipant::getOldestActiveTimestamp(), which acquires a global IS lock in a different locker. This can self-deadlock if there is a pending strong lock acquisition: the IS request queues behind the pending strong request, which is itself waiting for the thread to release its IX lock. |
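The scenario in the description can be sketched with a toy model. This is not MongoDB's LockManager: `ToyGlobalLock`, `truncaterWouldBlock`, and the locker ids are hypothetical names introduced for illustration. The sketch assumes a fair queueing policy in which a new request waits behind any conflicting request that is already pending, and it models only the global lock.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Lock modes of a hierarchical (intent) lock.
enum class Mode { IS, IX, S, X };

// Standard intent-lock compatibility matrix.
inline bool compatible(Mode a, Mode b) {
    static const bool table[4][4] = {
        //        IS     IX     S      X
        /* IS */ {true,  true,  true,  false},
        /* IX */ {true,  true,  false, false},
        /* S  */ {true,  false, true,  false},
        /* X  */ {false, false, false, false},
    };
    return table[static_cast<int>(a)][static_cast<int>(b)];
}

struct Request {
    int lockerId;
    Mode mode;
};

// Toy lock with a fair policy (an assumption of this sketch): a request is
// granted only if it conflicts with no granted and no pending request.
class ToyGlobalLock {
public:
    // Returns true if granted immediately, false if the request was queued.
    bool acquire(int lockerId, Mode mode) {
        for (const Request& g : granted_) {
            if (!compatible(g.mode, mode)) {
                pending_.push_back({lockerId, mode});
                return false;
            }
        }
        for (const Request& p : pending_) {
            if (!compatible(p.mode, mode)) {
                pending_.push_back({lockerId, mode});
                return false;
            }
        }
        granted_.push_back({lockerId, mode});
        return true;
    }

private:
    std::vector<Request> granted_;
    std::deque<Request> pending_;
};

// Replays the sequence from the description. Returns true if the inner IS
// request, made by the same thread under a second locker, would have to wait.
inline bool truncaterWouldBlock(bool strongLockPending) {
    ToyGlobalLock lock;
    const int kTruncaterLocker = 1;  // locker that holds the outer IX lock
    const int kOtherOperation = 2;   // unrelated operation requesting X
    const int kInnerLocker = 3;      // second locker used further down the call chain

    const bool outerGranted = lock.acquire(kTruncaterLocker, Mode::IX);
    assert(outerGranted);
    (void)outerGranted;

    if (strongLockPending) {
        // X conflicts with the granted IX, so this request queues.
        lock.acquire(kOtherOperation, Mode::X);
    }

    // IS is compatible with IX, but not with a pending X.
    return !lock.acquire(kInnerLocker, Mode::IS);
}
```

In this model `truncaterWouldBlock(false)` returns false and `truncaterWouldBlock(true)` returns true. In the second case the thread waits on a request that can only be granted after the pending X request, and that X request can only be granted after the same thread releases its IX lock.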
| Comments |
| Comment by Tess Avitabile (Inactive) [ 31/May/19 ] |
|
I see, the problem is that we wait for the PBWM (ParallelBatchWriterMode) lock with no deadline. Note that sharing the locker with the OperationContext in the OplogTruncaterThread would be insufficient to solve the problem, because the OplogTruncaterThread does not take the PBWM. |