[SERVER-37717] Race between Baton::notify() and Waitable::wait() Created: 23/Oct/18 Updated: 29/Oct/23 Resolved: 17/Nov/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Networking |
| Affects Version/s: | 4.0.0, 4.1.4 |
| Fix Version/s: | 4.0.5, 4.1.6 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Mathias Stearn | Assignee: | Mathias Stearn |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Backport Requested: | v4.0 | ||||
| Sprint: | Service Arch 2018-11-05, Service Arch 2018-11-19 | ||||
| Participants: | |||||
| Description |
|
If notify() is called between the final check of the queue in the Baton's run() method and the point where wait() relocks the mutex, it pushes an empty task onto the queue, and that task has no chance to run before the Baton is detached from the OperationContext. This causes an invariant failure when detach() asserts that the queue is empty. The same race exists in 4.0 with a larger window: any time the operation is killed between the last call to run() and the call to detach(). The fix for both branches will be to stop scheduling the empty task on killop and instead always tap the event fd. Until this is fixed, the problem can be worked around by setting the AsyncRequestsSenderUseBaton server parameter to false. |
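The idea behind the fix can be sketched in a few lines. This is a minimal illustration, not the server's code: ToyBaton, schedule(), tap() and pendingTasks() are hypothetical names, and a boolean flag guarded by a condition variable stands in for the event fd so the sketch stays portable. The point it shows is that notify() only wakes the waiter and never enqueues anything, so a late notify() cannot leave a task behind for detach() to trip over.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a Baton; names and layout are illustrative only.
class ToyBaton {
public:
    // Queue real work for the thread that owns the baton, then wake it.
    void schedule(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _tasks.push_back(std::move(task));
        }
        tap();
    }

    // Wake the waiter WITHOUT pushing an empty task onto the queue.
    void notify() {
        tap();
    }

    // Block until tapped, then drain the queue. Returns the number of tasks run.
    std::size_t run() {
        std::deque<std::function<void()>> ready;
        {
            std::unique_lock<std::mutex> lk(_mutex);
            _cv.wait(lk, [this] { return _tapped; });
            _tapped = false;
            ready.swap(_tasks);
        }
        for (auto& task : ready) {
            task();
        }
        return ready.size();
    }

    std::size_t pendingTasks() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _tasks.size();
    }

    // Mirrors the invariant from the description: nothing may be left queued.
    void detach() {
        assert(pendingTasks() == 0);
    }

private:
    // Assumption: this flag plus condition variable models "tapping the event fd".
    void tap() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _tapped = true;
        }
        _cv.notify_one();
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    std::deque<std::function<void()>> _tasks;
    bool _tapped = false;
};
```

With this shape, calling notify() after the last run() leaves only a pending wakeup, which is harmless at detach time, rather than a queued task.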
| Comments |
| Comment by Githook User [ 20/Nov/18 ] |
|
Author: Mathias Stearn (RedBeard0531) <mathias@10gen.com> Message: |
| Comment by Githook User [ 02/Nov/18 ] |
|
Author: Mathias Stearn (RedBeard0531) <mathias@10gen.com> Message: |