[SERVER-55205] Running shouldStopIteration inline in AsyncTryUntilWithDelay can create deadlock
Created: 15/Mar/21  Updated: 27/Oct/23  Resolved: 02/Apr/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Jason Chan
Assignee: Backlog - Service Architecture
Resolution: Gone away
Votes: 0
Labels: servicearch-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams: Service Arch
Operating System: ALL
Participants:
Story Points: 4

Description

It's possible for the following to happen:

1. A service acquires mutex mx and starts some cleanup work.
2. As part of that cleanup, the service calls setError on the promises it owns while still holding mx.
3. AsyncTryUntilWithDelay::runImpl reaches the end of an iteration and invokes shouldStopIteration inline, and shouldStopIteration also tries to acquire mx.
4. The result is a deadlock: the service cannot finish its cleanup work and release mx, while AsyncTryUntilWithDelay is stuck waiting on mx.
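
The steps above can be sketched in standalone C++. This is only an illustration under stated assumptions, not MongoDB's code: InlinePromise, Service, cleanup and demoCleanupWithoutDeadlock are hypothetical names, and the sketch assumes the promise continuation runs inline on the thread that calls setError. The hazardous shape appears only as a comment. The cleanup shown is one common way to avoid it (complete the promises after releasing mx) and is not necessarily the approach this ticket would take.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a promise: its continuation runs inline,
// on the thread that calls setError().
class InlinePromise {
public:
    void onError(std::function<void(const std::string&)> cb) {
        _cb = std::move(cb);
    }
    void setError(const std::string& reason) {
        if (_cb) {
            _cb(reason);
        }
    }

private:
    std::function<void(const std::string&)> _cb;
};

// Hypothetical service whose state is guarded by a mutex (mx in the ticket).
class Service {
public:
    InlinePromise& makePromise() {
        std::lock_guard<std::mutex> lk(_mx);
        _promises.emplace_back();
        return _promises.back();
    }

    // Plays the role of shouldStopIteration: it needs mx to read state.
    bool shouldStopIteration() {
        std::lock_guard<std::mutex> lk(_mx);
        return _shuttingDown;
    }

    // Hazardous shape, shown only as a comment:
    //   std::lock_guard<std::mutex> lk(_mx);
    //   for (auto& p : _promises) p.setError("shutdown");
    // The inline continuation would re-lock _mx on the owning thread, which
    // is undefined behavior for std::mutex and typically hangs.

    // Safer shape: take ownership of the promises under mx, release mx,
    // and only then run the continuations.
    void cleanup() {
        std::deque<InlinePromise> toFail;
        {
            std::lock_guard<std::mutex> lk(_mx);
            _shuttingDown = true;
            toFail.swap(_promises);
        }
        for (auto& p : toFail) {
            p.setError("shutdown");
        }
    }

private:
    std::mutex _mx;
    bool _shuttingDown = false;
    std::deque<InlinePromise> _promises;
};

// Registers a continuation that calls shouldStopIteration, then runs cleanup.
// Returns the value the continuation observed.
bool demoCleanupWithoutDeadlock() {
    Service svc;
    assert(!svc.shouldStopIteration());
    bool sawStop = false;
    svc.makePromise().onError([&](const std::string&) {
        sawStop = svc.shouldStopIteration();
    });
    svc.cleanup();
    return sawStop;
}
```

Calling demoCleanupWithoutDeadlock() returns true in this sketch. By the time the continuation runs, mx has been released, so the continuation can lock it and sees the shutdown flag already set.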



Comments
Comment by Matthew Saltz (Inactive) [ 02/Apr/21 ]

The revert of SERVER-54735 made this go away. When we re-do that ticket we should address this issue.
