- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- Labels: None
- Fully Compatible
- Repl 2019-12-16
The fix for SERVER-44809 was wrong, and this ticket reverts it. What actually appears to happen is that the lambda for the allDatabaseCloner is destroyed asynchronously, on the cloner executor, after the future is made ready. As a result, the finishCallback for the onCompletion guard can run on the cloner executor after we enter net->runUntil(), which schedules the next attempt too late.
Destroying the onCompletion shared pointer in the lambda while holding the initial syncer mutex ensures that the final destruction happens somewhere else, because at that point we know there are other references to the shared pointer (except in the shutdown case).
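The reference-counting argument can be illustrated with a minimal, single-threaded sketch. This is not the server code: `CompletionGuard`, `whereCallbackRuns`, and the "caller"/"cloner" context labels are hypothetical stand-ins, and the local `std::mutex` only marks where the initial syncer mutex would be held. The sketch assumes a guard that runs its callback when the last `std::shared_ptr` reference goes away, and shows that the callback runs wherever that last reference is dropped.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the onCompletion guard: it runs a callback when
// the last shared_ptr reference to it is destroyed.
class CompletionGuard {
public:
    explicit CompletionGuard(std::function<void()> finishCallback)
        : _finishCallback(std::move(finishCallback)) {}
    ~CompletionGuard() { _finishCallback(); }
    CompletionGuard(const CompletionGuard&) = delete;
    CompletionGuard& operator=(const CompletionGuard&) = delete;

private:
    std::function<void()> _finishCallback;
};

// Returns the label of the (simulated) context in which the finish callback
// ran. Threads are modelled by a string label so the result is deterministic.
std::string whereCallbackRuns(bool resetInsideLambda) {
    std::mutex mutex;  // stands in for the initial syncer mutex (assumption)
    std::string currentContext = "caller";
    std::string callbackContext = "never";

    auto onCompletion = std::make_shared<CompletionGuard>(
        [&] { callbackContext = currentContext; });

    // The cloner's completion lambda holds its own reference to the guard.
    std::function<void()> clonerCallback =
        [&mutex, resetInsideLambda, guard = onCompletion]() mutable {
            if (resetInsideLambda) {
                std::lock_guard<std::mutex> lk(mutex);
                // Not the last reference: the caller still holds one, so the
                // guard's destructor does not run here.
                guard.reset();
            }
        };

    // The lambda body runs on the cloner executor.
    currentContext = "cloner";
    clonerCallback();

    // The caller releases its reference.
    currentContext = "caller";
    onCompletion.reset();

    // The lambda object itself is destroyed later, on the cloner executor.
    currentContext = "cloner";
    clonerCallback = nullptr;

    return callbackContext;
}
```

In this sketch, `whereCallbackRuns(false)` reports "cloner" because the lambda's destruction drops the last reference, while `whereCallbackRuns(true)` reports "caller" because the lambda gave up its reference early and the caller's release became the final one.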