This ticket should fix the data races in PrimaryOnlyService and simplify the interruption/shutdown process. Currently, there are two known data races:
- The list of operations is accessed here (also shown below) without any synchronization.
- We identify running instances by checking _running, a non-synchronized boolean that guards another non-synchronized variable (i.e., _finishedNotifyFuture). Both values are set by an executor thread (here) without any synchronization.
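One possible direction for both races is to put the running flag, the finished notification, and the operation list behind a single mutex. The sketch below is only an illustration using the standard library: the class InstanceSketch, its methods, and the use of plain ints for operations are all hypothetical and are not the actual PrimaryOnlyService types.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a service instance; not the real class.
class InstanceSketch {
public:
    // Called by the executor thread when the instance starts running.
    void markRunning() {
        std::lock_guard<std::mutex> lk(_mutex);
        _running = true;
        _finished = false;
    }

    // Called by the executor thread when the instance completes.
    // _finished stands in for fulfilling _finishedNotifyFuture.
    void markFinished() {
        std::lock_guard<std::mutex> lk(_mutex);
        assert(_running);  // finishing an instance that never ran is a logic error
        _running = false;
        _finished = true;
    }

    bool isRunning() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _running;
    }

    bool isFinished() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _finished;
    }

    // Operations are modeled as integer ids for illustration only.
    void registerOperation(int opId) {
        std::lock_guard<std::mutex> lk(_mutex);
        _operations.push_back(opId);
    }

    // Returns a copy taken under the lock, so callers never iterate the
    // shared list while another thread mutates it.
    std::vector<int> snapshotOperations() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _operations;
    }

private:
    mutable std::mutex _mutex;     // guards every member below
    bool _running = false;
    bool _finished = false;
    std::vector<int> _operations;
};
```

Because _running and the finished state are always read and written under the same lock, a reader can never observe one updated without the other.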
This ticket should also propose a fix for the interruption pattern of primary-only services. Interrupting the operations, as done here and here, is not sufficient: it is inherently racy, because another thread may create a new operation after the shutdown/stepDown thread has finished interrupting/killing the existing operations. One solution could be to piggyback on the interrupt interface (defined here) and throw the interruption status on any attempt to create a new opCtx (e.g., by throwing from the operation observer in the primary-only service).
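The proposed pattern can be sketched as follows. This is a minimal model, not the real interrupt interface or operation observer: OperationGateSketch, onCreateOperation, onDestroyOperation, and interrupt are hypothetical names, operations are plain integer ids, and std::runtime_error stands in for the interruption status. The key point is that the interrupted flag is checked and the new operation is registered under the same lock that interrupt() holds, so no operation can slip in after the existing ones were collected for killing.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical gate that models "throw on any attempt to create a new opCtx".
class OperationGateSketch {
public:
    // Models the hook that would run when a new operation is created.
    // Throws once the service has been interrupted.
    void onCreateOperation(int opId) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_interrupted) {
            throw std::runtime_error("primary-only service interrupted");
        }
        _operations.insert(opId);
    }

    // Models the hook that would run when an operation is destroyed.
    void onDestroyOperation(int opId) {
        std::lock_guard<std::mutex> lk(_mutex);
        _operations.erase(opId);
    }

    // Called by the shutdown/stepDown thread. Marks the gate as interrupted
    // and returns the ids of the operations the caller should kill.
    std::vector<int> interrupt() {
        std::lock_guard<std::mutex> lk(_mutex);
        // Once interrupted, nothing can register, so the set must be empty.
        assert(!_interrupted || _operations.empty());
        _interrupted = true;
        std::vector<int> toKill(_operations.begin(), _operations.end());
        _operations.clear();
        return toKill;
    }

private:
    std::mutex _mutex;         // guards _interrupted and _operations
    bool _interrupted = false;
    std::set<int> _operations;
};
```

With this shape, the shutdown/stepDown thread only needs to kill the ids returned by interrupt(); any later attempt to create an operation fails at creation time instead of racing with the kill loop.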