Priority: Major - P3
Affects Version/s: None
Fix Version/s: 5.2 Desired
The implementation of ContinuousTenantMigration suggests that when we pause after a test, we expect the hook to be in a state in which no migrations are in progress. This expectation can be violated.
Suppose this sequence of events takes place:
- The tenant migrations thread is started and stalls just before self._is_idle_evt.clear(). It has already checked that a tenant migration is permitted.
- The main resmoke thread of execution is done with the test and attempts to pause the tenant migrations thread. Marking the test as finished in pause() has no effect at this point, since the tenant migrations thread has already run past wait_for_tenant_migration_permitted().
- Since the tenant migrations thread has not yet performed self._is_idle_evt.clear(), the idle check in pause() succeeds, and we believe we have finished pausing the tenant migrations thread.
- However, the tenant migrations thread is free to proceed and does not know it should pause.
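The interleaving above can be reproduced with a minimal sketch. This is not the hook's actual code: the class MigrationThreadSketch, the _permitted_evt event standing in for wait_for_tenant_migration_permitted(), and the passed_permission_check / resume events used to force the ordering are all hypothetical. Only the name _is_idle_evt is taken from the description.

```python
import threading


class MigrationThreadSketch:
    """Hypothetical, simplified stand-in for the tenant migrations thread."""

    def __init__(self):
        self._is_idle_evt = threading.Event()
        self._is_idle_evt.set()        # idle until a migration starts
        self._permitted_evt = threading.Event()
        self._permitted_evt.set()      # a test is running, migrations allowed
        # Hypothetical hooks that force the bad interleaving deterministically.
        self.passed_permission_check = threading.Event()
        self.resume = threading.Event()
        self.pause_returned = False
        self.migrated_after_pause = False

    def run_once(self):
        # Step 1: the permission check passes while the test is still running.
        self._permitted_evt.wait()
        self.passed_permission_check.set()
        # Simulated stall just before _is_idle_evt.clear().
        self.resume.wait()
        self._is_idle_evt.clear()
        # The "migration" happens here; record whether pause() already returned.
        self.migrated_after_pause = self.pause_returned
        self._is_idle_evt.set()

    def pause(self):
        # Step 2: revoking permission is too late, the check has been passed.
        self._permitted_evt.clear()
        # Step 3: the idle event is still set, so this returns immediately.
        self._is_idle_evt.wait()
        self.pause_returned = True


def demonstrate_race():
    sketch = MigrationThreadSketch()
    worker = threading.Thread(target=sketch.run_once)
    worker.start()
    sketch.passed_permission_check.wait()
    sketch.pause()
    # Step 4: the worker proceeds, unaware that it was supposed to pause.
    sketch.resume.set()
    worker.join()
    return sketch.migrated_after_pause
```

In this sketch demonstrate_race() reports that the migration ran after pause() had already returned, which is the violated expectation. The gap exists because the permission check and the clearing of the idle event are two separate steps with nothing making them atomic with respect to pause().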
There is also a sequence of steps involving stop(), which comes into play once all tests have been completed, that prevents the tenant migrations thread from ever terminating.