Affects Version/s: None
There is a test/format LSM job that got stuck. The configuration file is:
The LSM tree has 20 active chunks. Of those, 5 are flushed and the rest are all in memory. The non-flushed chunks are filling the cache.
There are 4 LSM worker threads, one of which is the manager. One worker can only do switch and drop operations, and that thread is idle. Another is currently doing a merge, but is stuck because the cache is full:
The remaining worker is creating a bloom filter, and is stuck waiting for the cache to get less full:
I think creating bloom filters doesn't expect to get stuck waiting for space in the cache. In
Alternatively we could fiddle with the LSM worker thread work unit assignments, so that the thread that only does switches and drops (very short lived operations) could do flushes as well if we've stopped making progress. The difficulty would be in determining when we are and aren't making progress.
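To make the "are we making progress" question concrete, here is a rough sketch of one possible heuristic. It is not taken from the WiredTiger source: the flag names, the `progress_tracker` structure and the `STALL_LIMIT` threshold are all hypothetical. The idea is that the manager periodically samples a count of completed work units, and if that count has not moved for several consecutive checks while the cache is full, the switch/drop-only thread is also allowed to take flushes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical work-unit flags; not the actual WiredTiger definitions. */
#define WORK_SWITCH 0x01u
#define WORK_DROP   0x02u
#define WORK_FLUSH  0x04u

/* Hypothetical threshold: consecutive stalled checks before escalating. */
#define STALL_LIMIT 3u

struct progress_tracker {
    uint64_t last_completed; /* completed-unit count at the previous check */
    unsigned stalled_checks; /* consecutive checks with no new completions */
};

/*
 * Record one periodic check. Returns true once the cache has been full and
 * no work unit has completed for STALL_LIMIT consecutive checks.
 */
static bool
progress_stalled(struct progress_tracker *t, uint64_t completed, bool cache_full)
{
    assert(t != NULL);

    /* Any completed unit, or a cache with room, counts as progress. */
    if (completed != t->last_completed || !cache_full) {
        t->last_completed = completed;
        t->stalled_checks = 0;
        return false;
    }
    if (t->stalled_checks < STALL_LIMIT)
        ++t->stalled_checks;
    return t->stalled_checks >= STALL_LIMIT;
}

/*
 * Work types for the thread that normally only does switches and drops.
 * It picks up flushes only while the tree is considered stalled.
 */
static unsigned
short_lived_worker_flags(bool stalled)
{
    unsigned flags = WORK_SWITCH | WORK_DROP;

    if (stalled)
        flags |= WORK_FLUSH;
    return flags;
}
```

The manager would call `progress_stalled` on each pass and hand the result to `short_lived_worker_flags` when assigning work. The stall state clears as soon as any unit completes, so the thread goes back to switches and drops only. Whether a completed-unit counter is a good enough signal, and what the threshold should be, are open questions.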