The current chunk clone logic (as part of moveChunk) ping-pongs serially between the donor fetching chunk documents and the recipient applying them. Since both steps are potentially slow and might be pulling cold data into cache, there could be a benefit in overlapping them.
By moving one side of this loop into a separate thread, the fetch and apply steps can overlap, which could as much as double the clone speed.
This can be implemented by spawning a ThreadPool at the beginning of the clone phase and putting a fixed-size BlockingQueue (similar to the JournalWriter) between the migrate driver thread and the worker thread.
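The hand-off described above could look roughly like the following sketch. It is only an illustration under stated assumptions: it uses plain C++17 standard library primitives rather than the server's own ThreadPool and BlockingQueue, and the names BoundedQueue, Batch and runPipelinedClone are hypothetical. It also assumes the driver thread does the fetching and the worker thread does the applying; the split could just as well go the other way.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one batch of chunk documents.
using Batch = std::vector<int>;

// Minimal fixed-size blocking queue (illustrative only, not the server's BlockingQueue).
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : _capacity(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full, so the fetching side cannot run far ahead.
    void push(Batch batch) {
        std::unique_lock<std::mutex> lk(_mutex);
        _notFull.wait(lk, [&] { return _queue.size() < _capacity; });
        _queue.push_back(std::move(batch));
        _notEmpty.notify_one();
    }

    // Signals that no further batches will be pushed.
    void close() {
        std::lock_guard<std::mutex> lk(_mutex);
        _closed = true;
        _notEmpty.notify_all();
    }

    // Blocks until a batch is available; returns nullopt once closed and drained.
    std::optional<Batch> pop() {
        std::unique_lock<std::mutex> lk(_mutex);
        _notEmpty.wait(lk, [&] { return !_queue.empty() || _closed; });
        if (_queue.empty()) {
            return std::nullopt;
        }
        Batch batch = std::move(_queue.front());
        _queue.pop_front();
        _notFull.notify_one();
        return batch;
    }

private:
    const std::size_t _capacity;
    std::mutex _mutex;
    std::condition_variable _notFull;
    std::condition_variable _notEmpty;
    std::deque<Batch> _queue;
    bool _closed = false;
};

// Hypothetical driver: "fetches" numBatches batches and hands them to a worker
// thread that "applies" them. Returns the number of documents applied.
std::size_t runPipelinedClone(int numBatches, std::size_t docsPerBatch, std::size_t queueCapacity) {
    BoundedQueue queue(queueCapacity);
    std::size_t applied = 0;

    // Worker thread: drains the queue until it is closed and empty.
    std::thread applier([&] {
        while (auto batch = queue.pop()) {
            applied += batch->size();  // stand-in for inserting the documents
        }
    });

    // Driver thread: stand-in for fetching each batch from the donor.
    for (int i = 0; i < numBatches; ++i) {
        queue.push(Batch(docsPerBatch, i));
    }
    queue.close();
    applier.join();  // join makes the worker's writes to `applied` visible here
    return applied;
}
```

The fixed capacity is what bounds memory use: if applying is slower than fetching, push() blocks and the fetching side stalls instead of buffering the whole chunk.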
It is possible that such an aggressive clone might not be appropriate for the MMAP V1 storage engine, so this optimization should only be done for WiredTiger (WT).