In sync_tail.cpp, multiApply() assumes that applying the batch always succeeds, then sets minValid to acknowledge that.
multiApply() delegates the work to applyOps(), which simply schedules the work on worker threads.
However, schedule() may return an error to indicate that shutdown is already in progress. sync_tail.cpp ignores the error and still marks that operation as finished.
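The pattern can be illustrated with a minimal, self-contained sketch. It is not the actual sync_tail.cpp code: Status, FakePool, BatchResult and applyBatchIgnoringErrors() are hypothetical stand-ins, and tasks run inline so the result is deterministic. It only shows how dropping the value returned by schedule() lets the caller count operations that never ran.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for a status object; not MongoDB's Status class.
struct Status {
    bool ok;
    std::string reason;
};

// Hypothetical worker pool. Once shut down, schedule() rejects new tasks
// and reports the rejection through the returned Status.
class FakePool {
public:
    Status schedule(std::function<void()> task) {
        if (_shutdown) {
            return {false, "shutdown in progress"};
        }
        task();  // run inline to keep the sketch deterministic
        return {true, ""};
    }

    void shutdown() { _shutdown = true; }

private:
    bool _shutdown = false;
};

struct BatchResult {
    int applied;        // operations that actually ran
    int markedApplied;  // operations the caller believes ran
};

// Mirrors the problematic pattern: the Status from schedule() is discarded,
// so every operation is counted as applied even if the pool rejected it.
// shutdownAfter is the index at which the (simulated) shutdown begins.
BatchResult applyBatchIgnoringErrors(FakePool& pool, int numOps, int shutdownAfter) {
    BatchResult result{0, 0};
    for (int i = 0; i < numOps; ++i) {
        if (i == shutdownAfter) {
            pool.shutdown();
        }
        (void)pool.schedule([&result] { ++result.applied; });  // error ignored
        ++result.markedApplied;
    }
    return result;
}
```

With five operations and a shutdown starting before the third one, only two operations run, yet all five are marked as applied.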
If the shutdown happens after the operations have been scheduled, the secondary will run into another fassert, which is also unexpected. A restart cannot fix the inconsistent state either. This has also been observed in repeated runs of backup_restore.js.
As a result, any kind of operation, including commands and database operations, may be marked as executed by mistake while the secondary is shutting down. This leaves the secondary inconsistent with the primary, with potentially missing or stale documents on secondaries.
To fix this issue, after the on_block_exit of the join call we need to check whether shutdown has happened and, if so, return the empty optime to indicate that the batch is not complete.
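A rough sketch of the proposed check follows. It assumes a simplified model rather than the real server types: OpTime, FakePool, inShutdown() and multiApplySketch() are hypothetical names, a zero timestamp stands in for the empty optime, and the fake pool discards pending tasks on shutdown. The point is only the ordering: join first, then test for shutdown before reporting the last optime.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical optime: a zero timestamp plays the role of the empty optime.
struct OpTime {
    long long timestamp;
    OpTime() : timestamp(0) {}
    explicit OpTime(long long ts) : timestamp(ts) {}
    bool isNull() const { return timestamp == 0; }
};

// Hypothetical worker pool: tasks are queued and run on join(). Shutdown
// rejects new tasks and discards any that are still pending.
class FakePool {
public:
    bool schedule(std::function<void()> task) {
        if (_shutdown) return false;
        _tasks.push_back(std::move(task));
        return true;
    }
    void join() {
        for (auto& task : _tasks) task();
        _tasks.clear();
    }
    void shutdown() {
        _shutdown = true;
        _tasks.clear();
    }
    bool inShutdown() const { return _shutdown; }

private:
    bool _shutdown = false;
    std::vector<std::function<void()>> _tasks;
};

// Schedules every operation, joins, then checks for shutdown. If shutdown
// began at any point, the batch may be incomplete, so the empty optime is
// returned and the caller must not advance minValid.
// shutdownBeforeIndex simulates when shutdown starts (-1 means never).
OpTime multiApplySketch(FakePool& pool, const std::vector<long long>& ops,
                        int shutdownBeforeIndex, std::vector<long long>& applied) {
    for (std::size_t i = 0; i < ops.size(); ++i) {
        if (static_cast<int>(i) == shutdownBeforeIndex) pool.shutdown();
        const long long ts = ops[i];
        pool.schedule([&applied, ts] { applied.push_back(ts); });
    }
    pool.join();
    if (pool.inShutdown() || ops.empty()) {
        return OpTime();  // batch not complete
    }
    return OpTime(ops.back());
}
```

In this model the single post-join check covers both failure modes: tasks rejected by schedule() and tasks dropped by a shutdown that started after scheduling.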