We may need to improve the documentation, because log corruption detection has subtly changed how restarts behave. I'm also opening this ticket simply to raise awareness of the change.
One kind of log corruption should never result from an application crash alone. Although a log thread may write only part of its record, the earlier parts of a record are always written before the later parts. On recovery, we now have a way to (usually) tell whether a write completed, and if it did, we can use the checksum to detect corruption. That is, if the checksum does not match and the write was not a partial write, we know the file is corrupt.
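The recovery-time decision described above can be sketched roughly as follows. This is only an illustration: the record layout (a length and CRC32 header, then the body, then a one-byte trailing hint), the names `encode`, `classify`, `Verdict`, and the hint value are all hypothetical and are not WiredTiger's actual on-disk format or code.

```python
import struct
import zlib
from enum import Enum


class Verdict(Enum):
    VALID = "valid"
    PARTIAL = "partial write"
    CORRUPT = "corrupt"


# Hypothetical layout: 4-byte body length, 4-byte CRC32 of the body.
HEADER = struct.Struct("<II")
# Hypothetical trailing hint written as the last byte of every record.
DONE = b"\xA5"


def encode(body: bytes) -> bytes:
    """Build a record: header, body, then the 'write completed' hint."""
    return HEADER.pack(len(body), zlib.crc32(body)) + body + DONE


def classify(slot: bytes) -> Verdict:
    """Classify the bytes found where a record was expected in the log."""
    if len(slot) < HEADER.size:
        return Verdict.PARTIAL
    length, crc = HEADER.unpack_from(slot)
    end = HEADER.size + length
    # No hint at the expected position: assume the write never finished.
    if slot[end:end + 1] != DONE:
        return Verdict.PARTIAL
    body = slot[HEADER.size:end]
    # The hint is present, so a checksum mismatch cannot be a torn write.
    return Verdict.VALID if zlib.crc32(body) == crc else Verdict.CORRUPT


record = encode(b"hello log")
# Application crash: only a prefix was written, the rest is still zeros.
torn = record[:10] + bytes(len(record) - 10)
# Damage inside the body while the trailing hint is intact.
damaged = bytearray(record)
damaged[HEADER.size] ^= 0xFF
```

In this sketch, `classify(record)` yields `Verdict.VALID`, `classify(torn)` yields `Verdict.PARTIAL`, and `classify(bytes(damaged))` yields `Verdict.CORRUPT`. The "(usually)" caveat still applies: a one-byte hint like this can be matched by accident, so the test is a heuristic.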
However, on POSIX at least, there are few guarantees about the order in which writes reach disk, and we don't use fsync to enforce ordering among the pieces of a log record. So after a system crash, we can end up with a log record whose final part was written (including our hint that the write was not a partial one) but whose checksum does not match, because other parts of the record never made it to disk. That now looks like corruption to us.
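To make the difference between the two crash types concrete, here is a small simulation. It assumes a toy record layout (CRC32, body, trailing marker) and a tiny hypothetical block size; `persist`, `hint_present`, and `checksum_ok` are illustrative names, not anything in WiredTiger.

```python
import zlib

BLOCK = 4  # hypothetical block size, kept tiny for readability


def persist(record: bytes, blocks_on_disk: set) -> bytes:
    """Return what a reader sees after a crash.

    Blocks that never reached disk read back as zeros.
    """
    out = bytearray(len(record))
    for start in range(0, len(record), BLOCK):
        if start // BLOCK in blocks_on_disk:
            out[start:start + BLOCK] = record[start:start + BLOCK]
    return bytes(out)


def hint_present(image: bytes) -> bool:
    """True if the trailing 'write completed' marker is on disk."""
    return image[-4:] == b"DONE"


def checksum_ok(image: bytes) -> bool:
    """True if the stored CRC32 matches the body that is on disk."""
    return zlib.crc32(image[4:-4]) == int.from_bytes(image[:4], "little")


body = b"abcdefgh"
# Four blocks: [crc] [abcd] [efgh] [DONE]
record = zlib.crc32(body).to_bytes(4, "little") + body + b"DONE"

# Application crash: the write stopped partway, so only a prefix exists.
app_crash = persist(record, {0, 1})
# System crash: block 3 reached disk but block 2 did not.
sys_crash = persist(record, {0, 1, 3})
```

Here `app_crash` has no trailing marker, so it is recognizable as a partial write. `sys_crash` has the marker but fails the checksum, which is indistinguishable from real file corruption.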
If we see such corruption, recovery fails, and WiredTiger must be started with log salvaging on. This is a change in behavior: previously we tolerated this corruption (it was detected but not reported, the log was truncated at the point of corruption, and recovery continued).
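The behavior change can be summarized with a toy recovery loop. Everything here is hypothetical: `recover`, its `salvage` and `legacy` flags, and `LogCorruptionError` are illustrative stand-ins, not WiredTiger configuration names or APIs, and the sketch assumes that salvaging truncates the log at the corrupt record just as the old behavior did.

```python
class LogCorruptionError(Exception):
    """Hypothetical error: recovery found a corrupt, fully written record."""


def recover(verdicts, salvage=False, legacy=False):
    """Replay records in log order and return how many were applied.

    verdicts is a list of "valid", "partial", or "corrupt" strings, one per
    record. legacy=True models the previous behavior, where corruption was
    silently treated like a partial write.
    """
    replayed = 0
    for verdict in verdicts:
        if verdict == "valid":
            replayed += 1
            continue
        if verdict == "corrupt" and not (salvage or legacy):
            # New behavior: refuse to continue unless salvage was requested.
            raise LogCorruptionError(
                "record %d is corrupt; restart with salvage" % replayed
            )
        # Partial write, or tolerated corruption: truncate the log here.
        break
    return replayed


log = ["valid", "valid", "corrupt", "valid"]
```

With this log, `recover(log, legacy=True)` and `recover(log, salvage=True)` both replay two records and stop, while `recover(log)` raises `LogCorruptionError`. A log that merely ends in a partial write still recovers without salvage in all modes.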