We should add statistics that track the rarer history store insertion types; they will help us debug failures in the field. If we see any of these paths in use, it could point us to where the problem lies.
Statistic tracking when something is moved to the history store from the data file due to a prepared update being evicted.
Statistic tracking when a prepared update is read back into cache.
Statistic tracking when an update with an explicit tombstone is moved to the history store (i.e. the stop doesn't match the start of the next newer update in the chain).
Statistics tracking when out-of-order timestamps cause fixups (in memory, in the data file, and in the history store).
Statistics tracking when mixed-mode timestamps cause cleanups (I think we have at least some of these).
Statistics showing when a read from the history store returns a fixed-up out-of-order record (start == stop).
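
To make the proposal concrete, here is a minimal sketch of one counter per event listed above. Everything in it is an assumption for illustration: the enum values, the hs_rare_stats struct and the helper functions are hypothetical names, not existing statistics or APIs, and a real change would presumably go through the project's existing statistics machinery rather than a standalone struct.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical identifiers, one per rare history store event in the list above. */
typedef enum {
    HS_STAT_INSERT_PREPARED_EVICT,     /* moved to the HS because a prepared update was evicted */
    HS_STAT_PREPARED_READ_BACK,        /* prepared update read back into cache */
    HS_STAT_INSERT_EXPLICIT_TOMBSTONE, /* update with an explicit tombstone moved to the HS */
    HS_STAT_OOO_FIXUP_MEMORY,          /* out-of-order timestamp fixup in memory */
    HS_STAT_OOO_FIXUP_DATA_FILE,       /* out-of-order timestamp fixup in the data file */
    HS_STAT_OOO_FIXUP_HS,              /* out-of-order timestamp fixup in the history store */
    HS_STAT_MIXED_MODE_CLEANUP,        /* mixed-mode timestamp cleanup */
    HS_STAT_READ_OOO_FIXED_RECORD,     /* HS read of a fixed-up record (start == stop) */
    HS_STAT_COUNT                      /* number of counters, not an event */
} hs_stat_type;

/* Hypothetical container: one 64-bit counter per event type. */
typedef struct {
    uint64_t counters[HS_STAT_COUNT];
} hs_rare_stats;

/* Reset every counter to zero. */
static void
hs_rare_stats_init(hs_rare_stats *stats)
{
    memset(stats, 0, sizeof(*stats));
}

/* Record one occurrence of an event. Not thread-safe in this sketch. */
static void
hs_rare_stat_incr(hs_rare_stats *stats, hs_stat_type type)
{
    assert((int)type >= 0 && (int)type < HS_STAT_COUNT);
    ++stats->counters[type];
}

/* Read the current value of one counter. */
static uint64_t
hs_rare_stat_get(const hs_rare_stats *stats, hs_stat_type type)
{
    assert((int)type >= 0 && (int)type < HS_STAT_COUNT);
    return (stats->counters[type]);
}

/* Return nonzero if any rare path has been taken at least once. */
static int
hs_rare_stats_any(const hs_rare_stats *stats)
{
    int i;

    for (i = 0; i < HS_STAT_COUNT; ++i)
        if (stats->counters[i] != 0)
            return (1);
    return (0);
}
```

Each rare code path would call hs_rare_stat_incr at the point where it fires. When triaging a field failure, a nonzero counter then shows which of these paths were exercised.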