We believe we have run into a potential snapshot corruption issue in 2.8.0 when running compaction concurrently with writes. A minimal repro case is attached below, so we would appreciate comments if there is perhaps a problem with what we're doing. What appears to happen is that the default snapshot created during compaction has a table and its index out of sync with each other (records that exist in one are missing from the other, and vice versa). Calling session.verify() does not report any problems; however, explicitly comparing the records in the index against the table uncovers a large number of mismatches.
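For reference, the manual cross-check is conceptually a two-way scan: every table row must be reachable through the index, and every index entry must point at a live table row. A minimal sketch of that comparison is below; the maps, class name, and method name are ours for illustration (the real code walks cursors over the on-disk table and index rather than in-memory maps):

```java
import java.util.Map;

public class IndexCrossCheck {
    // Counts mismatches in both directions: table rows whose value is not
    // reachable through the index, and index entries that do not resolve
    // back to a matching table row.
    static long countMismatches(Map<Long, String> table, Map<String, Long> index) {
        long mismatches = 0;
        for (Map.Entry<Long, String> row : table.entrySet()) {
            Long viaIndex = index.get(row.getValue());
            if (viaIndex == null || !viaIndex.equals(row.getKey())) {
                mismatches++; // table row not reachable through the index
            }
        }
        for (Map.Entry<String, Long> entry : index.entrySet()) {
            String viaTable = table.get(entry.getValue());
            if (viaTable == null || !viaTable.equals(entry.getKey())) {
                mismatches++; // index entry with no matching table row
            }
        }
        return mismatches;
    }
}
```

Running the same scan against a consistent table/index pair returns 0, which is what we see on the first run before compaction.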
We have not yet been able to fully confirm whether only the snapshot is corrupted or whether the state of the BTree is affected as well. The early indication is that only the snapshot is impacted, because explicitly calling checkpoint() after compact() brings the table and index back into a consistent state.
Here is the minimal repro case (using the Java APIs, but I don't think that is relevant here). The case goes like this:
· Open a connection and create a table with one index.
· Verify the contents of the table against the contents of the index. (Note: there are no mutations at this time, so we expect both to match perfectly.)
· Start one writer thread (90/10 ratio of adds to deletes).
· Sleep 10 seconds.
· Run compaction on both the table and the index.
· Sleep 10 more seconds.
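The writer thread's 90/10 workload shape can be sketched as follows. This simulates the mix against an in-memory map purely to show the operation ratio; the class and method names are ours, and the real thread issues the equivalent inserts and deletes through the database session while compaction runs concurrently:

```java
import java.util.Random;
import java.util.concurrent.ConcurrentMap;

public class WriterWorkload {
    // Applies `ops` operations: roughly 90% inserts of fresh keys and 10%
    // deletes of previously inserted keys, matching the repro's writer
    // thread. Returns the number of delete operations issued.
    static long run(ConcurrentMap<Long, String> table, long ops, long seed) {
        Random rng = new Random(seed);
        long nextKey = 0;
        long deletes = 0;
        for (long i = 0; i < ops; i++) {
            if (nextKey > 0 && rng.nextInt(10) == 0) { // ~10% of operations
                table.remove((long) rng.nextInt((int) nextKey));
                deletes++;
            } else {
                table.put(nextKey++, "value-" + i);
            }
        }
        return deletes;
    }
}
```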
When run the first time, we get the expected “0 corrupted records out of 0”. When run the second time (note that a snapshot now exists, created by compaction), we get the following error: “6598 corrupted records out of 1138935”.