Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: WT2.9.2, 3.2.13, 3.4.3, 3.5.4
Affects Version/s: WT2.9.1
Component/s: None
Labels: None
Sprint: Storage 2017-02-13
Story Points: None

Branch wt-2909-verify-checkpoint-integrity introduces a test that runs a subprogram which performs inserts and checkpoints periodically. During a checkpoint, we cause some file system writes to fail, and we expect the subprogram to fail as a result. The parent program then opens a connection to the (failed) home directory and reads what it can.

The subprogram inserts into two tables within a single transaction. When the failure occurs, we see one table containing many records and the other containing only 1 (we always do a checkpoint after the 1st record). The test expects to see the same number of records in each table. A long comment in test/csuite/wt2909_checkpoint_integrity/main.c describes the test.
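The consistency check the test relies on can be sketched as follows. This is a toy model in Python, not the code in main.c: the function name `verify_recovered_tables` and the report fields are hypothetical, and the inputs stand in for whatever keys the parent program manages to read back from each table.

```python
def verify_recovered_tables(records_a, records_b):
    """Compare the keys read back from two tables after a failed run.

    Each key is assumed to have been inserted into both tables inside
    one transaction, so a consistent checkpoint should expose the same
    keys in each table.
    """
    keys_a, keys_b = set(records_a), set(records_b)
    return {
        "count_a": len(keys_a),
        "count_b": len(keys_b),
        "only_in_a": sorted(keys_a - keys_b),
        "only_in_b": sorted(keys_b - keys_a),
        "consistent": keys_a == keys_b,
    }


# Shape of the reported failure: many records in one table, 1 in the other.
report = verify_recovered_tables(range(1000), range(1))
print(report["count_a"], report["count_b"], report["consistent"])
```

Comparing key sets rather than bare counts also shows which records are missing, which may help when matching the result against the injected write fault.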

There is a caveat to this JIRA report. We must be sure that there is no error in the fail_fs code that violates some assumption of the file system code. In particular, fail_fs does not lock or unlock files, nor does it sync. That is because fail_fs does not need to be durable across system crashes, only across process crashes. Perhaps I missed some other assumption.
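For readers unfamiliar with the idea, write-fault injection of this kind can be modeled roughly as below. This is an illustrative Python sketch and not the fail_fs implementation: the class `FailingWriter`, the `allowed_writes` parameter, and the choice to fail after a fixed number of writes are all assumptions made for the example. Like the description above, the sketch does no locking and no syncing.

```python
import io


class FailingWriter:
    """Wrap a binary file object; permit a fixed number of writes, then fail."""

    def __init__(self, raw, allowed_writes):
        self.raw = raw
        self.remaining = allowed_writes  # hypothetical failure budget

    def write(self, data):
        # Once the budget is spent, every later write raises, which is
        # how the caller observes the injected fault.
        if self.remaining <= 0:
            raise OSError("injected write failure")
        self.remaining -= 1
        return self.raw.write(data)


buf = io.BytesIO()
writer = FailingWriter(buf, allowed_writes=2)
writer.write(b"a")
writer.write(b"b")
try:
    writer.write(b"c")
    failed = False
except OSError:
    failed = True
```

Only the writes that succeeded before the fault reach the underlying buffer, which mirrors the partially written home directory the parent program later opens.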

To see the failure:

cd build_posix/test/csuite
./test_wt2909_checkpoint_integrity -v -o 125

That runs the "top level" test as well as the subtest. To run only the subtest, which populates the tables and uses fail_fs to inject failures, run:

./test_wt2909_checkpoint_integrity subtest -v -p -o 125 -n 50000

At the moment, I've only verified the failure on OS X, where it is consistently reproducible.

For a stack trace of where the write fault was injected, see WT_TEST.subtest/stdout.txt.

Assignee: Susan LoVerso (Inactive)
Reporter: Donald Anderson
Votes: 0
Watchers: 7

Created: Jan 29 2017 09:52:56 PM UTC
Updated: Oct 12 2017 11:11:51 PM UTC
Resolved: Feb 06 2017 04:32:38 PM UTC
