WiredTiger / WT-9822

Create a tool to parse the contents of a corrupted block

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Minor - P4
    • Fix Version/s: WT11.2.0, 7.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • StorEng - Refinement Pipeline

      Summary

      Create a tool that we can use after a checksum mismatch to try to parse the incorrect data and, if it is recognizable, tell us what it is.

      Motivation

      Today when WiredTiger sees a checksum mismatch during a read, it prints a hex dump of the contents of the incorrect block and panics. The dump is almost always useless because the typical engineer has no way to figure out what that data is.

      If we could easily determine what data is in a "corrupt" block, it might help in diagnosing the underlying problem.

      • If the block is unrecognizable garbage, that would rule out an error in higher-level WT code and point the finger at something going wrong in the OS or storage system below WiredTiger.
      • If the block contains recognizable WiredTiger data, that could be useful in debugging. How did that block get written to the wrong place? Or why didn't it get overwritten with the correct data?

      Suggested Solution

      We have code in salvage that will walk through a file trying to find recognizable WiredTiger blocks. We could build on that so that after a checksum mismatch we take a section of the file around the mismatch and use the salvage code to find and print information about any recognizable blocks that overlap with the failed read.
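
      As a rough illustration of the idea above, the following Python sketch scans a region of a file at allocation-size boundaries, keeps any span whose header and checksum look valid, and reports the ones that overlap the failed read. Everything in it is an assumption made for illustration: the 8-byte header (size, then checksum), the 512-byte allocation unit, the use of zlib.crc32, and the names make_block, find_blocks and overlapping_blocks are all hypothetical. It is not WiredTiger's on-disk format and not the salvage code itself.

```python
import struct
import zlib

ALLOC_SIZE = 512               # hypothetical allocation unit, in bytes
HEADER = struct.Struct("<II")  # hypothetical header: block size, CRC32 of block


def make_block(payload: bytes) -> bytes:
    """Build a block in the toy format, padded to a multiple of ALLOC_SIZE."""
    size = -(-(HEADER.size + len(payload)) // ALLOC_SIZE) * ALLOC_SIZE
    body = payload.ljust(size - HEADER.size, b"\0")
    # The checksum covers the whole block with the checksum field zeroed.
    crc = zlib.crc32(HEADER.pack(size, 0) + body)
    return HEADER.pack(size, crc) + body


def find_blocks(data: bytes, base_offset: int = 0):
    """Yield (file_offset, size) for each aligned span whose checksum verifies.

    base_offset is the file offset of data[0]; it is assumed to be aligned
    to ALLOC_SIZE.
    """
    pos = 0
    while pos + HEADER.size <= len(data):
        size, crc = HEADER.unpack_from(data, pos)
        plausible = (
            size >= ALLOC_SIZE
            and size % ALLOC_SIZE == 0
            and pos + size <= len(data)
        )
        if plausible:
            candidate = HEADER.pack(size, 0) + data[pos + HEADER.size:pos + size]
            if zlib.crc32(candidate) == crc:
                yield base_offset + pos, size
                pos += size  # skip past the recognized block
                continue
        pos += ALLOC_SIZE  # not a block here; try the next allocation unit


def overlapping_blocks(data: bytes, base_offset: int,
                       read_offset: int, read_size: int):
    """Return recognized blocks that overlap the failed read's byte range."""
    read_end = read_offset + read_size
    return [
        (off, size)
        for off, size in find_blocks(data, base_offset)
        if off < read_end and off + size > read_offset
    ]
```

      An empty result would correspond to the "unrecognizable garbage" case; a non-empty result gives the offsets and sizes of valid-looking blocks that a fuller tool could then decode and print.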

            Assignee:
            donald.anderson@mongodb.com Donald Anderson
            Reporter:
            keith.smith@mongodb.com Keith Smith
            Votes:
            0
            Watchers:
            6
