  1. WiredTiger
  2. WT-13170

Add a --redact option to wt_binary_decode.py

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines

      As wt_binary_decode.py evolves, it is becoming more useful for viewing data in .wt files. WT-13167 extends that by adding BSON dumping.

      We should have an option to redact information about keys and/or data; possibly it should be the default. When redaction is on, the tool should not show BSON dump information, or even decoded bytes.
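      A minimal sketch of how such an option might be wired up, not the actual wt_binary_decode.py code. The names build_parser and render_value are hypothetical, and defaulting redaction to on is only one of the two choices floated above. argparse.BooleanOptionalAction (Python 3.9+) provides a matching --no-redact switch.

```python
import argparse


def build_parser():
    # Hypothetical parser; the real wt_binary_decode.py options may differ.
    parser = argparse.ArgumentParser(description="redaction option sketch")
    parser.add_argument(
        "--redact",
        action=argparse.BooleanOptionalAction,  # also accepts --no-redact
        default=True,  # assumption: redaction on unless explicitly disabled
        help="hide key/value contents in the output",
    )
    return parser


def render_value(raw, redact):
    # With redaction on, report only the length, never the bytes themselves.
    if redact:
        return f"<redacted, {len(raw)} bytes>"
    return raw.hex()
```

      With this shape, every place that prints key or value bytes would go through one function, so the redaction decision lives in a single spot.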

      Two points. First, we should still go to the trouble of decoding internally (for example, calling the BSON dumper when asked for), but just not print the result. That would verify that the information is not corrupted. Second, instead of showing any bytes, we could print an MD5 hash of the bytes. That might allow us to compare two sets of data and verify that they are the same (without revealing anything), for example by looking at two equivalent leaf blocks from two checkpoints to see whether items were inserted or deleted.
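      The two points above can be sketched as follows, under stated assumptions: describe_item, redacted_digest, and diff_digests are hypothetical names, and decode stands in for whatever decoder (such as the BSON dumper) the tool would call. The decoder always runs, so corrupt data still raises an error, but in redacted mode only the MD5 digest and length are returned.

```python
import hashlib


def redacted_digest(raw):
    # MD5 serves only as a fingerprint for equality checks, as proposed above.
    return hashlib.md5(raw).hexdigest()


def describe_item(raw, decode, redact=True):
    # Always decode, so corruption is detected even when nothing is printed.
    decoded = decode(raw)
    if redact:
        return f"md5={redacted_digest(raw)} len={len(raw)}"
    return repr(decoded)


def diff_digests(old_items, new_items):
    # Compare two sets of items (e.g. from two checkpoints) by digest only.
    old = {redacted_digest(item) for item in old_items}
    new = {redacted_digest(item) for item in new_items}
    inserted = sorted(new - old)
    deleted = sorted(old - new)
    return inserted, deleted
```

      Comparing the digest lists from two equivalent leaf blocks would then show which items appeared or disappeared between checkpoints without exposing their contents.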

      If we had this facility, then we could ask TSEs or customers to run decode and send us results, and customer information would remain protected.

       

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            donald.anderson@mongodb.com Donald Anderson
            Votes:
            0
            Watchers:
            3

              Created:
              Updated: