Apply bit encoding to "version" and "read version" in address cookie

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • WT12.0.0, 8.3.0-rc0
    • Affects Version/s: None
    • Component/s: Block Manager
    • Storage Engines, Storage Engines - Persistence
    • SE Persistence - 2025-08-15

      To make the address cookie more future-proof, use a bit-packing scheme that can encode arbitrarily large numbers while staying compact for small values.

      The proposed packing format is:

      • The encoded data is split into 4-bit chunks: F v v v.
      • If the flag bit F (the chunk's MSB) is 0, this is the last chunk of the number.
      • If F is 1, the next chunk is part of the same number.
      • The remaining 3 bits (v v v) encode the value.
      • For every chunk after the first, the actual value is one greater than the decoded 3 bits.
      • The sequence is LSB-first, meaning the least significant chunk comes first.
        • Rationale: simplifies encoding and decoding.
      • Within each byte, the low nibble holds the first chunk and the high nibble holds the second.
        • Rationale: small integers encode to their own value, which makes debugging easier.
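
The rules above can be sketched in Python as follows. This is one possible reading of the format, not the attached small_int.py: the function names (encode_chunks, decode_chunks, pack_nibbles, unpack_nibbles) are hypothetical, the "one greater" rule is assumed to apply to each later chunk's 3-bit value before it is shifted into place, and an unused high nibble is assumed to be padded with zero.

```python
def encode_chunks(n):
    """Return the 4-bit chunks for n >= 0, least significant chunk first."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    chunks = []
    while True:
        low3 = n & 0x7
        n >>= 3
        if n == 0:
            chunks.append(low3)        # flag bit clear: last chunk
            return chunks
        chunks.append(0x8 | low3)      # flag bit set: more chunks follow
        n -= 1                         # later chunks store "value - 1" (assumed reading)


def decode_chunks(chunks):
    """Decode one number from a chunk sequence; return (value, chunks_used)."""
    value = 0
    shift = 0
    for used, chunk in enumerate(chunks, start=1):
        payload = chunk & 0x7
        if shift:                      # every chunk after the first is offset by one
            payload += 1
        value += payload << shift
        shift += 3
        if not (chunk & 0x8):          # flag bit clear: this was the last chunk
            return value, used
    raise ValueError("truncated chunk sequence")


def pack_nibbles(chunks):
    """Pack chunks into bytes: low nibble first, then high nibble."""
    out = bytearray()
    for i in range(0, len(chunks), 2):
        low = chunks[i]
        high = chunks[i + 1] if i + 1 < len(chunks) else 0  # assumed zero padding
        out.append(low | (high << 4))
    return bytes(out)


def unpack_nibbles(data):
    """Split bytes back into chunks in the same low-then-high order."""
    chunks = []
    for byte in data:
        chunks.append(byte & 0xF)
        chunks.append(byte >> 4)
    return chunks


print(encode_chunks(5))                                      # [5]
print(encode_chunks(8))                                      # [8, 0]
print(pack_nibbles(encode_chunks(3) + encode_chunks(5)))     # b'S' (0x53)
```

Under this reading every non-negative integer has exactly one chunk sequence, and a value below 8 packed alone produces a byte equal to the value itself.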

      To encode signed integers, first apply a zigzag-like transformation that maps them to non-negative integers, then encode the result with the bit-packing scheme above.
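
The ticket does not spell out the exact transformation. As an illustration only, the sketch below uses the standard zigzag mapping (0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ...); the names zigzag_encode and zigzag_decode are hypothetical, and the actual "zigzag-like" transformation may differ.

```python
def zigzag_encode(n):
    """Map a signed integer to a non-negative one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return n * 2 if n >= 0 else -n * 2 - 1


def zigzag_decode(z):
    """Inverse of zigzag_encode for z >= 0."""
    if z < 0:
        raise ValueError("zigzag-encoded values are non-negative")
    return z >> 1 if (z & 1) == 0 else -((z + 1) >> 1)


for n in (0, -1, 1, -2, 2):
    print(n, "->", zigzag_encode(n))   # 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4
```

Values close to zero, whether positive or negative, map to small non-negative integers, so they stay cheap under a packing scheme that favors small values.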

      Why 4-bit packing?

      • Offers a reasonable compression rate.
      • Encoding/decoding is lightweight.
      • Two small values (less than 8) fit in a single byte.
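
To make the size argument concrete, the following sketch counts the chunks and bytes needed. It assumes the same reading of the format as the list above (each chunk after the first stores its value minus one) and that consecutive values share one nibble stream; chunk_count and packed_size are hypothetical helper names.

```python
def chunk_count(n):
    """Number of 4-bit chunks needed for a non-negative integer n."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    count = 1
    rest = n >> 3
    while rest:
        count += 1
        rest = (rest - 1) >> 3   # later chunks store "value - 1" (assumed reading)
    return count


def packed_size(values):
    """Bytes needed when the chunks of all values share one nibble stream."""
    nibbles = sum(chunk_count(v) for v in values)
    return (nibbles + 1) // 2    # two chunks per byte, rounded up


print(packed_size([3, 5]))       # 1
print(chunk_count(71))           # 2
print(chunk_count(72))           # 3
```

Under these assumptions, values up to 7 take one chunk, values up to 71 take two, and values up to 583 take three.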

      A demo/proof-of-concept Python implementation is attached.

        1. small_int.py
          7 kB
          Yury Ershov
        2. small_int.txt
          38 kB
          Yury Ershov

              Assignee:
              Yury Ershov
              Reporter:
              Yury Ershov
