Upgradable structure packing format

    • Type: Technical Debt
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines

      The goal of this ticket is to design a strategy for evolving binary formats in WiredTiger while preserving cross-version compatibility and ensuring robust, error-tolerant handling.

      1. Motivation

      WiredTiger currently delays or avoids implementing features that require binary format changes due to backward/forward compatibility concerns. As a result, promising enhancements are postponed indefinitely.

      Two main problems arise when binary formats change:

      1. Files written in a newer format are unreadable by older versions, which may crash when they attempt to read them.
      2. Even if older versions can parse the data, correctness is not guaranteed (e.g., an older version may be able to read the data safely but not write it, or it may misinterpret features it does not know about).

      Some minor format changes have been made possible by exploiting specific access patterns (e.g., appending to checkpoint data), but a more systematic and safer approach is needed.

      2. Goals

      Define a strategy for evolving large on-disk binary structures in WiredTiger (e.g., file headers, page headers, checkpoint metadata) that ensures:

      1. Safe reading by both newer and older WiredTiger versions, without crashes.
      2. Ability to detect which version of WiredTiger created the data, or which set of features the data uses.
      3. Ability for a given version of WiredTiger to determine whether it can read or write a file written by another version.
      4. Acceptable performance, though not necessarily matching the fastest current methods.

      Non-goal: This strategy will NOT apply to performance-critical or frequently accessed structures (e.g., items within a leaf page).

      3. Expected Deliverables

      • A documented format or schema supporting feature/version detection.
      • Guidelines for modifying binary layouts, including a list of the structures affected by this change.
      • A compatibility matrix or validation logic to identify readable/writable states, which must be maintained as the format evolves.
      • Prototype implementation or simulation for evaluation.
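
      One guideline that a prototype could evaluate is append-only growth behind a length prefix, which generalizes the "appending to checkpoint data" approach mentioned in the motivation. The sketch below is an illustration under assumptions rather than a proposed layout: the structure `example_hdr`, its fields, the 4-byte little-endian size prefix, and the function `example_hdr_decode` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure: version 1 wrote a and b, a later version appended c. */
typedef struct {
    uint32_t a;
    uint32_t b;
    uint32_t c; /* appended later; absent in data written by version 1 */
} example_hdr;

#define EXAMPLE_PREFIX_SIZE 4u /* little-endian body size, in bytes */
#define EXAMPLE_V1_SIZE 8u     /* a + b: the smallest valid body */
#define EXAMPLE_V2_SIZE 12u    /* a + b + c */
#define EXAMPLE_C_DEFAULT 0u   /* value assumed when c is not on disk */

/* Read a 32-bit little-endian integer regardless of host byte order. */
static uint32_t
get_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
      ((uint32_t)p[3] << 24);
}

/*
 * Decode a size-prefixed header. Returns 0 on success and -1 on malformed
 * input; it never reads past buf_len. Trailing bytes written by a newer
 * version are skipped, and fields missing from older data get defaults.
 */
int
example_hdr_decode(const uint8_t *buf, size_t buf_len, example_hdr *out, size_t *consumed)
{
    const uint8_t *body;
    uint32_t size;

    assert(out != NULL && consumed != NULL); /* caller bugs, not data errors */

    if (buf == NULL || buf_len < EXAMPLE_PREFIX_SIZE)
        return -1;
    size = get_u32_le(buf);
    if (size < EXAMPLE_V1_SIZE || size > buf_len - EXAMPLE_PREFIX_SIZE)
        return -1;

    body = buf + EXAMPLE_PREFIX_SIZE;
    out->a = get_u32_le(body);
    out->b = get_u32_le(body + 4);
    out->c = size >= EXAMPLE_V2_SIZE ? get_u32_le(body + 8) : EXAMPLE_C_DEFAULT;

    /* Skip the whole body, including any fields this version does not know. */
    *consumed = EXAMPLE_PREFIX_SIZE + (size_t)size;
    return 0;
}
```

      The length prefix alone only makes parsing safe. Whether it is correct for an older version to ignore an appended field is a separate question that still needs explicit feature or version information.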

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Yury Ershov
            Votes:
            0
            Watchers:
            9