Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-54760

(4.2) Ghost timestamps can cause concurrent causal snapshot reads to not read their own writes

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 4.2.13
    • Affects Version/s: None
    • Component/s: Storage
    • None
    • Fully Compatible
    • ALL
    • Sharding 2021-03-08
    • 35

      We recently fixed an issue in 4.2 where the "all durable" timestamp can go backwards. However 4.2 maintains two variables (1 and 2) that each independently prevent their readers from seeing a value going backwards.

      These variables do not communicate with each other and are not synchronized in any way. Thus a read from a method that looks up against one value can see an "all durable" TS. Following that with a read on a different method can see a different "all durable" smaller than the previously observed TS (due to ghost timestamp writes [1]).

      A decreasing all_durable can be observed when a startTransaction request has an afterClusterTime and level: "snapshot".

      The afterClusterTime first waits for all earlier writes to complete. This logic compares against "variable 1".

      Then we open a WT snapshot using the "all durable" read source. This code path compares against "variable 2", breaking the guarantee necessary to read at or after the afterClusterTime.

      [1] WT's all_durable only goes backwards on the primary due to "ghost timestamp" writes. On 4.2, this is primarily from a multikey write inside of a multi-statement transaction.

            daniel.gottlieb@mongodb.com Daniel Gottlieb (Inactive)
            daniel.gottlieb@mongodb.com Daniel Gottlieb (Inactive)
            0 Vote for this issue
            4 Start watching this issue