WiredTiger / WT-9354

Clarify semantics of read and durable timestamps in a transaction

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines
    • StorEng - Defined Pipeline

      keith.bostic@mongodb.com asked a series of interesting questions about the code that manages whether a particular transaction has read and/or durable timestamps shared, that is, published for the global transaction state to include. The answers to these questions seemed nuanced enough that I wanted to capture them permanently in JIRA, hence the ticket.

      The questions were:

      I have a question about this code, if you have a second:

      /*
       * __wt_txn_clear_read_timestamp --
       *     Clear a transaction's published read timestamp.
       */
      void
      __wt_txn_clear_read_timestamp(WT_SESSION_IMPL *session)
      {
          WT_TXN *txn;
          WT_TXN_SHARED *txn_shared;
      
          txn = session->txn;
          txn_shared = WT_SESSION_TXN_SHARED(session);
      
          if (F_ISSET(txn, WT_TXN_SHARED_TS_READ)) {
              /* Assert the read timestamp is greater than or equal to the pinned timestamp. */
              WT_ASSERT(session, txn_shared->read_timestamp >= S2C(session)->txn_global.pinned_timestamp);
      
              WT_WRITE_BARRIER();
              F_CLR(txn, WT_TXN_SHARED_TS_READ);
          }
          txn_shared->read_timestamp = WT_TS_NONE;
      }
      

      First question: Is the purpose of the WT_TXN_SHARED_TS_READ flag to indicate whether or not the read-timestamp is set?

      Second question: If the answer to the first question is “yes”, then what is the point of the write barrier? Shouldn’t this be written as:

      F_CLR(txn, WT_TXN_SHARED_TS_READ);
      WT_PUBLISH(txn_shared->read_timestamp, WT_TS_NONE);
      

      That is, ensure the flag is cleared before the read-TS is set to 0?
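
      To make the two orderings in the second question concrete, here is a minimal sketch using C11 atomics. It is not WiredTiger code: the struct, the function names and the use of `<stdatomic.h>` are assumptions made for illustration, the flag and timestamp are folded into one hypothetical struct, and the release fence and release store only approximate WT_WRITE_BARRIER() and WT_PUBLISH(). It does not answer which ordering is correct.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TS_NONE UINT64_C(0) /* stand-in for WT_TS_NONE */

/* Hypothetical, simplified stand-in for the transaction state discussed above. */
struct shared_txn {
    atomic_bool has_read_ts;  /* plays the role of WT_TXN_SHARED_TS_READ */
    _Atomic uint64_t read_ts; /* plays the role of txn_shared->read_timestamp */
};

/* Ordering in the quoted function: barrier, clear the flag, then reset the timestamp. */
static void
clear_read_ts_current(struct shared_txn *t, uint64_t pinned_ts)
{
    if (atomic_load_explicit(&t->has_read_ts, memory_order_relaxed)) {
        assert(atomic_load_explicit(&t->read_ts, memory_order_relaxed) >= pinned_ts);
        /*
         * The fence orders earlier writes before the flag clear. It does not
         * order the flag clear against the timestamp reset that follows.
         */
        atomic_thread_fence(memory_order_release);
        atomic_store_explicit(&t->has_read_ts, false, memory_order_relaxed);
    }
    atomic_store_explicit(&t->read_ts, TS_NONE, memory_order_relaxed);
}

/* Ordering proposed in the question: clear the flag, then publish the reset timestamp. */
static void
clear_read_ts_proposed(struct shared_txn *t)
{
    atomic_store_explicit(&t->has_read_ts, false, memory_order_relaxed);
    /*
     * A thread that acquire-loads read_ts and sees TS_NONE is guaranteed to
     * also see the flag already cleared.
     */
    atomic_store_explicit(&t->read_ts, TS_NONE, memory_order_release);
}
```

      In the sketch, both functions leave the state identical for a single thread; they differ only in what a concurrent reader may observe between the two stores.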

      Third question: __txn_assert_after_reads() doesn’t check WT_TXN_SHARED_TS_READ, although it does check shared-read-TS == 0. Is that correct?

      Fourth question: now that a TS of 0 is out-of-bounds, is it possible the flag is no longer needed? I should also note that the durable-TS has similar issues.
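
      The idea behind the fourth question can be sketched as follows: if 0 can never be a valid timestamp, the value itself can say whether a read timestamp is set, and a separate flag carries no extra information. The helper names below are hypothetical and this is only an illustration of the sentinel approach, not a proposed WiredTiger patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TS_NONE UINT64_C(0) /* stand-in for WT_TS_NONE; assumed never a valid timestamp */

/* Hypothetical helper: the timestamp value itself says whether one is set. */
static bool
read_ts_is_set(uint64_t read_ts)
{
    return (read_ts != TS_NONE);
}

/*
 * Hypothetical setter: rejects the out-of-bounds value so the sentinel stays
 * unambiguous. Returns 0 on success, -1 if the timestamp is not allowed.
 */
static int
set_read_ts(uint64_t *read_tsp, uint64_t ts)
{
    if (ts == TS_NONE)
        return (-1);
    *read_tsp = ts;
    return (0);
}
```

      Whether this is safe in practice depends on the answers to the earlier questions, in particular on whether any reader relies on the flag and the timestamp changing in a specific order.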

      Fifth question: is there a simple statement of what the global rwlock is supposed to guarantee? For example, both __txn_assert_after_reads() and __wt_txn_set_read_timestamp() acquire it, and my suspicion is they don’t need to. Given how heavily the MongoDB server updates the oldest/stable timestamps, avoiding that lock when setting a read timestamp, which is also a common operation, is probably worth doing.

      Definition of done: please make sure this is appropriately documented (outside of Jira)

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            alexander.gorrod@mongodb.com Alexander Gorrod
            Votes:
            0
            Watchers:
            3
