- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Storage Engines
- StorEng - Defined Pipeline
- None
keith.bostic@mongodb.com asked a series of interesting questions about the code that manages whether a particular transaction's read and/or durable timestamps are shared so that the global transaction state can include them. The answers to these questions seemed nuanced enough that I wanted to capture them permanently in JIRA - hence the ticket.
The questions were:
I have a question about this code, if you have a second:
/*
 * __wt_txn_clear_read_timestamp --
 *     Clear a transaction's published read timestamp.
 */
void
__wt_txn_clear_read_timestamp(WT_SESSION_IMPL *session)
{
    WT_TXN *txn;
    WT_TXN_SHARED *txn_shared;

    txn = session->txn;
    txn_shared = WT_SESSION_TXN_SHARED(session);

    if (F_ISSET(txn, WT_TXN_SHARED_TS_READ)) {
        /* Assert the read timestamp is greater than or equal to the pinned timestamp. */
        WT_ASSERT(session, txn_shared->read_timestamp >= S2C(session)->txn_global.pinned_timestamp);
        WT_WRITE_BARRIER();
        F_CLR(txn, WT_TXN_SHARED_TS_READ);
    }
    txn_shared->read_timestamp = WT_TS_NONE;
}
First question: Is the purpose of the WT_TXN_SHARED_TS_READ flag to indicate whether or not the read-timestamp is set?
Second question: If the answer to the first question is “yes”, then what is the point of the write barrier? Shouldn’t this be written as:
    F_CLR(txn, WT_TXN_SHARED_TS_READ);
    WT_PUBLISH(txn_shared->read_timestamp, WT_TS_NONE);
That is, ensure the flag is cleared before the read-TS is set to 0?
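The ordering proposed in the second question can be modelled outside WiredTiger with C11 atomics. This is only a hypothetical sketch, not WiredTiger code: `model_txn`, `MODEL_TS_NONE` and `model_clear_read_timestamp` are invented names, the flag is modelled as a separate atomic field rather than a bit in `txn->flags`, and the release store stands in for WT_PUBLISH on the assumption that WT_PUBLISH issues a write barrier before the assignment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_TS_NONE 0 /* hypothetical stand-in for WT_TS_NONE */

/* Hypothetical model of the per-transaction state discussed in this ticket. */
struct model_txn {
    atomic_bool ts_read_shared;      /* stands in for WT_TXN_SHARED_TS_READ */
    _Atomic uint64_t read_timestamp; /* stands in for txn_shared->read_timestamp */
};

/*
 * model_clear_read_timestamp --
 *     Clear the flag first, then publish "no timestamp" with a release store.
 */
static void
model_clear_read_timestamp(struct model_txn *txn, uint64_t pinned_ts)
{
    if (atomic_load_explicit(&txn->ts_read_shared, memory_order_relaxed)) {
        /* Same sanity check as the original: read timestamp >= pinned timestamp. */
        assert(atomic_load_explicit(&txn->read_timestamp, memory_order_relaxed) >= pinned_ts);
        atomic_store_explicit(&txn->ts_read_shared, false, memory_order_relaxed);
    }
    /*
     * The release store orders the flag clear before the timestamp reset: a thread
     * that acquire-loads the timestamp and sees MODEL_TS_NONE also sees the flag cleared.
     */
    atomic_store_explicit(&txn->read_timestamp, MODEL_TS_NONE, memory_order_release);
}
```

In this model, any reader that pairs the release store with an acquire load of `read_timestamp` cannot observe a zero timestamp while the flag still appears set. Whether WiredTiger's readers need that guarantee is exactly what the question asks.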
Third question: __txn_assert_after_reads() doesn’t check WT_TXN_SHARED_TS_READ, although it does check whether the shared read-TS == 0. Is that correct?
Fourth question: now that a TS of 0 is out-of-bounds, is it possible the flag is no longer needed? I should also note that the durable-TS has similar issues.
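The idea behind the fourth question, treating a timestamp of 0 as the "not set" sentinel in place of a separate flag, might look roughly like the following. This is a hypothetical sketch using C11 atomics: none of the `sketch_*` names exist in WiredTiger, and it assumes that no valid read timestamp is ever 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TS_NONE 0 /* hypothetical stand-in for WT_TS_NONE */

/* Publish a read timestamp; 0 is reserved to mean "none", so reject it. */
static void
sketch_publish_read_ts(_Atomic uint64_t *shared_ts, uint64_t ts)
{
    assert(ts != SKETCH_TS_NONE);
    atomic_store_explicit(shared_ts, ts, memory_order_release);
}

/* Clear the published read timestamp by storing the sentinel. */
static void
sketch_clear_read_ts(_Atomic uint64_t *shared_ts)
{
    atomic_store_explicit(shared_ts, SKETCH_TS_NONE, memory_order_release);
}

/* With 0 out-of-bounds, "is a read timestamp published?" is a single load. */
static bool
sketch_read_ts_is_published(_Atomic uint64_t *shared_ts)
{
    return (atomic_load_explicit(shared_ts, memory_order_acquire) != SKETCH_TS_NONE);
}
```

With a single word carrying both the value and the "is set" state, there is no second field to order against, which is why the flag (and its barrier) might become unnecessary under this assumption.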
Fifth question: is there a simple statement of what the global rwlock is supposed to guarantee? For example, both __txn_assert_after_reads() and __wt_txn_set_read_timestamp() acquire it, and my suspicion is they don’t need to. Given how heavily the MongoDB server updates the oldest/stable timestamps, avoiding that lock when setting a read timestamp, which is also a common operation, is probably worth doing.
Definition of done: please make sure this is appropriately documented (outside of Jira)