The KVCatalog is a single table, used by non-MMAP storage engines, that backs all collection and index metadata (along with the "Feature Tracker Document", which need not be considered here).
In the KVCatalog, each collection is a document and each index is embedded in its collection document. Currently, writes are only timestamped if there's an associated oplog entry.
Creating an index makes two writes to the catalog. The first happens before the index build starts: it inserts the index entry into the catalog with ready: false, and it is not replicated. The second happens when the index build completes: ready is set to true and an oplog entry is replicated. Because only the second write has an oplog entry, the first write is not timestamped while the second is.
Consider the following sequence:
- The stable timestamp is 1. Collection "foo" is in some initial state.
- Time 2: "foo" gets a new "validation schema" (persisted in foo's document in KVCatalog).
- Time 3: A new index build starts on "foo". This update is not timestamped.
- rollback_to_stable is called on WiredTiger.
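The desired state for "foo" is the initial state, before the updates at times 2 and 3. However, rollback_to_stable will restore the document to its "Time 3" state: the untimestamped write at time 3 is not rolled back, and because it rewrote foo's whole document, it also carries the time 2 validation schema change.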
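The sequence above can be reproduced with a toy model. This is only a sketch: ToyTable, run_scenario, and the field names "validator" and "indexes" are hypothetical and are not WiredTiger or KVCatalog APIs. The model assumes that rollback_to_stable discards only updates whose timestamp is newer than the stable timestamp and keeps untimestamped updates.

```python
from typing import Optional


class ToyTable:
    """Toy model of a timestamped table (hypothetical, not WiredTiger's API)."""

    def __init__(self):
        # key -> list of (timestamp or None, document), oldest first
        self.chains = {}

    def put(self, key, doc, ts: Optional[int] = None):
        self.chains.setdefault(key, []).append((ts, dict(doc)))

    def get(self, key):
        return self.chains[key][-1][1]

    def rollback_to_stable(self, stable_ts):
        # Assumption: untimestamped updates (ts is None) are never discarded.
        for key in list(self.chains):
            self.chains[key] = [
                (ts, doc)
                for ts, doc in self.chains[key]
                if ts is None or ts <= stable_ts
            ]


def run_scenario(index_start_ts: Optional[int]):
    """Replay the sequence; index_start_ts is the timestamp of the ready: false write."""
    catalog = ToyTable()
    # Stable timestamp is 1; "foo" is in its initial state.
    catalog.put("foo", {"validator": None, "indexes": []}, ts=1)

    # Time 2: "foo" gets a new validation schema.
    doc = dict(catalog.get("foo"))
    doc["validator"] = "schema_v2"
    catalog.put("foo", doc, ts=2)

    # Time 3: index build starts; the whole document is rewritten.
    doc = dict(catalog.get("foo"))
    doc["indexes"] = doc["indexes"] + [{"name": "a_1", "ready": False}]
    catalog.put("foo", doc, ts=index_start_ts)

    catalog.rollback_to_stable(1)
    return catalog.get("foo")


if __name__ == "__main__":
    print(run_scenario(None))  # untimestamped: the time 2 and 3 changes survive
    print(run_scenario(3))     # timestamped: back to the initial state
```

In this model, giving the ready: false write a timestamp is enough for the rollback to return "foo" to its initial state.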
My understanding is that it should be legal for ready: false index writes to have any timestamp >= the last write to that document in the KVCatalog and < all future writes to that document in the KVCatalog.
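The proposed window can be written as a small check. This is a sketch of the rule as stated, not existing server code; the function name and parameters are hypothetical, and next_write_ts is None when no later write to the document is known yet.

```python
from typing import Optional


def is_legal_index_start_ts(
    ts: int, last_write_ts: int, next_write_ts: Optional[int] = None
) -> bool:
    """Return True if ts satisfies last_write_ts <= ts < next_write_ts."""
    if ts < last_write_ts:
        return False
    return next_write_ts is None or ts < next_write_ts
```

For the sequence above, the last write to foo's document is at time 2, so a ready: false write at time 2 or 3 would be acceptable under this rule as long as it precedes any later write.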