WiredTiger / WT-9313

The meaning of "cluster-level consensus on durability" in prepared transactions with timestamps

    • Type: Documentation
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: WT Docs
    • Storage - Ra 2022-06-13, Storage Engines - 2022-06-27

      While reading the documentation on prepared transactions with timestamps, I was confused by the meaning of "cluster-level consensus on durability".

      From the WiredTiger documentation, "Using transaction prepare with timestamps":

      MongoDB specifies different commit and durable timestamps because prepared transactions are higher-level MongoDB operations, requiring cluster-level consensus on durability. Applications without similar requirements for prepared transactions should set the durable and commit timestamps to the same time.
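The last sentence of that passage can be illustrated with a minimal Python sketch that only builds the configuration string an application might pass when committing a prepared transaction. It assumes WiredTiger's `commit_timestamp` and `durable_timestamp` configuration keys and its convention of hexadecimal timestamp strings. The helper name `commit_config` is hypothetical, and the snippet does not call WiredTiger itself.

```python
def commit_config(commit_ts, durable_ts=None):
    """Build a commit configuration string (hypothetical helper).

    When no durable timestamp is supplied, the commit timestamp is reused,
    which is what the documentation suggests for applications that do not
    share MongoDB's requirements.
    """
    if durable_ts is None:
        durable_ts = commit_ts
    if durable_ts < commit_ts:
        # Assumed rule: the durable timestamp may not precede the commit timestamp.
        raise ValueError("durable timestamp must not be earlier than commit timestamp")
    # Timestamps are rendered as hexadecimal strings (assumed encoding).
    return "commit_timestamp={:x},durable_timestamp={:x}".format(commit_ts, durable_ts)
```

For example, `commit_config(10)` yields a string in which both timestamps are `a`, while `commit_config(10, 12)` keeps the commit timestamp at `a` and sets the durable timestamp to `c`.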

      As far as I know, in MongoDB the commit timestamp and the durable timestamp are the same when a transaction commits on a single shard without a prepare phase.

      But in 2PC, even though the commit timestamp is the same on every shard, each shard sets its durable timestamp to its own latest cluster timestamp when committing. In other words, for a distributed transaction, the shards have the same commit timestamp but different durable timestamps.
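The behavior described above can be modeled with a small toy simulation. This is only a sketch of the reporter's description, not MongoDB or WiredTiger code: the `Shard` class, the integer clocks, and the `two_phase_commit` function are all hypothetical, and clamping the durable timestamp so it never precedes the commit timestamp is an assumption added here.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    name: str
    cluster_time: int  # latest cluster time known to this shard (toy integer clock)

    def commit_prepared(self, commit_ts):
        # Assumption taken from the description: the durable timestamp is the
        # shard's own latest cluster time at the moment it commits. The max()
        # keeps it from falling behind the commit timestamp (added assumption).
        durable_ts = max(self.cluster_time, commit_ts)
        return commit_ts, durable_ts


def two_phase_commit(shards, commit_ts):
    # The coordinator hands every participant the same commit timestamp.
    return {s.name: s.commit_prepared(commit_ts) for s in shards}


result = two_phase_commit([Shard("A", 105), Shard("B", 110)], commit_ts=100)
print(result)
```

In this model both shards report the commit timestamp 100, while shard A records a durable timestamp of 105 and shard B records 110, which is the "same commit timestamp, different durable timestamp" situation the question is about.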

      The documentation says this is for cluster-level consensus on durability. But since the durable timestamps are not the same across shards, what does cluster-level consensus mean here? Is it about checkpoint or rollback safety? And in the high-level design, what would happen if a 2PC transaction committed with its commit timestamp equal to its durable timestamp?

      In addition, I noticed that there can be a gap between the commit timestamp and the durable timestamp in a 2PC transaction. This has already been raised for discussion in https://jira.mongodb.org/browse/WT-8747 and https://jira.mongodb.org/browse/SERVER-63322. Does the `all_durable` timestamp, or MongoDB itself, prevent this anomaly?

      keith.bostic@mongodb.com mentioned another data consistency corner case in SERVER-63322 (quoted below). Is there a concrete example of this corner case?

       

      It's a durability problem we're reasonably confident isn't a real problem in MongoDB server. We'd like to get it out of the way to eliminate another data consistency corner case, but we don't believe there are actual problems in the server.


            Assignee: Will Korteland (will.korteland@mongodb.com)
            Reporter: Ouyang Tsuna (tsunaouyang@gmail.com)