SERVER-41861 (Core Server)

stableTimestamp calculation makes incorrect assumptions about all_committed

    • Fully Compatible
    • ALL
    • v4.2
    • Execution Team 2019-07-15, Storage Engines 2019-07-01, Execution Team 2019-07-29
    • 12

      This explanation is incorrect when prepared transactions are getting committed.

      all_committed is (the timestamp of the earliest uncommitted transaction that has a commit timestamp) - 1. A prepared transaction is not counted in all_committed until commit time, because it has no commit timestamp before then. At commit time, all_committed can briefly jump back to commitTimestamp - 1, in the window between setting the commitTimestamp on the transaction and actually committing it.
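The jump back described above can be reproduced with a small toy model. This is only a sketch: `ToyTimestampTracker`, its methods, and the timestamps are hypothetical names and values chosen for illustration, not the storage engine's API, and the model assumes all_committed is also capped by the newest timestamp already committed.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ToyTimestampTracker:
    """Toy model of all_committed (hypothetical, not the real implementation)."""

    # Newest commit timestamp among transactions that have fully committed.
    last_committed: int = 0
    # Uncommitted transactions that already have a commit timestamp set.
    pending: Dict[str, int] = field(default_factory=dict)

    def set_commit_timestamp(self, txn: str, ts: int) -> None:
        # From this point the transaction is counted in all_committed.
        self.pending[txn] = ts

    def commit(self, txn: str) -> None:
        ts = self.pending.pop(txn)
        self.last_committed = max(self.last_committed, ts)

    def all_committed(self) -> int:
        if not self.pending:
            return self.last_committed
        # (earliest uncommitted commit timestamp) - 1, capped by what has
        # actually committed so far (an assumption of this sketch).
        return min(self.last_committed, min(self.pending.values()) - 1)


def demo() -> Tuple[int, int, int]:
    tracker = ToyTimestampTracker()
    # Two ordinary writes commit at timestamps 10 and 20.
    for name, ts in (("w1", 10), ("w2", 20)):
        tracker.set_commit_timestamp(name, ts)
        tracker.commit(name)
    before = tracker.all_committed()

    # A prepared transaction is invisible to all_committed until it is
    # given a commit timestamp, here 15, right before it commits.
    tracker.set_commit_timestamp("prepared", 15)
    during = tracker.all_committed()

    tracker.commit("prepared")
    after = tracker.all_committed()
    return before, during, after


print(demo())
```

In this model the observed value goes from 20 to 14 and back to 20, which is the backward movement the ticket describes.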

      This invalidates the assumption that the all_committed is always "in the same term" as the commitPoint on a primary.

      This also invalidates any assumptions we've made about the all_committed always moving forward.

      There are 3 options I can think of:

      1. Change the semantic meaning of all_committed to be all_durable and use the durable timestamp rather than the commit timestamp to calculate it. This is in line with the idea of all_committed really being used to determine when oplog holes are open. michael.cahill thinks this isn't too hard and is reasonable if needed, though it does require more thought since it's a significant API change.
      2. Add a mechanism for committing a transaction with a commitTimestamp such that it is never counted in calculating all_committed and use it for any storage-transactions (including prepared mongodb transactions) that timestamp their transactions only right before commit time.
      3. Try to work around the current all_committed behavior in stableTimestamp calculation. This doesn't fix the problem of all_committed moving backwards, if in fact that's a problem in other places where we just haven't seen it.
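Options 1 and 3 can be contrasted in toy form. The sketch below is illustrative only: `all_durable` and `MonotonicBound` are hypothetical helpers, and it assumes that a prepared transaction's durable timestamp is later than its commit timestamp. Whether ignoring backward movement is actually safe for the stableTimestamp calculation is not something this sketch establishes.

```python
from typing import Iterable


def all_durable(last_durable: int, pending_durable: Iterable[int]) -> int:
    """Option 1 in toy form: derive the bound from durable timestamps.

    pending_durable holds the durable timestamps of uncommitted
    transactions (hypothetical input, not a real API).
    """
    pending = list(pending_durable)
    if not pending:
        return last_durable
    return min(last_durable, min(pending) - 1)


class MonotonicBound:
    """Option 3 in toy form: the consumer ignores backward movement."""

    def __init__(self) -> None:
        self._highest = 0

    def observe(self, value: int) -> int:
        # Remember the highest value seen and never report less than it.
        self._highest = max(self._highest, value)
        return self._highest


# A prepared transaction with commit timestamp 15 and an assumed durable
# timestamp of 30, while timestamp 20 has already committed.
by_commit_ts = min(20, 15 - 1)      # bound computed from the commit timestamp
by_durable_ts = all_durable(20, [30])

bound = MonotonicBound()
clamped = [bound.observe(v) for v in (20, by_commit_ts, 20)]
print(by_commit_ts, by_durable_ts, clamped)
```

With the durable timestamp, the bound stays at 20 while the prepared transaction commits. The clamp also reports 20 throughout, but the underlying value still dips, which matches the caveat attached to option 3.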
