  1. Core Server
  2. SERVER-36811

Provide a mechanism for replication to specify the 'maximum_truncation_timestamp' for a given 'stable_timestamp'

    • Fully Compatible
    • Storage NYC 2018-09-10, Storage NYC 2018-09-24

      We must ensure that the ‘prepare’ oplog entries for any transactions that are prepared at the ‘stable timestamp’ (for replication rollback) or at the ‘last stable checkpoint timestamp’ (for startup replication recovery) are not truncated off the oplog. The same holds for any oplog entries for active transactions in the “Transactions larger than 16MB project”. If these entries are truncated, nodes will not be able to re-apply the uncommitted transactions. Because ‘last stable checkpoint timestamp’ <= ‘stable timestamp’, we will focus on the ‘last stable checkpoint timestamp’ case: it implicitly ensures that we have all of the oplog entries we need for the ‘stable timestamp’ or the current point in time.

      To do this, whenever replication tells storage about a new ‘stable timestamp’, it will also provide a ‘maximum_truncation_timestamp’. This ‘maximum_truncation_timestamp’ is the latest timestamp that the storage engine is allowed to truncate off the back of the oplog while its accompanying ‘stable timestamp’ is the current ‘last stable checkpoint timestamp’. Replication will provide the timestamp that was the ‘oldest active transaction timestamp’ at the time of the ‘stable timestamp’. This is not the current ‘oldest active transaction timestamp’, but rather an older value of it.
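
      The pairing described above can be illustrated with a minimal sketch. This is not the server's implementation: the class `StorageEngineSketch` and its methods are hypothetical names, timestamps are plain integers, and the oplog is a sorted list. The ticket does not spell out whether the entry exactly at the bound may be removed, so the sketch assumes the conservative choice and keeps it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StorageEngineSketch:
    """Hypothetical stand-in for the storage side of the interface."""

    oplog: List[int] = field(default_factory=list)  # entry timestamps, ascending
    # Truncation bound recorded for each stable timestamp replication reported.
    bounds: Dict[int, int] = field(default_factory=dict)
    stable_timestamp: Optional[int] = None
    last_stable_checkpoint_timestamp: Optional[int] = None

    def set_stable_timestamp(self, stable_ts: int, maximum_truncation_ts: int) -> None:
        # Replication passes both values together, so the bound stays tied
        # to the stable timestamp it was computed for.
        self.bounds[stable_ts] = maximum_truncation_ts
        self.stable_timestamp = stable_ts

    def take_checkpoint(self) -> None:
        if self.stable_timestamp is None:
            return
        self.last_stable_checkpoint_timestamp = self.stable_timestamp
        # Bounds for stable timestamps older than the checkpoint are obsolete.
        self.bounds = {
            ts: b for ts, b in self.bounds.items() if ts >= self.stable_timestamp
        }

    def truncate_oplog(self) -> List[int]:
        """Remove eligible entries from the old end; return what was removed."""
        if self.last_stable_checkpoint_timestamp is None:
            return []  # no checkpoint yet, so nothing is safe to remove
        # Use the bound that accompanied the checkpointed stable timestamp,
        # not the bound for the newest stable timestamp.
        bound = self.bounds[self.last_stable_checkpoint_timestamp]
        removed = [ts for ts in self.oplog if ts < bound]
        self.oplog = [ts for ts in self.oplog if ts >= bound]
        return removed


engine = StorageEngineSketch(oplog=[1, 2, 3, 4, 5, 6, 7, 8])

# At stable timestamp 5, the oldest active transaction was prepared at 3.
engine.set_stable_timestamp(5, 3)
engine.take_checkpoint()

# Stable advances to 8; the oldest active transaction is now at 7.
engine.set_stable_timestamp(8, 7)

# The checkpoint is still at 5, so the older bound (3) applies.
removed_first = engine.truncate_oplog()   # [1, 2]

# After a checkpoint at 8, the bound that accompanied 8 applies.
engine.take_checkpoint()
removed_second = engine.truncate_oplog()  # [3, 4, 5, 6]
```

      The first truncation shows why the current ‘oldest active transaction timestamp’ is the wrong bound: using 7 while the checkpoint is still at 5 would have removed the prepare entry at 3 that startup recovery needs.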

            daniel.gottlieb@mongodb.com Daniel Gottlieb (Inactive)
            judah.schvimer@mongodb.com Judah Schvimer