Core Server / SERVER-72622

Resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries.

Details

    • Type: Bug
    • Resolution: Won't Do
    • Priority: Major - P3
    • ALL
    • Server Serverless 2023-02-06, Server Serverless 2023-02-20, Server Serverless 2023-03-06, Server Serverless 2023-03-20, Server Serverless 2023-04-03, Server Serverless 2023-04-17, Server Serverless 2023-05-01
    • 173

    Description

      For each oplog batch, the tenant oplog applier first applies the writes and then writes a no-op entry for each donor oplog entry in the batch. These no-ops are written in parallel using the writer pool threads. To calculate the resume point on recipient failover, we traverse backwards through the oplog collection and find the most recent no-op oplog entry from the current migration. Because the no-ops are written in parallel, the most recent no-op can exist while no-ops for earlier donor entries in the same batch were never written, so resuming the tenant oplog applier after a recipient failover can miss writing no-op entries for some donor oplog entries. The consequences would be:
      1) We might miss updating the session entries in the config.transactions table for multi-statement replica set transactions, leading to a duplicate transaction commit.
      2) The oplog chain for retryable writes might be incomplete.
      3) Change streams might miss generating the corresponding change events.
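
      The gap described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the server's implementation: the `NoOp` type, the `find_resume_point` and `missed_no_ops` helpers, the integer timestamps, and the way the batch is split between two writer threads are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NoOp:
    """Hypothetical stand-in for a no-op entry in the recipient oplog."""
    donor_ts: int        # timestamp of the donor oplog entry it mirrors
    migration_id: str    # migration that produced the no-op


def find_resume_point(recipient_oplog: List[NoOp], migration_id: str) -> Optional[int]:
    """Scan the recipient oplog backwards and return the donor timestamp of
    the most recent no-op that belongs to the given migration."""
    for entry in reversed(recipient_oplog):
        if entry.migration_id == migration_id:
            return entry.donor_ts
    return None


def missed_no_ops(batch: List[int], recipient_oplog: List[NoOp], migration_id: str) -> List[int]:
    """Return donor timestamps at or before the resume point that have no
    no-op entry; a resumed applier would never write them."""
    resume_ts = find_resume_point(recipient_oplog, migration_id)
    if resume_ts is None:
        return []
    written = {e.donor_ts for e in recipient_oplog if e.migration_id == migration_id}
    return [ts for ts in sorted(batch) if ts <= resume_ts and ts not in written]


# Assumed scenario: one batch of donor entries 1..6, split between two
# writer threads (1-3 and 4-6). The failover happens after the first thread
# wrote the no-op for 1 and the second thread wrote the no-ops for 4 and 5.
batch = [1, 2, 3, 4, 5, 6]
recipient_oplog = [NoOp(1, "m1"), NoOp(4, "m1"), NoOp(5, "m1")]

print(find_resume_point(recipient_oplog, "m1"))     # 5
print(missed_no_ops(batch, recipient_oplog, "m1"))  # [2, 3]
```

      In this scenario the backward scan picks donor timestamp 5 as the resume point, so the no-ops for entries 2 and 3 are never written after the failover.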

          People

            christopher.caplinger@mongodb.com Christopher Caplinger
            suganthi.mani@mongodb.com Suganthi Mani