RecordIds can be reused (initial sync)


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • ALL
    • Storage Execution 2025-07-21, Storage Execution 2025-08-04, Storage Execution 2025-08-18, Storage Execution 2025-09-01, Storage Execution 2025-09-15, Storage Execution 2025-09-29, Storage Execution 2025-10-13, Storage Execution 2025-10-27

      Initial sync case
      The reuse of recordIds due to restart is problematic even when the writes don't appear in the same batch.
      Let's say we have a primary -> secondary -> initial syncing node chain.
      The primary generates oplog entries:

      [
        ts: 1  -> {op: "i", _id: 1, rid: 1},
        ts: 2  -> {op: "d", _id: 1, rid: 1},
        ... <arbitrary number of oplog entries>
        // recordId reuse due to restart of primary:
        ts: 10 -> {op: "i", _id: 2, rid: 1},
      ]
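One way the reuse at ts: 10 could come about is sketched below. This is a toy model, not the server's code: `ToyRecordStore` and its methods are hypothetical names, and the sketch assumes the next recordId is recomputed on startup from the largest recordId still present in the collection.

```python
class ToyRecordStore:
    """Hypothetical in-memory stand-in for a record store on the primary."""

    def __init__(self, records=None):
        self.records = dict(records or {})  # recordId -> document
        # Assumption: the counter is rebuilt from surviving records on startup,
        # so it has no memory of recordIds that were assigned and then deleted.
        self.next_rid = max(self.records, default=0) + 1

    def insert(self, doc):
        rid = self.next_rid
        self.next_rid += 1
        self.records[rid] = doc
        return rid

    def delete(self, rid):
        del self.records[rid]

    def restart(self):
        # A restart keeps the durable records but drops the in-memory counter.
        return ToyRecordStore(self.records)


store = ToyRecordStore()
first_rid = store.insert({"_id": 1})   # ts: 1
store.delete(first_rid)                # ts: 2
store = store.restart()                # restart of the primary
reused_rid = store.insert({"_id": 2})  # ts: 10
```

Under that assumption both inserts receive recordId 1, which matches the oplog shown above.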
      

      Initial sync starts at ts: 1. However, by the time collection cloning actually starts, the collection contains only the document inserted at ts: 10, i.e. the document {_id: 2} with recordId(1).

      After the collection cloning phase of initial sync has completed, we replay oplog entries. But the oplog entry at ts: 1 also writes to recordId(1), although for a different document! Later, on encountering the delete at ts: 2, we delete the {_id: 1} document with recordId(1). Note that the insert at ts: 1 overwrote the entry for recordId(1) in the "recordId -> document" B-tree without taking any extra steps to remove the index entries of the document that was there before, so the index entries for the document {_id: 2} still exist. We are left with a dangling index entry that points to a non-existent recordId(1), as that record was deleted at ts: 2.

      When we finally get to ts: 10, we try to insert the document {_id: 2} again. Unfortunately, since that key already exists in the _id index, we see a duplicate key error and carry on without doing any more work. In other words, we never insert the document into the collection.
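The sequence described above can be replayed in a small simulation. This is a sketch under stated assumptions, not the actual initial sync code: `ToyCollection`, `apply_oplog`, and `DuplicateKeyError` are hypothetical names, the collection is modeled as two dictionaries (record store and _id index), and duplicate key errors during oplog application are simply swallowed, as the description says.

```python
class DuplicateKeyError(Exception):
    pass


class ToyCollection:
    """Hypothetical model: a record store plus an _id index."""

    def __init__(self):
        self.records = {}   # recordId -> document
        self.id_index = {}  # _id -> recordId

    def insert(self, doc, rid):
        if doc["_id"] in self.id_index:
            raise DuplicateKeyError(doc["_id"])
        # Overwrites whatever lived at rid without cleaning up the
        # previous occupant's index entries (the problem in this ticket).
        self.records[rid] = doc
        self.id_index[doc["_id"]] = rid

    def delete(self, rid):
        doc = self.records.pop(rid, None)
        if doc is not None:
            self.id_index.pop(doc["_id"], None)

    def dangling_index_entries(self):
        # Index entries whose recordId no longer holds a matching document.
        return {
            key: rid
            for key, rid in self.id_index.items()
            if self.records.get(rid, {}).get("_id") != key
        }


def apply_oplog(coll, oplog):
    for entry in oplog:
        if entry["op"] == "i":
            try:
                coll.insert({"_id": entry["_id"]}, entry["rid"])
            except DuplicateKeyError:
                pass  # assumed behavior: ignore and carry on
        elif entry["op"] == "d":
            coll.delete(entry["rid"])


oplog = [
    {"ts": 1, "op": "i", "_id": 1, "rid": 1},
    {"ts": 2, "op": "d", "_id": 1, "rid": 1},
    {"ts": 10, "op": "i", "_id": 2, "rid": 1},
]

coll = ToyCollection()
coll.insert({"_id": 2}, 1)  # collection cloning copies the state as of ts: 10
apply_oplog(coll, oplog)    # oplog replay starts from ts: 1
```

In this model the node ends with an empty record store and a single _id index entry for {_id: 2} that points at the missing recordId(1); `coll.dangling_index_entries()` reports it.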

            Assignee:
            Ernesto Rodriguez Reina
            Reporter:
            Ernesto Rodriguez Reina