FCBIS can lead to incorrect fast count on destination

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Replication
    • ALL
    • v8.0, v7.3, v7.0, v6.0, v5.0
    • See comments for how to repro.
    • 200
    • 3

      File copy based initial sync (FCBIS) copies the sizeStorer.wt file to the destination. When FCBIS opens a backup cursor on the source node, the source flushes the in-memory counters that WiredTigerSizeStorer maintains to disk, so the copied sizeStorer.wt file holds whatever those in-memory values were at that moment.

      However, an in-memory value can be rolled back when the corresponding storage transaction rolls back. If the flush happens before the rollback, a value that includes the uncommitted change has already been persisted to disk and copied over to the recipient.

      As a result, any time a storage transaction rolls back on the source node after the flush, the destination node does not learn of it and is left with an incorrect value in its sizeStorer.wt file.

      From that point onwards, fast count on the destination will be wrong.
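
      The sequence above can be illustrated with a small toy model. This is only a sketch of the ordering problem, not MongoDB or WiredTiger code: ToySizeStorer, ToyNode, and simulate are hypothetical names invented for illustration, and the "file copy" is just a Python object copy.

```python
from dataclasses import dataclass, field


@dataclass
class ToySizeStorer:
    """Hypothetical stand-in for a size storer: in-memory count plus on-disk copy."""
    in_memory: int = 0
    on_disk: int = 0

    def flush(self) -> None:
        # Persist whatever the in-memory counter holds right now,
        # including increments from transactions that have not committed.
        self.on_disk = self.in_memory


@dataclass
class ToyNode:
    """Hypothetical node: committed records plus its size storer."""
    records: list = field(default_factory=list)
    size_storer: ToySizeStorer = field(default_factory=ToySizeStorer)

    def fast_count(self) -> int:
        return self.size_storer.in_memory

    def true_count(self) -> int:
        return len(self.records)


def simulate():
    source = ToyNode()

    # 1. A committed insert: data and counter agree.
    source.records.append("doc1")
    source.size_storer.in_memory += 1

    # 2. An in-flight insert bumps the counter before its transaction resolves.
    source.size_storer.in_memory += 1

    # 3. Opening the backup cursor flushes the counter to disk.
    source.size_storer.flush()

    # 4. The file copy hands the destination the committed data
    #    and the persisted counter.
    persisted = source.size_storer.on_disk
    dest = ToyNode(
        records=list(source.records),
        size_storer=ToySizeStorer(in_memory=persisted, on_disk=persisted),
    )

    # 5. The in-flight transaction rolls back on the source only.
    source.size_storer.in_memory -= 1

    return source, dest


if __name__ == "__main__":
    source, dest = simulate()
    print("source:", source.fast_count(), source.true_count())
    print("dest:  ", dest.fast_count(), dest.true_count())
```

      In this model the source ends up consistent because the rollback corrects its in-memory counter, while the destination keeps the counter value that was flushed before the rollback and never sees the correction.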

      As others have pointed out, FCBIS is in many ways very similar to an unclean shutdown. Given that, we may want to make it clear (via documentation), if we haven't already, that fast count is not valid during FCBIS.

      Note: See my most recent comment

              Assignee:
              [DO NOT USE] Backlog - Replication Team
              Reporter:
              Vishnu Kaushik
              Votes:
              0
              Watchers:
              11

                Created:
                Updated:
                Resolved: