RenameCollectionCoordinator may trigger a CatalogCache tassert when the primary steps down


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 8.2.0-rc0
    • Affects Version/s: 8.2.0-rc0
    • Component/s: None
    • None
    • Catalog and Routing
    • Fully Compatible
    • ALL
    • CAR Team 2025-07-21
    • 200
    • 🟥 DDL

      The recent commit of SERVER-104888 introduced a regression in the logic of RenameCollectionCoordinator.

       
      When featureFlagChangeStreamPreciseShardTargeting is enabled, RenameCollectionCoordinator pins the timestamp value that will be applied to the version of the renamed collection upon commit, but not its new epoch. As a result, the following sequence of events is possible:

      1. The primary shard serves a renameCollection request and successfully commits the new metadata on the config server, but steps down before reaching the end of the coordinator phase
      2. Despite 1), the catalog caches of the config server start consuming the committed version information persisted in config.collections
      3. After stepping up, the new coordinator node re-executes the commit transaction on the config server: through an upsert statement, it sets a new version in config.collections composed of the pinned timestamp value and a newly generated epoch
      4. Once the new committed metadata gets majority written, the catalog cache starts fetching it, hitting this tassert.
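
The sequence above can be replayed with a small toy model. This is only a sketch, not server code: `ToyCatalogCache`, `commit_rename`, `replay` and `InvariantViolation` are hypothetical names, the dictionary stands in for config.collections, and the check inside `refresh` is an assumed simplification of the tassert (two versions sharing a timestamp are expected to share an epoch).

```python
import uuid
from dataclasses import dataclass


class InvariantViolation(Exception):
    """Hypothetical stand-in for the tassert firing."""


@dataclass(frozen=True)
class CollectionVersion:
    epoch: str
    timestamp: int


class ToyCatalogCache:
    """Toy cache that remembers the last version it fetched."""

    def __init__(self):
        self.cached = None

    def refresh(self, fetched):
        # Assumed invariant: same timestamp implies same epoch.
        if (self.cached is not None
                and self.cached.timestamp == fetched.timestamp
                and self.cached.epoch != fetched.epoch):
            raise InvariantViolation("same timestamp, different epoch")
        self.cached = fetched


def commit_rename(config_collections, nss, pinned_timestamp, pinned_epoch=None):
    """Upsert the version; a fresh epoch is generated unless one is pinned."""
    epoch = pinned_epoch if pinned_epoch is not None else uuid.uuid4().hex
    config_collections[nss] = CollectionVersion(epoch, pinned_timestamp)


def replay(pin_epoch):
    """Replay steps 1-4; return True if the cache refreshes without violation."""
    config_collections = {}
    cache = ToyCatalogCache()
    nss = "db.renamed"
    pinned_timestamp = 42
    pinned_epoch = uuid.uuid4().hex if pin_epoch else None

    # Step 1: the commit succeeds, then the primary steps down.
    commit_rename(config_collections, nss, pinned_timestamp, pinned_epoch)
    # Step 2: the cache consumes the committed version.
    cache.refresh(config_collections[nss])
    # Step 3: the new primary re-executes the commit (upsert).
    commit_rename(config_collections, nss, pinned_timestamp, pinned_epoch)
    # Step 4: the cache fetches the rewritten version.
    try:
        cache.refresh(config_collections[nss])
    except InvariantViolation:
        return False
    return True
```

In this model `replay(pin_epoch=False)` trips the check, while `replay(pin_epoch=True)` does not. Pinning the epoch alongside the timestamp is one remedy the description suggests; the actual fix may differ.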

       

      Note: featureFlagChangeStreamPreciseShardTargeting is currently disabled.

            Assignee:
            Paolo Polato
            Reporter:
            Paolo Polato
            Votes:
            0
            Watchers:
            3
