Concurrent createIndex+collMod can leave behind inconsistent collection options

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 6.0.0, 7.0.0, 8.0.0, 8.2.0-rc0, 8.1.0
    • Component/s: None
    • Catalog and Routing
    • 🟥 DDL

      On sharded clusters, the collMod DDL coordinator first sends the collMod participant command to the primary shard, then broadcasts it to all other shards. If the broadcast returns retriable errors, the coordinator retries, and collMod eventually completes successfully (see: SERVER-90117).

      However, if any response is a non-retriable error, the coordinator does not retry; it gives up and fails. Because the collMod has already been executed on the primary shard but not on all other shards, the collection options become inconsistent across shards, as reported by checkMetadataInconsistency.

      One case where this happens is when an index build is in progress on a non-primary shard. That shard returns a BackgroundOperationInProgressForNamespace error, which is not retriable, so collMod fails midway and leaves the collection options inconsistent across shards (see attached repro).
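
      The failure mode described above can be illustrated with a small toy model. This is only a sketch, not the server's coordinator code: the Shard class, run_coll_mod, inconsistent_shards, and the max_retries parameter are hypothetical names invented for the illustration, and HostUnreachable is used merely as a stand-in for "some retriable error". Only the ordering (primary shard first, then broadcast) and the BackgroundOperationInProgressForNamespace error come from the description.

```python
from dataclasses import dataclass, field

# Assumption for this toy model: the set of errors treated as retriable.
RETRIABLE = {"HostUnreachable"}


class ShardError(Exception):
    """Error returned by a (simulated) shard, identified by its code name."""

    def __init__(self, code_name):
        super().__init__(code_name)
        self.code_name = code_name


@dataclass
class Shard:
    """Hypothetical in-memory stand-in for a shard holding collection options."""

    name: str
    options: dict = field(default_factory=dict)
    index_build_in_progress: bool = False
    transient_failures: int = 0  # retriable errors to return before succeeding

    def coll_mod(self, new_options):
        if self.transient_failures > 0:
            self.transient_failures -= 1
            raise ShardError("HostUnreachable")
        if self.index_build_in_progress:
            raise ShardError("BackgroundOperationInProgressForNamespace")
        self.options.update(new_options)


def run_coll_mod(primary, others, new_options, max_retries=5):
    """Apply collMod on the primary shard first, then on every other shard."""
    primary.coll_mod(new_options)
    for shard in others:
        for _ in range(max_retries):
            try:
                shard.coll_mod(new_options)
                break
            except ShardError as err:
                if err.code_name not in RETRIABLE:
                    # Non-retriable: give up. The primary already applied
                    # the change, and nothing rolls it back.
                    raise
        else:
            raise RuntimeError(f"retries exhausted on {shard.name}")


def inconsistent_shards(shards):
    """Names of shards whose options differ from the first shard's options."""
    reference = shards[0].options
    return [s.name for s in shards[1:] if s.options != reference]
```

      In this model, a shard that returns a few retriable errors still converges to the same options as the primary, while a shard with an index build in progress makes run_coll_mod raise after the primary has already been modified, so inconsistent_shards reports it.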

              Assignee:
              Unassigned
              Reporter:
              Joan Bruguera Micó
              Votes:
              0
              Watchers:
              5

                Created:
                Updated: