Core Server / SERVER-91221

Catalog and Routing: Audit feature flag checks for unsafe races with setFCV

    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Catalog and Routing
    • v8.0
    • CAR Team 2024-06-10, CAR Team 2024-06-24, CAR Team 2024-07-08

      Feature flag checks can race with the setFCV command, causing undesirable behavior during an upgrade or downgrade. For each feature flag that has shouldBeFCVGated: true and is enabled in 8.0, please double-check that:

      • Each thread only checks the feature flag once
        • Risk: Otherwise the first feature flag check could return one result while a subsequent check returns a different result. This could cause the feature to partially execute in an unexpected way.
      • If the feature affects what could be written to disk, the feature flag check is done while holding the global lock in a mode that conflicts with S
        • Risk: Otherwise a race is possible in which the feature flag check returns one result and the setFCV operation (either an upgrade or a downgrade) then fully completes before the original operation finishes. The original operation could then execute in an FCV in which it should not be allowed.
      • If the feature affects what could be written to disk, we do not execute feature flag checks on secondaries performing oplog application. The oplog entry itself should tell the secondary how to execute the operation.
        • Risk: If a secondary can make its own decision on what the FCV dictates the feature should be, it can race with setFCV and make a different decision than the primary, causing its data to diverge from the primary’s data.
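
The first two rules above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's code: `ToyFCVState`, `try_set_fcv`, and `write_document` are hypothetical names, and a single mutex stands in for "the global lock in a mode conflicting with S". The operation checks the flag exactly once, while holding the lock, and reuses the saved result for every later decision.

```python
import threading


class ToyFCVState:
    """Toy stand-in for FCV state. Hypothetical; not MongoDB's real API."""

    def __init__(self, version):
        self.version = version
        # One mutex models a global lock mode that conflicts with setFCV's S lock.
        self.lock = threading.Lock()

    def is_enabled(self, min_version):
        # A flag is "enabled" in this model once the FCV reaches min_version.
        return self.version >= min_version

    def try_set_fcv(self, version):
        """Model setFCV: it only proceeds when no writer holds the lock."""
        if not self.lock.acquire(blocking=False):
            return False
        try:
            self.version = version
            return True
        finally:
            self.lock.release()


def write_document(fcv, disk, key, mid_operation=lambda: None):
    """Check the flag once, under the lock, and reuse that one result."""
    with fcv.lock:
        use_new_format = fcv.is_enabled((8, 0))    # the only flag check
        header = "new" if use_new_format else "old"
        mid_operation()                             # a setFCV attempt would land here
        body = "new" if use_new_format else "old"   # reuses the saved decision
        disk[key] = (header, body)
```

In this model a setFCV attempt made in the middle of the write is refused, so the header and body always agree and the write completes entirely in the FCV it observed.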

      The FCV and Feature Flag README details these safety rules as well.
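
The rule about secondaries performing oplog application can be sketched in the same toy style. The entry layout and the `newFormat` field are assumptions made for illustration, not the real oplog format: the primary decides once and records the decision in the entry, and the apply path takes no FCV input at all.

```python
def primary_create(fcv_version, name):
    """Primary checks the (toy) flag once and records the result in the entry."""
    use_new_format = fcv_version >= (8, 0)
    # "newFormat" is a hypothetical field carrying the primary's decision.
    return {"op": "create", "name": name, "newFormat": use_new_format}


def apply_entry(entry, catalog):
    """Apply an oplog entry using only what the entry says, never the local FCV."""
    catalog[entry["name"]] = "new" if entry["newFormat"] else "old"


# Usage: both nodes apply the same entry and end up with identical data,
# regardless of what FCV the secondary sees at apply time.
entry = primary_create((8, 0), "coll")
primary_catalog, secondary_catalog = {}, {}
apply_entry(entry, primary_catalog)
apply_entry(entry, secondary_catalog)
```

Because `apply_entry` has no access to an FCV value, a concurrent setFCV on the secondary cannot change how the entry is applied.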

      Please audit the checks for the following feature flags:
      (DO NOT RE-ORDER THE LIST BELOW, PLEASE!)

      1. featureFlagBanEncryptionOptionsInCollectionCreation
      2. featureFlagDisallowBucketCollectionWithoutTimeseriesOptions
      3. featureFlagDefaultReadMaxTimeMS
      4. featureFlagIndexBuildGracefulErrorHandling
      5. featureFlagAuthoritativeRefineCollectionShardKey
      6. featureFlagClusterFsyncLock
      7. featureFlagAuthoritativeShardCollection
      8. featureFlagOneChunkPerShardEmptyCollectionWithHashedShardKey
      9. featureFlagBalancerSettingsSchema
      10. featureFlagPlacementHistoryPostFCV73
      11. featureFlagShardedAggregationCatalogCacheGossiping
      12. featureFlagConvertToCappedCoordinator
      13. featureFlagTrackUnshardedCollectionsUponMoveCollection
      14. featureFlag80CollectionCreationPath
      15. featureFlagFailOnDirectShardOperations
      16. featureFlagEnforceRoutingByNamespace
      17. featureFlagStopDDLCoordinatorsDuringTopologyChanges

            Assignee:
            yuhong.zhang@mongodb.com Yuhong Zhang
            Reporter:
            samy.lanka@mongodb.com Samyukta Lanka
            Votes:
            0
            Watchers:
            8

              Created:
              Updated:
              Resolved: