Core Server / SERVER-92437

Problems with resharding operation that starts while the FCV is "8.0" or "7.3" and commits while the FCV is "Downgrading to 7.0"

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL
    • Cluster Scalability Priorities

      The setFCV command has a step that aborts in-progress resharding operations on both upgrade and downgrade. However, this step runs after the FCV has already been set to "Downgrading to X.Y" or "Upgrading to X.Y". This means that a resharding operation can start while the FCV is "8.0" or "7.3" and commit while the FCV is "Downgrading to 7.0". FCV "8.0" and "7.3" support gFeatureFlagReshardingImprovements, whereas "Downgrading to 7.0" does not. This can lead to the following sequence:

      1. While the FCV is "8.0" or "7.3", the user runs a reshardCollection command. The recipient shards skip creating the indexes for the collection since the gFeatureFlagReshardingImprovements.isEnabled(serverGlobalParams.featureCompatibility.acquireFCVSnapshot()) check returns true.
      2. The recipient shards finish cloning.
      3. The user runs setFCV to downgrade the FCV to 7.0. The config server sets the FCV on itself and all the shards to "Downgrading to 7.0".
      4. The recipient shards transition from the "cloning" state to the "apply" state instead of the "building-index" state since the gFeatureFlagReshardingImprovements.isEnabled(serverGlobalParams.featureCompatibility.acquireFCVSnapshot()) check now returns false.
      5. The resharding operation commits. The resulting collection has none of the indexes that the original collection had before resharding.
      6. The setFCV command on the config server reaches the step that aborts resharding operations, but by then there are no in-progress resharding operations left to abort.
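
The sequence above can be reproduced in a toy model. The sketch below is not server code: the Recipient class, the flag_enabled helper, and the index names are hypothetical stand-ins, and it assumes that with the flag off the indexes are created up front, as the first step implies. It only shows how reading the FCV at two different points lets an operation take neither index-building path.

```python
from dataclasses import dataclass, field

# Toy model only: these names are illustrative, not MongoDB server code.
# FCVs on which gFeatureFlagReshardingImprovements is treated as enabled.
FLAG_ENABLED_FCVS = {"7.3", "8.0"}


def flag_enabled(fcv: str) -> bool:
    """Stand-in for the feature flag check against the current FCV snapshot."""
    return fcv in FLAG_ENABLED_FCVS


@dataclass
class Recipient:
    source_indexes: list
    built_indexes: list = field(default_factory=list)
    states: list = field(default_factory=list)

    def clone(self, fcv: str) -> None:
        self.states.append("cloning")
        # Steps 1-2: with the flag on, index creation is skipped here and
        # left to the later "building-index" state.
        if not flag_enabled(fcv):
            self.built_indexes = list(self.source_indexes)

    def after_clone(self, fcv: str) -> None:
        # Step 4: the flag is read again, so a changed FCV changes the path.
        if flag_enabled(fcv):
            self.states.append("building-index")
            self.built_indexes = list(self.source_indexes)
        self.states.append("apply")


def run(fcv_at_start: str, fcv_after_clone: str) -> Recipient:
    recipient = Recipient(source_indexes=["a_1", "b_1"])
    recipient.clone(fcv_at_start)
    recipient.after_clone(fcv_after_clone)
    return recipient


stable = run("8.0", "8.0")
raced = run("8.0", "Downgrading to 7.0")
print(stable.states, stable.built_indexes)
print(raced.states, raced.built_indexes)
```

In this model the run that stays on "8.0" passes through "building-index" and ends with both indexes, while the run that sees "Downgrading to 7.0" after cloning goes straight to "apply" with no indexes built.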

      We should look for a more general solution, i.e. not just fix this for gFeatureFlagReshardingImprovements, since we will likely hit this kind of issue again when we introduce other resharding feature flags (e.g. in SPM-3667).
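
One possible shape for such a general solution, offered purely as an illustration and not as the fix chosen for this ticket, is to evaluate every resharding feature flag once when the operation starts and have all later steps consult that pinned decision. ReshardingOptions, start_operation, and next_state_after_cloning are hypothetical names in the same toy model, not server APIs.

```python
from dataclasses import dataclass

# Assumption for this toy model: FCVs on which the flag counts as enabled.
FLAG_ENABLED_FCVS = {"7.3", "8.0"}


@dataclass(frozen=True)
class ReshardingOptions:
    """Hypothetical per-operation record of feature decisions, fixed at start."""
    improvements_enabled: bool


def start_operation(fcv: str) -> ReshardingOptions:
    # Read the FCV exactly once, when the operation begins.
    return ReshardingOptions(improvements_enabled=fcv in FLAG_ENABLED_FCVS)


def next_state_after_cloning(options: ReshardingOptions) -> str:
    # Later steps consult the pinned decision, never the live FCV.
    return "building-index" if options.improvements_enabled else "apply"


options = start_operation("8.0")
# The FCV may now change to "Downgrading to 7.0"; the pinned decision does not.
print(next_state_after_cloning(options))
```

Because the decision travels with the operation, a new flag would only need a new field on the record rather than a new audit of every place the FCV is re-read. Whether a downgraded binary could still complete such an operation is a separate question that this sketch does not address.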

            Assignee: Unassigned
            Reporter: Cheahuychou Mao (cheahuychou.mao@mongodb.com)
            Votes: 0
            Watchers: 8
