SERVER-30459 (Core Server)

shardCollection should fail if running in a mixed-FCV cluster

    • Type: Task
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.5.10
    • Component/s: Sharding
    • Labels: None
    • Sharding 2017-08-21, Sharding 2017-09-11

      A mixed-FCV (featureCompatibilityVersion) cluster can occur if setFCV (the setFeatureCompatibilityVersion command) fails partway.

      setFCV updates the FCVs in this order:
      1) shards' FCVs
      2) config server's FCV

      So, if setFCV for 3.4 -> 3.6 fails partway, some shards may have FCV=3.6 while the config server has FCV=3.4.
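
The ordering above can be illustrated with a small toy model. This is only a sketch: `Node`, `set_fcv`, `fcv_state`, and the `fail_after` knob are hypothetical names invented for illustration, not MongoDB server code or APIs.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a shard or the config server (hypothetical)."""
    name: str
    fcv: str = "3.4"


def set_fcv(shards, config_server, target, fail_after=None):
    """Update FCVs in the order described: shards first, config server last.

    fail_after (hypothetical) simulates a failure once that many shards
    have been updated. Returns True only if the whole upgrade completed.
    """
    for updated, shard in enumerate(shards):
        if fail_after is not None and updated >= fail_after:
            return False  # failed partway; the config server is never reached
        shard.fcv = target
    config_server.fcv = target
    return True


def fcv_state(shards, config_server):
    """Return a name -> FCV mapping for inspection."""
    state = {s.name: s.fcv for s in shards}
    state[config_server.name] = config_server.fcv
    return state


shards = [Node("shard0"), Node("shard1")]
config = Node("config")
completed = set_fcv(shards, config, "3.6", fail_after=1)
print(completed, fcv_state(shards, config))
```

With `fail_after=1`, the model ends with shard0 at 3.6 while shard1 and the config server remain at 3.4, which is the mixed-FCV state this ticket is concerned with.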

      In this case, since the config server is in FCV=3.4, shardCollection will not ask the primary shard for a UUID. However, if the primary shard is in FCV=3.6, the collection will already have a UUID on that shard. If setFCV is later called again to resume the upgrade, the config server will generate a (different) UUID for the collection that was just sharded.
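
The UUID divergence can be sketched with a toy model. All class and method names here (`PrimaryShard`, `ConfigServer`, `resume_set_fcv`, and so on) are hypothetical and only mirror the behavior described in this ticket; they are not the server's implementation.

```python
import uuid


class PrimaryShard:
    """Toy primary shard (hypothetical): assigns UUIDs only in FCV 3.6."""

    def __init__(self, fcv):
        self.fcv = fcv
        self.collection_uuids = {}

    def create_collection(self, ns):
        # Assumption from the ticket: in FCV 3.6 the collection has a UUID.
        self.collection_uuids[ns] = uuid.uuid4() if self.fcv == "3.6" else None


class ConfigServer:
    """Toy config server (hypothetical) holding sharding metadata."""

    def __init__(self, fcv):
        self.fcv = fcv
        self.sharded = {}  # ns -> UUID recorded in the metadata, or None

    def shard_collection(self, ns, primary):
        # In FCV 3.4 the config server does not ask the primary for a UUID.
        if self.fcv == "3.6":
            self.sharded[ns] = primary.collection_uuids[ns]
        else:
            self.sharded[ns] = None

    def resume_set_fcv(self, target):
        # On upgrade, generate a fresh UUID for entries that lack one.
        for ns in list(self.sharded):
            if self.sharded[ns] is None:
                self.sharded[ns] = uuid.uuid4()
        self.fcv = target


primary = PrimaryShard(fcv="3.6")   # already upgraded by the failed setFCV
config = ConfigServer(fcv="3.4")    # never reached by the failed setFCV
primary.create_collection("db.coll")
config.shard_collection("db.coll", primary)
config.resume_set_fcv("3.6")
print(primary.collection_uuids["db.coll"], config.sharded["db.coll"])
```

In this model the two printed UUIDs are generated independently, so the shard and the config server end up disagreeing about the collection's UUID.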

      The config server has no way to know whether the primary shard was upgraded as part of the previous failed setFCV: the setFCV call from the previous attempt may still be in flight, and may race with, say, a currentOp sent by shardCollection to check whether setFCV is currently running on the shard. We should therefore prevent running shardCollection in a mixed-FCV cluster.
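
As a rough illustration of the proposed behavior, a guard could reject the operation whenever the observed FCVs disagree. This sketch assumes a consistent set of FCV observations is already available, which, as noted above, is precisely the hard part; `MixedFCVError`, `require_uniform_fcv`, and `shard_collection` are hypothetical names, not server APIs.

```python
class MixedFCVError(Exception):
    """Raised when the observed nodes do not all report the same FCV."""


def require_uniform_fcv(observed):
    """Return the single FCV shared by all nodes, or raise MixedFCVError.

    `observed` maps a node name to the FCV string it reported.
    """
    if not observed:
        raise ValueError("no FCV observations supplied")
    distinct = set(observed.values())
    if len(distinct) > 1:
        raise MixedFCVError(f"mixed FCVs observed: {sorted(distinct)}")
    return next(iter(distinct))


def shard_collection(ns, observed):
    """Hypothetical entry point: refuse to proceed unless all FCVs agree."""
    fcv = require_uniform_fcv(observed)
    return {"ns": ns, "fcv": fcv}


try:
    shard_collection("db.coll", {"config": "3.4", "shard0": "3.6"})
except MixedFCVError as exc:
    print("rejected:", exc)
```

The guard fails closed: any disagreement among the observed nodes blocks the operation instead of trying to infer which nodes were upgraded.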

      Note: this also needs to prevent a 3.4 mongos from running shardCollection in a mixed-version cluster, perhaps by using an OpObserver to block writes to config.collections?

            Assignee:
            esha.maharishi@mongodb.com Esha Maharishi (Inactive)
            Reporter:
            esha.maharishi@mongodb.com Esha Maharishi (Inactive)
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: