Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-46487

The mongos routing for scatter/gather ops can have unbounded latency

    • Fully Compatible
    • ALL
    • v4.4, v4.2, v4.0, v3.6
    • Sharding 2020-04-20

      Issue Status as of April 2020


      The mongoS routing of scatter/gather operations iterates over the boundaries for the collection’s chunks sorted in ascending order. As the list is iterated, shards that own a chunk for the collection are discovered. If all shards in the cluster that own a chunk for the collection are discovered partway through the iteration, the iteration early-exits and the mongoS targets all of these shards.

      This logic is suboptimal if a large number of the sorted chunks are initially owned by a subset of shards. This can result in a large number of iterations needed to discover all shards. As an example, on a 2-shard cluster with a collection where the first 100k sorted chunks are owned on shard 1, the mongoS iterates over the boundaries for 100k chunks when routing a scatter/gather operation.


      On a sharded collection where a large number of the sorted chunks (e.g. 100k) are initially owned by a subset of shards, a scatter/gather operation can display increased latency (e.g. milliseconds) or increased mongoS CPU utilization.
      This chunk distribution can result from adding a shard to the cluster, as contiguous low chunk ranges get migrated into the new shard.


      Rearrange the chunks in a collection so that the initial chunks in the sorted list discover all the shards. The script below can be used to verify the number of iterations required to route a scatter/gather operation.

      var ns = 'mydb.mycoll'
      var nShards = db.getSiblingDB('config').shards.count();
      var count = 0;
      var shards = [];
      print('Iterating ' + ns + ' chunks sorted in ascending order...')
      var cursor = db.getSiblingDB('config').chunks.aggregate([{$match : {ns: ns}}, {$sort : { min : 1 }}]);
      while (cursor.hasNext() && shards.length != nShards) {
         let nextShard = cursor.next().shard;
         if (!shards.includes(nextShard)) {
            print("  discovered shard " + nextShard + " at chunk iteration " + count)
      print('Done. ' + count + ' chunk iterations to route a scatter/gather operation.')

            gregory.noma@mongodb.com Gregory Noma
            josef.ahmad@mongodb.com Josef Ahmad
            4 Vote for this issue
            29 Start watching this issue