This is useful when sharding huge collections, where the user would want to wait for the collection to reach a steady state before inserting new documents.
Note: for empty collections, the chunks should already be balanced when the shardCollection command returns successfully.
We have had a few users run into problems when querying sharded collections that are being actively balanced.
With hashed shard keys, the "shardCollection" command creates chunks and distributes them across the shards. Normally, the migration of empty chunks takes little time. We have seen cases where the shards are so overloaded that this migration does not complete before the user application starts inserting documents. As a result, the application is actively inserting documents into a collection that is still being migrated. In these cases, we have seen the shards take days or longer to finish balancing.
This situation can lead to strange problems like:
- failures returned to the client when shard metadata is stale
- complete collection balancing never really being achieved
This situation is made worse by the tendency of some applications to keep creating new collections for what is a single logical data set, either under different names or in new databases. We are not entirely sure why users want to partition a single logical data set into many collections (of the same structure), but this behavior is certainly not unusual.
Unfortunately, these users often delay optimizing or upgrading their cluster to reduce the load.
To assist these users, I suggest that we add the following, callable from client applications:
- a method to test whether the balancing of a collection is complete (or as complete as it will get)
- a boolean argument on the "shardCollection" command (e.g. "waitForBalancing") that makes it block until the migration of the empty chunks has completed
With these, the clients can create new collections, wait for the balancing of the empty chunks, then proceed with inserting documents.
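
Until something like this exists server-side, a client could approximate the "create, wait, then insert" flow by polling. The following is only a rough sketch of that idea, not a proposed implementation. The names is_balanced, wait_for_balancing, fetch_counts and max_spread are hypothetical, and "balanced" is assumed here to mean that per-shard chunk counts differ by at most max_spread, which may not match the balancer's actual thresholds. How the per-shard counts are obtained is deliberately left to the caller.

```python
import time
from typing import Callable, Mapping


def is_balanced(chunks_per_shard: Mapping[str, int], max_spread: int = 1) -> bool:
    """Hypothetical check: True when the fullest and emptiest shard differ
    by at most max_spread chunks.

    Assumption: the mapping lists every shard, including shards that
    currently own zero chunks of the collection.
    """
    if not chunks_per_shard:
        return False  # no chunk information yet, so treat as not balanced
    counts = chunks_per_shard.values()
    return max(counts) - min(counts) <= max_spread


def wait_for_balancing(
    fetch_counts: Callable[[], Mapping[str, int]],
    max_spread: int = 1,
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch_counts until the collection looks balanced or the timeout expires.

    fetch_counts is supplied by the caller and returns {shard name: chunk count}.
    Returns True if balance was observed, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_balanced(fetch_counts(), max_spread):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_s)
```

An application would call wait_for_balancing right after "shardCollection" returns and start inserting only once it returns True. A False result corresponds to the "as complete as it will get" case, where the caller has to decide whether to proceed anyway.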