If a collection is being sharded with a hashed shard key, the initially created chunks are balanced across the shards immediately as part of the shardCollection command (independent of the balancer). The intention was to balance only empty chunks, that is, only when the collection does not yet contain any documents. Because of this bug, the initial chunks are moved even if the collection already contains documents.
The immediate balancing of initial chunks has the following undesirable effects:
- It unexpectedly moves potentially large amounts of data, which can have a noticeable impact on the cluster due to increased load on network and disks.
- The initial chunks are moved whether the balancer is enabled or not, and regardless of a defined balancer window.
- The shardCollection command silently waits for the initial chunk moves to finish, and may appear to be stalled during that time.
Forcefully aborting the shardCollection command can leave the metadata of the sharded collection in an undefined state.
Increasing the number of initial chunks (with the numInitialChunks option) causes more chunks to be moved and therefore aggravates the issue.
The fix is to only trigger the initial chunk move logic if the collection is empty.
To work around the issue, shard a collection on a hashed shard key only while it is still empty. If you need to shard a non-empty collection, leave the number of initial chunks at the default, which is 2 for each shard.
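The workaround can be sketched as a small guard for the mongo shell. This is only an illustration under stated assumptions: buildHashedShardCommand is a hypothetical helper, not part of MongoDB, and the database name "mydb", the collection name "events" and the shard key field "_id" are placeholders. The helper refuses to build the command for a non-empty collection and leaves out numInitialChunks so that the server default applies.

```javascript
// Hypothetical helper (not part of MongoDB): builds the shardCollection
// command document for a hashed shard key, but only for an empty collection.
function buildHashedShardCommand(namespace, field, documentCount) {
  if (documentCount > 0) {
    // Sharding now would trigger the immediate initial chunk moves.
    throw new Error(
      namespace + " holds " + documentCount +
      " documents; shard it on a hashed key only while it is empty"
    );
  }
  var key = {};
  key[field] = "hashed";
  // numInitialChunks is deliberately omitted so the default is used.
  return { shardCollection: namespace, key: key };
}

// Possible use in the mongo shell (placeholder names, see above):
//   var n = db.getSiblingDB("mydb").events.count();
//   db.adminCommand(buildHashedShardCommand("mydb.events", "_id", n));
```

The check and the shardCollection command are two separate steps, so a document inserted between them would go unnoticed; the guard is best run while nothing writes to the collection.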
The fix is included in the 2.4.10 production release and the 2.5.5 development release, which evolves into the 2.6.0 production release.