[SERVER-9259] Don't move chunks for non-empty hashed collection during shardCollection cmd Created: 05/Apr/13  Updated: 11/Jul/16  Resolved: 12/Feb/14

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.4.1
Fix Version/s: 2.4.10, 2.5.5

Type: Bug Priority: Trivial - P5
Reporter: Randolph Tan Assignee: Randolph Tan
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Participants:

 Description   
Issue Status as of March 27, 2014

ISSUE SUMMARY
If a collection is being sharded with a hashed shard key, the initially created chunks are balanced across the shards immediately as part of the shardCollection command (independent of the balancer). The intention was to only balance empty chunks, i.e. if the collection did not contain any documents yet. This bug moves the initial chunks even if the collection already contains documents.

USER IMPACT
The immediate balancing of initial chunks has the following undesireable effects:

  • It unexpectedly moves potentially large amounts of data, which can have a noticeable impact on the cluster due to increased load on network and disks.
  • The initial chunks are moved whether the balancer is enabled or not, and regardless of a defined balancer window.
  • The shardCollection command silently waits for the initial chunk moves to finish, and may appear to be stalled during that time.

Forcefully aborting the shardCollection command can leave the meta-data of the sharded collection in an undefined state.

Increasing the number of initial chunks (with the numInitialChunks option) causes more chunks to be moved and therefore aggravates the issue.

SOLUTION
The fix is to only trigger the initial chunk move logic if the collection is empty.

WORKAROUNDS
To work around the issue, only shard collections (on a hashed shard key) if they are empty. If you need to shard a non-empty collection, keep the number of initial chunks to the default, which is 2 for each shard.

PATCHES
The fix is included in the 2.4.10 production release and the 2.5.5 development release, which evolves into the 2.6.0 production release.

Original Description

The comment in ShardCollectionCmd::run() says:

                // only initially move chunks when using a hashed shard key
                if (isHashedShardKey) {
 
                    // Reload the new config info.  If we created more than one initial chunk, then
                    // we need to move them around to balance.
                    ChunkManagerPtr chunkManager = config->getChunkManager( ns , true );
                    ChunkMap chunkMap = chunkManager->getChunkMap();
 

But it's not checking that the collection is empty before moving the chunks directly from within the command. This code block is only meant to be hit with empty collections in which case it is very fast and ensures even chunk distribution.



 Comments   
Comment by Githook User [ 09/Mar/14 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-9259 Don't move chunks for non-empty hashed collection during shardCollection cmd
Branch: v2.4
https://github.com/mongodb/mongo/commit/fcabc21a3e9336869af2598aaa7ad86012d467cb

Comment by Githook User [ 27/Jan/14 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-9259 Don't move chunks for non-empty hashed collection during shardCollection cmd
Branch: master
https://github.com/mongodb/mongo/commit/e9423c914367cf0ae89d27999e824da70864c59c

Generated at Thu Feb 08 03:19:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.