[SERVER-72650] shardCollection with hashed key does not randomly distribute chunks across the cluster Created: 09/Jan/23  Updated: 29/Oct/23  Resolved: 02/Mar/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 6.0.0, 6.2.0-rc5
Fix Version/s: 7.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Paolo Polato Assignee: Jordi Serra Torrens
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done after SERVER-74142 Remove deprecated 'initialSplitPoints... Closed
Problem/Incident
Assigned Teams:
Sharding EMEA
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding EMEA 2023-02-20, Sharding EMEA 2023-03-06
Participants:
Linked BF Score: 5

 Description   

When shardCollection is invoked on an empty collection with a hashed shard key, the DDL operation performs an optimisation: it splits the key space into a number of chunks (set by the numInitialChunks parameter) and distributes them across different shards. In this way, the collection is added to the sharding catalog already balanced.

The distribution is not random. Instead, chunks are assigned to shards following the alphabetical order of shard IDs (except for the primary shard, which is guaranteed to always receive at least one chunk).

Such a strategy may lead to data imbalance at the cluster level when:

  • there is a high number of empty collections being sharded with a hashed key
  • the value of numInitialChunks is lower than the number of shards in the cluster.
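
The imbalance can be illustrated with a small simulation. This is a simplified model, not the server's actual code: the function name assign_sorted, the shard IDs, the round-robin assignment, and the choice to place the primary shard first are all assumptions made for illustration. It shards 100 empty collections with 2 initial chunks each on a 6-shard cluster, rotating the primary shard.

```python
from collections import Counter


def assign_sorted(shard_ids, primary, num_chunks):
    """Hypothetical model of the reported behaviour.

    The primary shard is placed first (so it always gets a chunk), the
    remaining shards follow in alphabetical order, and chunks are handed
    out round-robin over that list.
    """
    order = [primary] + sorted(s for s in shard_ids if s != primary)
    return [order[i % len(order)] for i in range(num_chunks)]


shards = ["shard0", "shard1", "shard2", "shard3", "shard4", "shard5"]
totals_sorted = Counter({s: 0 for s in shards})

# 100 collections, 2 initial chunks each (fewer chunks than shards).
for n in range(100):
    primary = shards[n % len(shards)]
    totals_sorted.update(assign_sorted(shards, primary, 2))

for shard, count in sorted(totals_sorted.items()):
    print(shard, count)
```

Under these assumptions, the alphabetically first shard receives a chunk from every collection, while the shards at the end of the list only receive chunks for the collections they are primary for.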

One way to mitigate this effect is to shuffle the list of shard IDs instead of sorting it alphabetically.
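
A minimal sketch of the shuffling idea follows. It is not the committed fix: the function name assign_shuffled, the shard IDs, the round-robin assignment, and keeping the primary shard first are assumptions made for illustration. The only change from an alphabetical assignment is that the non-primary shards are shuffled rather than sorted.

```python
import random
from collections import Counter


def assign_shuffled(shard_ids, primary, num_chunks, rng):
    """Hypothetical model: primary shard first, other shards in random order."""
    others = [s for s in shard_ids if s != primary]
    rng.shuffle(others)  # in-place shuffle instead of alphabetical sort
    order = [primary] + others
    return [order[i % len(order)] for i in range(num_chunks)]


shards = ["shard0", "shard1", "shard2", "shard3", "shard4", "shard5"]
rng = random.Random(72650)  # fixed seed so the run is repeatable
totals_shuffled = Counter({s: 0 for s in shards})

# 6000 collections, 2 initial chunks each, primary shard rotating.
for n in range(6000):
    primary = shards[n % len(shards)]
    totals_shuffled.update(assign_shuffled(shards, primary, 2, rng))

for shard, count in sorted(totals_shuffled.items()):
    print(shard, count)
```

With shuffling, the second chunk of each collection lands on a uniformly chosen non-primary shard, so the per-shard totals in this model stay close to one another instead of concentrating on the alphabetically first shard.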



 Comments   
Comment by Githook User [ 02/Mar/23 ]

Author: Jordi Serra Torrens (username: jordist, email: jordi.serra-torrens@mongodb.com)

Message: SERVER-72650 Make shardCollection with hashed key randomly distribute chunks across the cluster
Branch: master
https://github.com/mongodb/mongo/commit/37b6c9f3fbeef686273563a30f6a049c253358ea

Generated at Thu Feb 08 06:22:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.