[SERVER-44341] Do not choose only first shard of all shards associated with a zone when pre-splitting during shard collection
Created: 31/Oct/19  Updated: 29/Oct/23  Resolved: 13/Dec/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.2.3, 4.3.3, 4.0.15

Type: Bug
Priority: Major - P3
Reporter: Janna Golden
Assignee: Tommaso Tocci
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.2, v4.0
Sprint: Sharding 2019-12-16
Participants:
Case:

 Description   

Currently, when pre-splitting a collection during shardCollection using existing zones, we choose the first shard associated with the zone to place a chunk on. This causes a problem when many shards are associated with the same zone: every initial chunk for that zone lands on its first shard, so the balancer still has to schedule migrations afterward to even out the zone.
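
To make the imbalance concrete, here is a minimal Python sketch of the first-shard policy described above. It is an illustration only, not the server's C++ code; the zone layout, the shard names, and the `place_first_shard` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical zone layout: zone name -> shards associated with it.
ZONE_TO_SHARDS = {"zoneA": ["shard0", "shard1", "shard2"]}


def place_first_shard(chunk_zones, zone_to_shards):
    """Place every chunk on the first shard of its zone (the reported behavior)."""
    return [zone_to_shards[zone][0] for zone in chunk_zones]


# Six initial chunks that all belong to zoneA.
placement = place_first_shard(["zoneA"] * 6, ZONE_TO_SHARDS)
chunks_per_shard = Counter(placement)
print(chunks_per_shard)  # Counter({'shard0': 6})
```

In this toy layout shard1 and shard2 start empty, which is the situation that leaves work for the balancer.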



 Comments   
Comment by Githook User [ 16/Dec/19 ]

Author: Tommaso Tocci <tommaso.tocci@10gen.com> (toto-dev)

Message: SERVER-44341 Round-robin policy for shardCollection pre-splitting on zones

Bug:
When pre-splitting a collection during shardCollection using existing zones,
we choose the first shard associated with the zone to place a chunk on.
This can cause a problem if there are many shards associated with the same zone:
the balancer will still schedule migrations to balance the zones afterward.

Implemented solution:
For each new chunk, the shard will be selected in round-robin fashion among the ones associated with its zone.
The new `tagToIndx` StringMap is used to keep the incrementing counters
for each zone (tag).
Branch: v4.2
https://github.com/mongodb/mongo/commit/beba7157ee95d7118594fb8985fe965b329983a2
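
The round-robin selection in the commit message can be sketched roughly as follows. This is a Python analogy under stated assumptions, not the committed C++ change: `place_round_robin`, `zone_to_index`, and the zone and shard names are hypothetical, with `zone_to_index` standing in for the per-zone counters that the message attributes to `tagToIndx`.

```python
from collections import defaultdict


def place_round_robin(chunk_zones, zone_to_shards):
    """Assign each chunk to the next shard of its zone, cycling per zone."""
    # One incrementing counter per zone (an analogy for `tagToIndx`).
    zone_to_index = defaultdict(int)
    placement = []
    for zone in chunk_zones:
        shards = zone_to_shards[zone]
        # Wrap around once every shard of the zone has received a chunk.
        placement.append(shards[zone_to_index[zone] % len(shards)])
        zone_to_index[zone] += 1
    return placement


# Hypothetical layout: zoneA spans three shards, zoneB only one.
zone_to_shards = {"zoneA": ["shard0", "shard1", "shard2"], "zoneB": ["shard3"]}
chunk_zones = ["zoneA", "zoneA", "zoneB", "zoneA", "zoneA", "zoneB"]
result = place_round_robin(chunk_zones, zone_to_shards)
print(result)
# ['shard0', 'shard1', 'shard3', 'shard2', 'shard0', 'shard3']
```

Keeping a separate counter per zone means chunks of one zone do not disturb the rotation of another, so each zone's shards fill evenly regardless of the order in which chunks are created.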

Comment by Githook User [ 16/Dec/19 ]

Author: Tommaso Tocci <tommaso.tocci@10gen.com> (toto-dev)

Message: SERVER-44341 Round-robin policy for shardCollection pre-splitting on zones

Branch: v4.0
https://github.com/mongodb/mongo/commit/7168b7100a106bec98d643bef9e15b705c439edb

Comment by Githook User [ 13/Dec/19 ]

Author: Tommaso Tocci <tommaso.tocci@10gen.com> (toto-dev)

Message: SERVER-44341 Round-robin policy for shardCollection pre-splitting on zones

Branch: master
https://github.com/mongodb/mongo/commit/35c4be790c1e03898d9c5443f9a33e36f6f40302

Comment by Tommaso Tocci [ 11/Dec/19 ]

Several policies could be used to choose the shard within a zone for a given chunk during the pre-splitting phase. Here are two proposals:

  • Random: for each new chunk, select a random shard among the ones associated with its zone.
  • Round-robin: for each new chunk, select the shard in round-robin fashion among the ones associated with its zone.

I personally prefer the round-robin policy because it guarantees a more even distribution of chunks among the shards, even when very few shards are associated with a zone, so the balancer will not move any chunks until the first insertions.
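
The difference between the two proposals can be illustrated with a small Python sketch. It assumes a hypothetical zone with two shards and five chunks; the `spread` helper and all names are made up for the example and are not part of the server.

```python
import random
from collections import Counter


def spread(placement, shards):
    """Difference in chunk count between the most and least loaded shard."""
    counts = Counter(placement)
    loads = [counts[s] for s in shards]
    return max(loads) - min(loads)


shards = ["shard0", "shard1"]  # hypothetical zone with very few shards
num_chunks = 5

# Round-robin: cycle through the zone's shards in order.
round_robin = [shards[i % len(shards)] for i in range(num_chunks)]

# Random: pick independently for each chunk (seed fixed only for repeatability).
rng = random.Random(0)
random_pick = [rng.choice(shards) for _ in range(num_chunks)]

print("round-robin spread:", spread(round_robin, shards))  # 1
print("random spread:", spread(random_pick, shards))  # 1, 3, or 5
```

With round-robin the per-shard counts never differ by more than one chunk, whereas an independent random choice can put most or all chunks on the same shard when the zone is small.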

Generated at Thu Feb 08 05:05:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.