Type: Bug
Resolution: Duplicate
Priority: Major - P3
Affects Version/s: 4.4.1
Component/s: Sharding
ALL
(copied to CRM)
We recently (mid-October) migrated our production sharded cluster to 4.4.1 and we are experiencing strange behavior with the balancer.
When we started the new cluster, we calculated the number of initial chunks using the default 64 MB chunk size, and we restored the collection after a pre-split operation.
So we started with 2700 chunks, using a hashed shard key, for a collection of 170 GB and 16M+ documents. The average document size in the collection is 10 KiB and is consistent across the shards.
On our dev/test shard, here is a shard detail:

Shard shard-local-rs05 at XXXX
  data : 28.19GiB docs : 2766359 chunks : 450
  estimated data per chunk : 64.16MiB
  estimated docs per chunk : 6147
The problem is that, in production, the balancer has been running like crazy and we now have almost 60K very small chunks.
  data : 33.16GiB docs : 3324684 chunks : 11908
  estimated data per chunk : 2.85MiB
  estimated docs per chunk : 279

Totals
  data : 170.85GiB docs : 17131823 chunks : 59542
  Shard shard-04 contains 20.04% data, 20.01% docs in cluster, avg obj size on shard : 10KiB
  Shard shard-01 contains 20.25% data, 20.28% docs in cluster, avg obj size on shard : 10KiB
  Shard shard-02 contains 20.51% data, 20.51% docs in cluster, avg obj size on shard : 10KiB
  Shard shard-03 contains 19.77% data, 19.77% docs in cluster, avg obj size on shard : 10KiB
  Shard shard-05 contains 19.41% data, 19.4% docs in cluster, avg obj size on shard : 10KiB
Please note that we did NOT change the chunk size, and that the config database does not include any chunk-size override document.
Can anybody please advise on what we should do and where to investigate?
I'm afraid this could lead to performance issues or, worse, unavailability if we hit hard limits on chunk count or size.
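For anyone reproducing the chunk-size check described above, both the chunk-size setting and the per-shard chunk counts live in the config database and can be inspected from a mongos. A sketch (the namespace "db.coll" is a placeholder for the affected collection; on 4.4 the chunks collection is keyed by ns):

```javascript
// Run from a mongo shell connected to a mongos.
// An empty result here means no custom chunk size is set and the
// default 64 MB is in effect.
db.getSiblingDB("config").settings.find({ _id: "chunksize" })

// Count chunks per shard for the collection to see how far the
// splitter has gone.
db.getSiblingDB("config").chunks.aggregate([
  { $match: { ns: "db.coll" } },
  { $group: { _id: "$shard", chunks: { $sum: 1 } } },
  { $sort: { chunks: -1 } }
])
```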
- duplicates SERVER-55028 Improve the auto-splitter policy (Closed)