Details
Type: Bug
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 4.4.1
Component/s: None
Operating System: ALL
Description
We recently (mid-October) migrated our production sharded cluster to 4.4.1 and we are experiencing strange balancer behavior.
When we started the new cluster, we calculated the number of initial chunks using the default 64MB chunk size, and we restored the collection after a pre-split operation.
So we started with 2700 chunks, on a hashed shard key, for a collection of 170GB and 16M+ documents. The average document size in the collection is 10KiB and is consistent across the shards.
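For reference, a hashed-key pre-split of this size can be requested at shardCollection time via numInitialChunks; a minimal sketch (the namespace and key field below are placeholders, not our actual names):

    sh.enableSharding("mydb")
    // Ask the server to create the chunks up front:
    // 170GB / 64MB default chunk size ≈ 2700 initial chunks.
    sh.shardCollection(
        "mydb.mycoll",              // placeholder namespace
        { _id: "hashed" },          // placeholder hashed shard key
        false,                      // unique: must be false for hashed keys
        { numInitialChunks: 2700 }
    )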
On our dev/test shard, here is a shard detail:
Shard shard-local-rs05 at XXXX
  data : 28.19GiB docs : 2766359 chunks : 450
  estimated data per chunk : 64.16MiB
  estimated docs per chunk : 6147
The problem is that, in production, the balancer has been running like crazy and we now have almost 60K very small chunks:
  data : 33.16GiB docs : 3324684 chunks : 11908
  estimated data per chunk : 2.85MiB
  estimated docs per chunk : 279

Totals
  data : 170.85GiB docs : 17131823 chunks : 59542
  Shard shard-04 contains 20.04% data, 20.01% docs in cluster, avg obj size on shard : 10KiB
  Shard shard-01 contains 20.25% data, 20.28% docs in cluster, avg obj size on shard : 10KiB
  Shard shard-02 contains 20.51% data, 20.51% docs in cluster, avg obj size on shard : 10KiB
  Shard shard-03 contains 19.77% data, 19.77% docs in cluster, avg obj size on shard : 10KiB
  Shard shard-05 contains 19.41% data, 19.4% docs in cluster, avg obj size on shard : 10KiB
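The per-shard chunk counts above can be reproduced straight from the config database; a sketch (the namespace is again a placeholder, and on 4.4 config.chunks is still keyed by the ns field):

    db.getSiblingDB("config").chunks.aggregate([
        { $match: { ns: "mydb.mycoll" } },                 // placeholder namespace
        { $group: { _id: "$shard", chunks: { $sum: 1 } } },
        { $sort: { chunks: -1 } }
    ])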
Please note that we did NOT change the chunk size, and that the config.settings collection does not include any chunksize document.
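For completeness, those settings can be verified like this (a sketch; both read the standard settings documents):

    // An override would appear as { _id: "chunksize", value: <MB> };
    // no document means the 64MB default is in effect.
    db.getSiblingDB("config").settings.find({ _id: "chunksize" })
    // Auto-split state (enabled by default when the document is absent):
    db.getSiblingDB("config").settings.find({ _id: "autosplit" })
    // Balancer state:
    sh.getBalancerState()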
Can anybody please advise on what we should do and where to investigate?
I'm afraid this could lead to performance issues or, worse, unavailability if we hit hard limits on chunk count or size.
Issue Links
duplicates SERVER-55028 Improve the auto-splitter policy (Closed)