-
Type: Task
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: 5.0.21
-
Component/s: None
-
Catalog and Routing
-
Sharding EMEA 2023-10-16
-
3
During recent testing of balancer enhancements, I found that sharding an existing collection with a small chunk size takes significantly longer to complete than sharding an empty collection and then loading the data with the same chunk size.
Test Scenario - MongoDB 5.0
Scenario 1:
- 4TB Collection Size
- 32KB Doc Size
- 1MB Chunk Size
- Shard Existing Unsharded Collection with sh.shardCollection
Scenario 2:
- Set 1MB Chunk Size
- Shard Empty Collection with sh.shardCollection
- Insert Documents
- 32KB Doc Size
- 1MB Chunk Size
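For a sense of scale, here is a back-of-the-envelope sketch of what the Scenario 1 parameters imply. It assumes binary units (1 TB = 2^40 bytes) and that the data divides evenly into full chunks; the function name `estimateChunks` is hypothetical, and the result is an idealized count rather than what the server actually produces.

```javascript
// Rough estimate only. Assumes binary units and perfectly full chunks.
const KB = 2 ** 10;
const MB = 2 ** 20;
const TB = 2 ** 40;

// Hypothetical helper: derive document and chunk counts from the sizes.
function estimateChunks({ collectionBytes, docBytes, chunkBytes }) {
  return {
    docCount: collectionBytes / docBytes,     // documents in the collection
    docsPerChunk: chunkBytes / docBytes,      // documents that fit in one chunk
    chunkCount: collectionBytes / chunkBytes, // chunks if every chunk is full
  };
}

// Scenario 1 parameters from the ticket: 4TB collection, 32KB docs, 1MB chunks.
const estimate = estimateChunks({
  collectionBytes: 4 * TB,
  docBytes: 32 * KB,
  chunkBytes: 1 * MB,
});
console.log(estimate);
// { docCount: 134217728, docsPerChunk: 32, chunkCount: 4194304 }
```

Under these assumptions the collection holds roughly 134 million documents and would map to about 4.2 million chunks of 32 documents each.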
Scenario 1 takes much longer to complete than Scenario 2.
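The two scenarios could be sketched in mongosh roughly as follows. This is a minimal sketch, not the actual test harness: the namespace `test.docs`, the `{ _id: 1 }` shard key, and the helper names are hypothetical, and the ticket does not say how the chunk size was set, so updating `config.settings` is an assumption. The explicit `sh.enableSharding` call is also an assumption about the setup. The functions take `db` and `sh` as parameters so that the mongosh globals can be passed in.

```javascript
// Hypothetical database, collection, and shard key, for illustration only.
const DB_NAME = "test";
const COLL_NAME = "docs";
const NS = DB_NAME + "." + COLL_NAME;
const SHARD_KEY = { _id: 1 };

// Assumption: chunk size is set cluster-wide through config.settings (value in MB).
function setChunkSizeMB(db, sizeMB) {
  return db.getSiblingDB("config").settings.updateOne(
    { _id: "chunksize" },
    { $set: { value: sizeMB } },
    { upsert: true }
  );
}

// Scenario 1: the collection is already populated, then sharded.
function scenario1(db, sh) {
  setChunkSizeMB(db, 1);
  sh.enableSharding(DB_NAME);
  return sh.shardCollection(NS, SHARD_KEY);
}

// Scenario 2: shard the empty collection first, then insert the documents.
function scenario2(db, sh, docs) {
  setChunkSizeMB(db, 1);
  sh.enableSharding(DB_NAME);
  sh.shardCollection(NS, SHARD_KEY);
  return db.getSiblingDB(DB_NAME).getCollection(COLL_NAME).insertMany(docs);
}
```

In a mongosh session connected to a mongos, these would be called as `scenario1(db, sh)` or `scenario2(db, sh, docs)`, where `docs` is an array of roughly 32KB documents supplied by the caller.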