[SERVER-16969] First split in the chunk has a higher threshold than the rest Created: 21/Jan/15  Updated: 06/Dec/22  Resolved: 01/Apr/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.8.0-rc5
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Randolph Tan Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-16715 Distribution of data with hashed shar... Closed
Assigned Teams:
Sharding
Operating System: ALL
Participants:

 Description   

If the collection's current total dataSize on a shard is smaller than the given maxChunkSize, splitVector returns early without producing any split points:

https://github.com/mongodb/mongo/blob/r2.8.0-rc5/src/mongo/s/d_split.cpp#L357

Whereas, once the collection holds a sufficient number of bytes, it can return a split point as long as the average size of the chunk range exceeds maxChunkSize / 2.



 Comments   
Comment by Randolph Tan [ 07/Mar/19 ]

Not a huge impact, which is why I made it minor. The user will just observe a higher threshold before we decide to split the first few chunks of a shard. This probably affects hashed shard keys more, since the chunks are initially scattered across all shards while the collection is still empty.

Comment by Kaloian Manassiev [ 07/Mar/19 ]

renctan, do you remember what the customer-visible side effect of this is? I don't quite get it from the description.

Comment by Randolph Tan [ 21/Jan/15 ]

It has been like this since 1.8...

Comment by Andy Schwerin [ 21/Jan/15 ]

Also in prior releases, or new in the most recent dev cycle?

Generated at Thu Feb 08 03:42:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.