[DOCS-7145] Clarify the manual on sharding existing collection size limit. Created: 12/Feb/16 Updated: 30/Oct/23 Resolved: 09/Jun/16 |
|
| Status: | Closed |
| Project: | Documentation |
| Component/s: | manual |
| Affects Version/s: | None |
| Fix Version/s: | Server_Docs_20231030 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Wan Bachtiar | Assignee: | Ravind Kumar (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
*edit* added note: the 8,192 split point limit does not apply to the initial sharding of a collection (noted here so that the ticket description does not create the mistaken impression that it does). It would be good to revisit and clarify the manual at: https://docs.mongodb.org/manual/reference/limits/#Sharding-Existing-Collection-Data-Size We should better explain how the two sizes (256GB and 400GB) were calculated or estimated. It would also be great to revisit the table:
NOTE: For 3.4, this needs to be revisited, as there are limits for empty collections with hashed shard keys. |
| Comments |
| Comment by Randolph Tan [ 14/Jun/16 ] |
|
ravind.kumar, the numInitialChunks hard limit applies only to v3.4. Everything else should be the same for older versions of mongo. |
| Comment by Ravind Kumar (Inactive) [ 13/Jun/16 ] |
|
renctan, is there anything here that would not apply to 3.0, or possibly 2.6? This might be worth backporting. |
| Comment by Githook User [ 13/Jun/16 ] |
|
Author: ravind (rkumar-mongo) <ravind.kumar@10gen.com> Message: Signed-off-by: kay <kay.kim@10gen.com> |
| Comment by Ravind Kumar (Inactive) [ 09/Jun/16 ] |
| Comment by Randolph Tan [ 06/Jun/16 ] |
In the first comment, I believe Asya was referring to sharding a collection with existing data. In this scenario, mongos will create new chunks for the collection (as if splitting min->max to several chunks). The second one refers to the user calling the split command explicitly. Note that there is a special case where the 8192 limit applies and this was demonstrated in
If my understanding of the formula is correct, both "size" terms refer to the BSON object size. If that's the case, I also don't follow how this formula came about.
Note: The 1000000 limit was added together with the 8192*nShards limit. In other words, this check did not exist in v3.2. |
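The two limits mentioned in this note can be pictured as a single validation step. The following is a minimal sketch, not the server's actual code: the function name and constants are hypothetical, and it assumes both limits are inclusive upper bounds on numInitialChunks (3.4+ only, per the note above).

```python
# Hypothetical constants mirroring the two limits named in the comment.
PER_SHARD_CHUNK_LIMIT = 8192      # the "8192*nShards" limit
TOTAL_CHUNK_LIMIT = 1_000_000     # the "1000000" limit


def num_initial_chunks_allowed(num_initial_chunks: int, n_shards: int) -> bool:
    """Return True if the requested chunk count passes both limits.

    Assumption: a value equal to a limit is accepted; only values
    strictly above it are rejected.
    """
    within_per_shard = num_initial_chunks <= PER_SHARD_CHUNK_LIMIT * n_shards
    within_total = num_initial_chunks <= TOTAL_CHUNK_LIMIT
    return within_per_shard and within_total
```

Under these assumptions, a request can satisfy the per-shard limit on a large cluster and still be rejected by the total limit.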
| Comment by Ravind Kumar (Inactive) [ 02/Jun/16 ] |
|
wan.bachtiar, my understanding is that the first calculation is using the maximum BSON document size of 16MB. While this is good for estimating maximum collection size based on shard key size and chunk size, I imagine most customers do not approach that limit very often. So the second formula would just be the average document size of the target collection instead of the maximum BSON document size. For example, a 64 bit shard key with a 64 MB chunk size would allow for up to 8TB of size, requiring 16 shards to support every split point (3.4+). But if the customer has an average document size of only 4MB, the number of split points would be 4x lower, as would be the number of shards. I wouldn't want a customer to view the formula / table and end up with a much larger number of shards than they need for their collection. Maybe I'm over-estimating the issue here. |
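The arithmetic behind this example can be sketched as follows. This is a rough illustration, assuming the estimate is maxSplits = 16 MB / average shard key size and maxCollectionSize = maxSplits * (chunkSize / 2), which is how I understand the manual's limits page to describe it. It also assumes the example refers to a 64-byte average shard key, since that is the value for which this formula yields 8 TB. The function names are hypothetical.

```python
# Maximum BSON document size (16 MB), in bytes.
MAX_BSON_BYTES = 16 * 1024 * 1024


def max_split_points(avg_shard_key_bytes: int) -> int:
    """Estimated number of split points for a given average shard key size."""
    return MAX_BSON_BYTES // avg_shard_key_bytes


def max_collection_size_mb(avg_shard_key_bytes: int, chunk_size_mb: int = 64) -> int:
    """Estimated largest collection (in MB) that can be sharded initially.

    Assumes each split point accounts for half a chunk of data.
    """
    return max_split_points(avg_shard_key_bytes) * chunk_size_mb // 2


def mb_to_tb(size_mb: int) -> float:
    """Convert megabytes to terabytes (binary units)."""
    return size_mb / (1024 * 1024)
```

With these assumptions, `max_split_points(64)` gives 262,144 split points and `mb_to_tb(max_collection_size_mb(64, 64))` gives 8 TB, while a 512-byte average shard key gives 1 TB.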
| Comment by Ravind Kumar (Inactive) [ 25/May/16 ] |
|
I've updated the code review based on some of the discussions here. Please review when you get a chance. wan.bachtiar, I folded in the number of shards as a measure of the minimum number needed to support a given number of split points. |
| Comment by Asya Kamsky [ 31/Mar/16 ] |
All of this discussion is ONLY applicable to enabling sharding on an existing non-sharded collection. None of the discussion applies to already sharded collections. |
| Comment by Asya Kamsky [ 22/Mar/16 ] |
|
please note that this ticket and Anyway, 8192 is a non-issue for initial sharding of collection. It's only a limit for manual running of split command. |