[DOCS-983] Warning required in Pre-Splitting Section - Existing Data Created: 15/Jan/13 Updated: 30/Oct/23 Resolved: 08/Feb/13 |
|
| Status: | Closed |
| Project: | Documentation |
| Component/s: | manual |
| Affects Version/s: | None |
| Fix Version/s: | Server_Docs_20231030 |
| Type: | Improvement | Priority: | Critical - P2 |
| Reporter: | Adam Comerford | Assignee: | Bob Grabar |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
A user wanted to pre-split a collection and did not realize that the collection had to be empty to do so. The term "pre-split" somewhat implies this, but an explicit warning would make the requirement clearer. |
| Comments |
| Comment by auto [ 08/Feb/13 ] |
|
Author: Sam Kleinman <samk@10gen.com> (2013-02-08T22:45:26Z) Message: |
| Comment by auto [ 08/Feb/13 ] |
|
Author: Sam Kleinman <samk@10gen.com> (2013-02-08T22:14:28Z) Message: merge: |
| Comment by auto [ 08/Feb/13 ] |
|
Author: Bob Grabar <bob.grabar@10gen.com> (2013-02-08T16:32:19Z) Message: |
| Comment by auto [ 08/Feb/13 ] |
|
Author: Bob Grabar <bob.grabar@10gen.com> (2013-02-01T21:45:03Z) Message: |
| Comment by Adam Comerford [ 15/Jan/13 ] |
|
Aye, looks good. As long as we remind them that they may already have chunks before they run the split commands, I think that's sufficient warning. Adam |
| Comment by Scott Hernandez (Inactive) [ 15/Jan/13 ] |
|
So it sounds like you just want a warning that the balancer creates chunks when sharding a collection? That seems like a reasonable reminder, along with a link back to the docs for understanding how things work. The intro of that section directly mentions this, by the way. I'm not suggesting that people never misunderstand or make mistakes after reading this section. How about this? Warning – Please be careful when splitting chunks. As you may have read<link-here>, when the collection was sharded, chunks were automatically created to spread the collection evenly if there was existing data. Doing additional splits ("pre-splitting" is a reserved term used when the collection is void of data) requires consideration of the resulting chunk sizes (by document count and data size). You do not want to do splits that make some chunks much larger than others; this will lead to balancing (based on the count of chunks, not their size), which may cause extreme load and data-distribution problems. |
| Comment by Adam Comerford [ 15/Jan/13 ] |
|
Well, I disagree, completely - I think it is easy to make the mistake when following this section and the warning is valid. I also have evidence that it happens, can you point to some showing that it does not? If you do not know that enabling sharding on a collection creates chunks immediately then it's not obvious that you cannot just "create" your own chunks. I realize that if you think about it and what having existing data means then it doesn't make sense, but do you really expect users to read the documentation serially to gain the appropriate understanding? |
| Comment by Scott Hernandez (Inactive) [ 15/Jan/13 ] |
|
I don't see how this warning helps; it is confusing at best. Adding docs that are more confusing than helpful should be avoided. What is the problem you want to address? I'd guess the general misunderstanding is the relationship between chunks, documents per chunk (the size of specific chunks), and how balancing works (based on chunk count, not size or document count). Does that sound right? If so, let's address that, clean up the general docs, and call out anything related to common misunderstandings surrounding it, such as doing splits that cause an imbalance in the average number of docs or the size of each chunk. This comes up not just with splits but also, for example, when deleting docs unevenly across chunks or shards. |
| Comment by Sam Kleinman (Inactive) [ 15/Jan/13 ] |
|
Willing to make changes as requested, but unclear on the correct/best solution. |
| Comment by Adam Comerford [ 15/Jan/13 ] |
|
My counter to that would be: "pre" and its implications didn't work as a sufficient warning (I basically had this conversation already with the user). I'd rather play it safe and avoid another case like the one linked. |
| Comment by Scott Hernandez (Inactive) [ 15/Jan/13 ] |
|
This warning makes little sense. While the "pre" of pre-split refers to doing splits before there is data in the collection, there is nothing wrong with splitting chunks after there is data. The general warning is that splits should not be done to produce a system where the number of chunks is evenly distributed but the data/docs are not. If there is no data, then any number of splits and chunks can be created and balanced by chunk count (which the balancer does for you). But if you have existing data, then you need to be careful not to do splits that produce chunks of different sizes by storage or document count. |
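The point about chunk count versus data size can be illustrated with a small toy model. This is only a sketch, not MongoDB code: the function names, the shard names, and the round-robin placement are all assumptions made for illustration, and the real balancer is considerably more involved. It only shows that an even number of chunks per shard says nothing about how evenly the documents are spread.

```python
from collections import defaultdict


def balance_by_chunk_count(chunk_doc_counts, shard_names):
    """Toy placement (hypothetical): deal chunks out round-robin so every
    shard ends up with the same number of chunks, ignoring chunk size."""
    shards = defaultdict(list)
    for index, docs_in_chunk in enumerate(chunk_doc_counts):
        shard = shard_names[index % len(shard_names)]
        shards[shard].append(docs_in_chunk)
    return dict(shards)


def summarize(assignment):
    """Report chunk count and total document count per shard."""
    return {
        shard: {"chunks": len(chunks), "docs": sum(chunks)}
        for shard, chunks in assignment.items()
    }


shard_names = ["shardA", "shardB"]  # hypothetical shard names

# Splits that divide 1000 existing documents into equal-sized chunks.
even = summarize(balance_by_chunk_count([250, 250, 250, 250], shard_names))

# Careless splits on the same data: one huge chunk and three tiny ones.
uneven = summarize(balance_by_chunk_count([970, 10, 10, 10], shard_names))

print(even)    # both shards: 2 chunks, 500 docs
print(uneven)  # both shards: 2 chunks, but 980 docs vs. 20 docs
```

In both cases each shard holds two chunks, so a count-based balancer would consider the cluster balanced, yet in the second case almost all of the documents sit on one shard.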
| Comment by auto [ 15/Jan/13 ] |
|
Author: Sam Kleinman <samk@10gen.com> (2013-01-15T15:04:45Z) Message: |
| Comment by auto [ 15/Jan/13 ] |
|
Author: Sam Kleinman <samk@10gen.com> (2013-01-15T15:00:36Z) Message: merge: |
| Comment by auto [ 15/Jan/13 ] |
|
Author: Adam C <adam@comerford.cc> (2013-01-15T13:33:46Z) Message: Update source/administration/sharding.txt Per |
| Comment by Adam Comerford [ 15/Jan/13 ] |
|
Pull request is in with proposed wording: |