[SERVER-22571] splitVector should return a cursor so that it can return more than 16MB worth of split points

Created: 11/Feb/16  Updated: 06/Dec/22  Resolved: 15/Nov/21

| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.0.9 |
| Fix Version/s: | None |
| Type: | Improvement |
| Priority: | Major - P3 |
| Reporter: | Dmitry Ryabtsev |
| Assignee: | [DO NOT USE] Backlog - Sharding NYC |
| Resolution: | Won't Do |
| Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Sharding NYC |
| Sprint: | Sharding 17 (07/15/16), Sharding 18 (08/05/16) |
| Participants: | |
| Case: | (copied to CRM) |
| Description |

Currently the 16MB response size limit prevents sharding an existing large collection if doing so would generate too many initial split points. Changing splitVector to return a cursor should allow us to shard arbitrarily large existing collections without needing to increase the chunk size.

Original Summary: splitVector fails if the document containing the split points exceeds the 16MB limit

Original Description: Generally this is the case when the initial split happens for a large collection and/or the chunk size is configured to be small. When it happens during autosplit, the symptom is that the split fails because the single reply document holding all the split points exceeds the 16MB BSON limit.

It can be reproduced manually, as in the sketch below.
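A minimal repro sketch; the original snippet is not preserved here, so the collection name, key shape, and sizes are illustrative assumptions, and the exact numbers needed depend on how wide the key values are:

```javascript
// Build a collection whose indexed key values are wide (~1KB each,
// near the index key size limit), so that each split point returned
// by splitVector is itself large.
var big = db.getSiblingDB("test").big;
var pad = new Array(1025).join("x"); // ~1KB of padding per key value
for (var i = 0; i < 1000000; i++) {  // ~1GB of key data in total
    big.insert({ key: i + "-" + pad });
}
big.createIndex({ key: 1 });

// Ask for split points with a small target chunk size. Roughly 16k
// split points of ~1KB each then have to fit in one reply document,
// which exceeds the 16MB BSON limit, and the command fails.
db.getSiblingDB("test").runCommand({
    splitVector: "test.big",
    keyPattern: { key: 1 },
    maxChunkSizeBytes: 64 * 1024 // ~16k chunks over ~1GB of keys
});
```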
There are two problems with the existing behaviour: (1) the failure is not obvious from the error that is returned, and (2) the command fails outright rather than returning as many split points as it can. If we were only to address problem (1), we could make the error message more apparent. For problem (2), we could make splitVector truncate the list of split points so that it fits within the 16MB limit.
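For comparison, a cursor-returning splitVector would presumably follow the standard command-cursor reply shape that find and aggregate already use. A hypothetical sketch of such a reply (never implemented, since the ticket was closed as Won't Do; all values illustrative):

```javascript
// Hypothetical reply shape for a cursor-returning splitVector,
// modeled on the { cursor: { firstBatch, id, ns } } convention of
// find and aggregate. Later batches would be fetched with getMore,
// so the full set of split points never has to fit in one 16MB
// document.
{
    ok: 1,
    cursor: {
        ns: "test.big",
        id: NumberLong("1234567890"), // nonzero: more batches remain
        firstBatch: [
            { key: "0-xxxx..." },
            { key: "107-xxxx..." }
            // ...further split points arrive via getMore
        ]
    }
}
```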
| Comments |
| Comment by Kaloian Manassiev [ 15/Nov/21 ] |
This ticket has been open for more than 5 years and, due to competing priorities, we have not been able to schedule it. While it may be a good idea for some narrow use cases, given the amount of work required we will not fix it in the foreseeable future. The way splitVector works now is sufficient for our internal use cases.
| Comment by Andrew Ryder (Inactive) [ 08/Jun/16 ] |
schwerin, that makes sense. Leveraging $sample for other tasks is great. In the case of splitVector (and probably the hadoop-connector too), the sample size just needs to be somewhat larger than the expected output so that the distribution comes out fairly flat.
| Comment by Andy Schwerin [ 08/Jun/16 ] |
To answer your question, andrew.ryder, I had thought to have current users of splitVector use a $sample aggregation instead, but oversample as you suggest in your second concept. However, the work of choosing which of the returned samples to use as split points would be the caller's responsibility, or the job of a later stage in the aggregation pipeline containing the $sample. The goal would be to remove sharding's use of the splitVector command. This approach has the advantage of reducing the amount of reading relative to splitVector, as you mentioned. The impediment is that splitVector also performs chunk size and document count estimation today, and finding a substitute for that functionality would be difficult in the short term. That might argue for keeping splitVector around a little longer.
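A sketch of the oversample-then-select idea under discussion (the collection, key field, chunk count, and 10x oversampling factor are all illustrative assumptions, not anything this ticket settled on):

```javascript
// Oversample with $sample, then have the caller keep every k-th
// sampled key as a split point; with enough oversampling, the chosen
// boundaries approximate an even partition of the data.
var numChunks = 100;   // desired number of chunks (illustrative)
var oversample = 10;   // sample 10x more keys than split points
var sampled = db.getSiblingDB("test").big.aggregate([
    { $sample: { size: numChunks * oversample } },
    { $project: { _id: 0, key: 1 } },
    { $sort: { key: 1 } }
]).toArray();

// Caller-side selection, as described above: every oversample-th
// sampled key becomes one of the numChunks - 1 split points.
var splitPoints = [];
for (var i = oversample; i < sampled.length; i += oversample) {
    splitPoints.push(sampled[i]);
}
```

The trade-off is that $sample gives probabilistic rather than guaranteed-even boundaries, which is exactly the concern Dmitry raises below.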
| Comment by Max Hirschhorn [ 07/Jun/16 ] |
dmitry.ryabtsev, andrew.ryder, it is my understanding from pasette and spencer that the usage of the "splitVector" command in both the Spark and Hadoop connectors is undesirable; the "splitVector" command is intended to be an internal command for supporting sharded clusters.

Regarding the Spark and Hadoop connectors: the documentation page doesn't list any assumptions about the behavior of the "splitVector" command - it doesn't even really explain to the user what the command is going to be used for. I would prefer any discussion about the requirements of the Spark and Hadoop connectors to happen in SERVER-24274, because a separate ticket has already been filed to try and address the permissions issue with the "splitVector" command approach.

Regarding an earlier comment in this ticket: I think it might be worth clarifying what exactly you want an even distribution of.

I might be missing some details about how chunk ranges and balancing work, but my impression is that the system would be self-correcting. For example, even if the initial assignment of chunks to shards were slightly uneven due to some bias in the storage engine's random cursor, the system would eventually move chunks from shards with more and/or larger chunks to shards with fewer and/or smaller chunks.
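As an aside, one way to observe that self-correction is to count chunks per shard from the config database and watch the balancer even out an initially skewed assignment (namespace illustrative; in server versions of this era config.chunks is keyed by ns):

```javascript
// Count chunks per shard for one sharded collection; rerunning this
// over time shows the balancer converging toward an even spread.
db.getSiblingDB("config").chunks.aggregate([
    { $match: { ns: "test.big" } },
    { $group: { _id: "$shard", chunkCount: { $sum: 1 } } },
    { $sort: { chunkCount: -1 } }
]);
```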
| Comment by Andrew Ryder (Inactive) [ 07/Jun/16 ] |
schwerin, there was some debate/confusion here over how you propose to utilize $sample. Do you mean: (1) use the $sample output directly as the split points, or (2) oversample and then derive the split points from the larger sample? I think #1 would suck.
| Comment by Andrew Ryder (Inactive) [ 07/Jun/16 ] |
schwerin, I agree with Dmitry. Also note that our own hadoop-connector expects splitVector to provide a flat distribution in order to parcel up work evenly: https://docs.mongodb.com/ecosystem/tutorial/getting-started-with-hadoop/#mongodb
| Comment by Dmitry Ryabtsev [ 07/Jun/16 ] |
Hi schwerin, I don't think using $sample is a good idea, as it won't guarantee an even distribution of split points across the data range.
| Comment by Andy Schwerin [ 06/Jun/16 ] |
An alternative might be for us to start using a $sample aggregation to establish the split points. That would get us a cursor for free, and it might have better performance than splitVector, since it wouldn't need to do a whole index scan. dmitry.ryabtsev, do you have other uses of splitVector that would preclude this approach?