[SERVER-14228] Setting batchSize and sort on a cursor in a sharded collection causes fewer than all documents to be returned

| Created: | 11/Jun/14 | Updated: | 10/Dec/14 | Resolved: | 12/Jun/14 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 2.4.8, 2.4.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Jason Liszka | Assignee: | David Storch |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Operating System: | ALL |
| Steps To Reproduce: | This reproduces on a single instance: |
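A minimal sketch of the single-instance pattern, assuming (per the comments below) a batchSize combined with a sort on an unindexed field x; the collection name, document count, and batch size are illustrative rather than taken from the original repro script:

```javascript
// Populate a collection with more documents than one batch can hold.
db = db.getSiblingDB("test");
db.repro.drop();
for (var i = 0; i < 10000; i++) {
    db.repro.insert({x: i});
}

// Full count of matching documents.
print(db.repro.find().count());  // 10000

// On 2.4.x, combining a batchSize with a sort on the unindexed field "x"
// returns only a single batch instead of all documents.
print(db.repro.find().sort({x: 1}).batchSize(101).itcount());
```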
| Participants: |
| Description |
Here is the correct number of documents:
If you specify a batchSize and a sort, you get fewer than all the documents:
The number of documents returned appears to be K + batchSize, for some K:
K (3404 in this example) happens to match the number of documents matching the query on one shard (ads1):
This relationship between K and the number of records on ads1 is borne out on other queries:
If the batch size is not specified or the cursor is not sorted, the problem goes away:
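A shell sketch of the comparison being described, assuming the query filters on the indexed field fs and sorts on _id (as in the explain() discussion in the comments); the collection name, filter value, and batch size are illustrative:

```javascript
// Illustrative query shape: filter on "fs", sort on "_id" (both indexed).
var query = {fs: "some-value"};

// Correct total: count() is not affected by batchSize or sort.
print(db.ads.find(query).count());

// With both a batchSize and a sort, 2.4.x returns fewer documents:
// approximately (matching documents on one shard) + (one batch).
print(db.ads.find(query).sort({_id: 1}).batchSize(200).itcount());

// Dropping either the batchSize or the sort returns the full result set.
print(db.ads.find(query).sort({_id: 1}).itcount());
print(db.ads.find(query).batchSize(200).itcount());
```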
| Comments |
| Comment by David Storch [ 12/Jun/14 ] |
jliszka, glad I could help!
| Comment by Jason Liszka [ 12/Jun/14 ] |
Hinting {_id: 1} fixed the problem. I don't think we need a backport to 2.4.x. We can work around it by not setting batchSize on the cursor.
Thanks!
| Comment by David Storch [ 12/Jun/14 ] |
Hi jliszka, Thanks for your prompt response, and for providing the explain output. From the output you provided, I can see that the ads0 shard is using the {fs: 1, nt: 1} index, whereas ads1 and ads2 are using the _id index. Due to the bug fixed in
Yes, the root cause is the same. Glad to hear that you have verified the fix in 2.6.0.
If possible, hinting {_id: 1} could be a workaround. This will force the shards to obtain the sort via an index scan, and prevent missing batches due to erroneous topK sorts on any of the shards. Note that this will cause the entire _id index to be scanned on each shard, so this workaround should only be used if the performance impact of a full index scan is acceptable.
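A sketch of the hint workaround described above, using an illustrative collection name and filter value (neither is given in the original thread):

```javascript
// Forcing the _id index means each shard obtains the sort order from an index
// scan, so no in-memory (topK) sort is performed and no results are dropped.
// As noted above, the entire _id index is scanned on each shard.
db.ads.find({fs: "some-value"})
      .sort({_id: 1})
      .hint({_id: 1})
      .batchSize(200)
      .itcount();
```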
The fix in 2.6 was part of a much larger rewrite of the query engine. Therefore, it cannot be trivially backported to 2.4. A fix in 2.4 would require an entirely new patch (and may be difficult to engineer due to some larger abstraction and design issues that were resolved in 2.6). Is a fix in 2.4 critical for your application? Best,
| Comment by Jason Liszka [ 12/Jun/14 ] |
Hi Dave, Thanks for the explanation. I believe the behavior described in https://jira.mongodb.org/browse/SERVER-12438 differs from what we're seeing in a couple of ways:

1. The query is using an index (both fs and _id are indexed in the example I provided, see explain() output below), and

However, it does appear to be fixed in 2.6.0. Do you think the root cause is the same? What do you suggest as a workaround? Is there any chance of a backport of the fix to 2.4.x?

Here's the explain() output. Worth noting that we get all the results from the ads1 shard (3434 of them at this point in time), plus one batch.
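A sketch of how this explain output can be gathered in the 2.4 shell (collection name and filter are illustrative, not from the report); through mongos the output typically includes a per-shard breakdown, so the index chosen by each shard is visible:

```javascript
// Explain the same query that loses documents; the per-shard section of the
// output shows which index each shard selected (e.g. "BtreeCursor _id_"
// versus a cursor over the {fs: 1, nt: 1} index). Names here are illustrative.
var explainOutput = db.ads.find({fs: "some-value"})
                          .sort({_id: 1})
                          .batchSize(200)
                          .explain();
printjson(explainOutput);
```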
| Comment by David Storch [ 12/Jun/14 ] |
Hi jliszka, Thanks for reporting the issue. This appears to be a known bug in 2.4.x versions which has been fixed in 2.6.0 under SERVER-12438.

The MongoDB wire protocol specifies that batchSize and limit values set on the client side are both passed to the server in a single "ntoreturn" field. This is a limitation of the wire protocol: the server does not know whether to interpret ntoreturn as a batchSize or as a limit. Usually, the server logic is the same for both meanings. However, batchSize with sort should be handled differently than limit with sort when an in-memory sort is required. In the limit with sort case, we know that we only ever need to return "ntoreturn" results, so the server performs a topK sort. This is more efficient than a full sort and requires less memory. If ntoreturn instead means batchSize, then a full sort is required. 2.4.x versions erroneously assumed that ntoreturn was a limit: the server performed a topK sort and then closed the cursor without returning further results, which is the bug reported here. In 2.6, the server does a topK sort for the first batch but leaves the cursor open; if further batches are requested by the client, it switches over to a full in-memory sort in order to return the subsequent batches.

I hope this explanation was helpful, and please feel free to reach out with further questions. To address your point above:

You are correct that this problem only applies when the plan selected involves an unindexed sort. I suspect that in your case, the server is choosing to use an index on the "fs" field rather than scanning the _id index in order to obtain the desired sort order. Can you confirm by posting the explain output for the problematic query? Best,
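A small shell illustration of the batchSize/limit distinction described above (collection name and filter are illustrative): both calls put the same value into the wire protocol's ntoreturn field, but only limit is meant to cap the total number of results.

```javascript
// limit(100): at most 100 documents total, so a topK sort of size 100 suffices.
var limited = db.ads.find({fs: "some-value"}).sort({_id: 1}).limit(100);

// batchSize(100): at most 100 documents per batch, but every matching document
// should eventually be returned, so a full sort is required. On 2.4.x the
// server treated this case like a limit and closed the cursor after one batch.
var batched = db.ads.find({fs: "some-value"}).sort({_id: 1}).batchSize(100);

print(limited.itcount());  // capped at 100 by design
print(batched.itcount());  // should equal the full matching count
```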
| Comment by Jason Liszka [ 11/Jun/14 ] |
The single-instance repro in "Steps to Reproduce" is https://jira.mongodb.org/browse/SERVER-7267, which is not what we're seeing here. That bug is about only getting one batch when you sort on a non-indexed field. If you add an index on x in your repro steps, the problem should go away. In this bug, even if you sort on an indexed field (in my example we're sorting on _id), you get one shard's worth plus one batch of documents.
| Comment by Ramon Fernandez Marina [ 11/Jun/14 ] |
Thanks for reporting this, jliszka; we can reproduce this behavior and we're investigating.
| Comment by Jason Liszka [ 11/Jun/14 ] |
Mis-pasted the cursorCount function, should be:
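A minimal sketch of a cursor-counting helper, assuming cursorCount simply iterates the cursor and tallies the documents actually returned; the body below is an assumption for illustration, not the original function:

```javascript
// Hypothetical reconstruction: count documents by exhausting the cursor, so the
// effect of batchSize on how many documents are actually returned is visible.
function cursorCount(cursor) {
    var n = 0;
    while (cursor.hasNext()) {
        cursor.next();
        n++;
    }
    return n;
}

// Example usage (illustrative names):
// cursorCount(db.ads.find({fs: "some-value"}).sort({_id: 1}).batchSize(200));
```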