[SERVER-19266] An error document is returned with result set Created: 02/Jul/15 Updated: 17/Mar/16 Resolved: 17/Dec/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying, Sharding |
| Affects Version/s: | 2.6.10, 3.0.4 |
| Fix Version/s: | 2.6.12, 3.0.9 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Steven Hand | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Completed: | |
| Steps To Reproduce: | Import Enron message data. On a sharded cluster
|
| Sprint: | Sharding 9 (09/18/15), Sharding D (12/11/15), Sharding E (01/08/16) |
| Participants: | |
| Description |
|
A query against a sufficiently large sharded collection that sorts on an unindexed field and uses a sufficiently small batch size returns a result set containing batch-size documents plus one error document from each shard that returns results. |
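A client-side check for the symptom described above could look like the following minimal sketch. It assumes the legacy convention that a failed query yields a document with a top-level `$err` field. The helper name `split_error_documents` and the sample documents are hypothetical and do not come from the ticket.

```python
from typing import Any, Dict, List, Tuple

Document = Dict[str, Any]


def split_error_documents(
    batch: List[Document],
) -> Tuple[List[Document], List[Document]]:
    """Separate ordinary documents from error documents in one batch.

    Assumption: an error document is identified by a top-level "$err" key.
    """
    results = [doc for doc in batch if "$err" not in doc]
    errors = [doc for doc in batch if "$err" in doc]
    return results, errors


# Hypothetical batch: two real documents followed by one shard's error document.
batch = [
    {"_id": 1, "subject": "a"},
    {"_id": 2, "subject": "b"},
    {"$err": "example error message"},
]
results, errors = split_error_documents(batch)
```

A caller that finds a non-empty `errors` list can treat the whole batch as failed rather than handing the error document to the application as data.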
| Comments |
| Comment by Githook User [ 17/Dec/15 ] |
|
Author: Randolph Tan (renctan, randolph@10gen.com) Message: Make sure that the error result flag is set when receiving one from one of the shards during GET_MORE. (cherry picked from commit eb79bd1fe807512e54c68c5982b2a24fa1d66bba) Conflicts: |
| Comment by Githook User [ 16/Dec/15 ] |
|
Author: Randolph Tan (renctan, randolph@10gen.com) Message: Make sure that the error result flag is set when receiving one from one of the shards during GET_MORE. |
| Comment by Andy Schwerin [ 07/Oct/15 ] |
|
I have confirmed that this bug no longer occurs on master as a result of SERVER-15176. The error message for failed find commands leaves something to be desired, but that is a separate issue. |
| Comment by Randolph Tan [ 17/Sep/15 ] | |||||||||||||||||||||
|
TODO:
| |||||||||||||||||||||
| Comment by David Storch [ 05/Aug/15 ] |
|
We expect this problem to be solved by the introduction of the find and getMore commands (see linked ticket SERVER-15176). |
| Comment by Randolph Tan [ 08/Jul/15 ] |
|
schwerin, the getMore command is not yet implemented in mongos, so I will describe what needs to change in the code base as of v3.1.5: 1. ShardedClientCursor needs to keep track of the result flags in the responses from the different shards and, at a minimum, aggregate the error flag. It should then expose this information so that mongos can properly propagate the result flags to users. |
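The aggregation step described in this comment can be sketched roughly as follows. This is an illustrative Python model, not the actual C++ change: `ShardReply` and `aggregate_error_flag` are hypothetical names, and the only detail taken from the wire protocol is that bit 1 of the OP_REPLY `responseFlags` field signals QueryFailure.

```python
from dataclasses import dataclass
from typing import Iterable

# Bit 1 of the OP_REPLY responseFlags field signals QueryFailure.
QUERY_FAILURE = 1 << 1


@dataclass
class ShardReply:
    """Hypothetical stand-in for one shard's reply to a getMore."""

    shard: str
    response_flags: int
    n_returned: int


def aggregate_error_flag(replies: Iterable[ShardReply]) -> int:
    """Return QUERY_FAILURE if any shard reported it, otherwise 0."""
    merged = 0
    for reply in replies:
        # Keep only the error bit; other flags are not merged in this sketch.
        merged |= reply.response_flags & QUERY_FAILURE
    return merged


# Hypothetical replies: shard1 reports a failure, shard0 does not.
replies = [
    ShardReply("shard0", 0, 5),
    ShardReply("shard1", QUERY_FAILURE, 1),
]
merged_flags = aggregate_error_flag(replies)
```

ORing the bit across replies means a single failing shard is enough to mark the merged response as failed, which is the "at a minimum" behavior the comment asks for.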
| Comment by Andy Schwerin [ 08/Jul/15 ] |
|
renctan, how involved is the fix? If we upconvert OP_GETMORE to the getMore command on master, we'll be using a different code path. It would be good to get a minimal regression JS test written. |
| Comment by Randolph Tan [ 07/Jul/15 ] |
|
Note: the error happens only during getMore because of the nToReturn hack, which generates a plan with a top-k sort ORed with the normal plan. The initial batch uses the top-k sort, and execution switches to the normal plan once it becomes clear that the query actually requires more than k results. |
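Why the failure can surface only on the second batch can be illustrated with a toy model. The sketch below is not the server's planner: it assumes the normal plan is a full in-memory sort that fails when it exceeds a budget (the ticket does not state the exact failure), and every name in it (`first_batch_top_k`, `later_batch_full_sort`, `SortLimitExceeded`, `max_docs_in_memory`) is hypothetical.

```python
import heapq
from typing import Any, Dict, List

Document = Dict[str, Any]


class SortLimitExceeded(Exception):
    """Hypothetical error for a full sort that exceeds its budget."""


def first_batch_top_k(docs: List[Document], key: str, k: int) -> List[Document]:
    """Top-k sort: only the k smallest documents are kept and ordered."""
    return heapq.nsmallest(k, docs, key=lambda d: d[key])


def later_batch_full_sort(
    docs: List[Document], key: str, skip: int, k: int, max_docs_in_memory: int
) -> List[Document]:
    """Normal plan: sort everything, then return the next k after skip."""
    if len(docs) > max_docs_in_memory:
        raise SortLimitExceeded("full sort exceeds the assumed memory budget")
    ordered = sorted(docs, key=lambda d: d[key])
    return ordered[skip:skip + k]


docs = [{"n": i} for i in range(10, 0, -1)]

# The initial batch succeeds because it only tracks k documents.
first = first_batch_top_k(docs, "n", k=3)

# The follow-up batch needs the full sort, which fails under a small budget.
try:
    later_batch_full_sort(docs, "n", skip=3, k=3, max_docs_in_memory=5)
    second_batch_failed = False
except SortLimitExceeded:
    second_batch_failed = True
```

In this model the first request returns a full batch without error, and the error appears only when the follow-up request forces the more expensive plan, which matches the timing described in the note.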
| Comment by Randolph Tan [ 07/Jul/15 ] |
|
It looks like mongod sets the error flag, but mongos ignores it: https://github.com/mongodb/mongo/blob/r3.1.5/src/mongo/s/strategy.cpp#L625 On the other hand, the error flag is set on a single-shard getMore response, since mongos simply passes along what it got from the shard: https://github.com/mongodb/mongo/blob/r3.1.5/src/mongo/s/strategy.cpp#L593 |