[SERVER-14519] $and $ne performance degradation in 2.6 Created: 10/Jul/14 Updated: 10/Dec/14 Resolved: 15/Jul/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance, Querying |
| Affects Version/s: | 2.6.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Andrew Ryder (Inactive) | Assignee: | David Storch |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Operating System: | ALL |
| Description |
|
Given this query:
Where tags is an indexed array field, the query is much slower in 2.6.3 than in 2.4.10. I have attached a JS script that reproduces my test. I tested on a basic PSS replica set with 1 million documents, each holding a pseudo-random set of 5 values (single letters of the alphabet, as strings) in the indexed tags array.
Sample output in 2.4.10:
Sample output in 2.6.3:
Hinting $natural performs better on 2.6.3 (for me). |
| Comments |
| Comment by a zhifan [ 11/Sep/14 ] |
|
Hi Dave: |
| Comment by David Storch [ 10/Sep/14 ] |
|
Hi zhifan, We set "Fix version" to indicate when a patch is scheduled to go in, or to indicate the version that contains the fix. This ticket has no "Fix version" because it was resolved as a duplicate of SERVER-12281 rather than resolved as "Fixed". I hope that clarifies things! Best, |
| Comment by a zhifan [ 10/Sep/14 ] |
|
I want to check JIRA issues before picking up a version to use in production. Can you tell me how to understand "Status":"Resolved" and "Fix version":"None"? |
| Comment by David Storch [ 15/Jul/14 ] |
|
This is a case of multiple predicates over a multikey field. Only one of these predicates can be used for building index bounds (combining bounds leads to incorrect query results for multikey indices). The {tags: "A"} predicate has bounds ["A", "A"] whereas the {tags: {$ne: "B"}} predicate has bounds [MinKey, "B"), ("B", MaxKey].
In 2.4 there was logic which tried to pick the "smallest" bounds, which is why 2.4 always uses the bounds ["A", "A"] instead of [MinKey, "B"), ("B", MaxKey]: https://github.com/mongodb/mongo/blob/v2.4/src/mongo/db/queryutil.cpp#L556-L565
There is an outstanding ticket for introducing similar behavior in 2.6: see SERVER-12281. I'm going to resolve this ticket as a duplicate, but we will revisit the scheduling of SERVER-12281. |
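To illustrate why bounds from two predicates cannot simply be intersected on a multikey index, here is a minimal sketch in plain JavaScript. It is a toy model, not server code: the field name `x`, the sample documents, and the helpers `buildMultikeyIndex` and `matches` are all hypothetical. It uses a numeric range pair ({x: {$gt: 5, $lt: 3}}) because that makes the problem easy to see: different array elements may satisfy different predicates.

```javascript
// Toy multikey index: one (key, id) entry per array element, sorted by key.
// This only models the idea; it is not how the server stores indexes.
function buildMultikeyIndex(docs, field) {
  const entries = [];
  for (const doc of docs) {
    for (const key of doc[field]) {
      entries.push({ key, id: doc._id });
    }
  }
  return entries.sort((a, b) => a.key - b.key);
}

// Array match semantics for {x: {$gt: 5, $lt: 3}}: each predicate may be
// satisfied by a different element of the array.
function matches(doc) {
  return doc.x.some((v) => v > 5) && doc.x.some((v) => v < 3);
}

// Hypothetical sample data.
const docs = [
  { _id: 1, x: [1, 10] }, // 10 satisfies $gt 5, 1 satisfies $lt 3
  { _id: 2, x: [4] },     // satisfies neither predicate
];
const index = buildMultikeyIndex(docs, "x");

// Intersecting the two bounds gives the empty interval (5, 3),
// so the scan returns nothing and document 1 is wrongly missed.
const viaIntersection = index
  .filter((e) => e.key > 5 && e.key < 3)
  .map((e) => e.id);

// Using one predicate's bounds, (5, MaxKey], and then applying the full
// filter to each fetched document gives the correct answer.
const candidateIds = [...new Set(index.filter((e) => e.key > 5).map((e) => e.id))];
const viaOneBound = candidateIds.filter((id) =>
  matches(docs.find((d) => d._id === id))
);

console.log("intersected bounds:", viaIntersection);
console.log("single predicate bounds + filter:", viaOneBound);
```

In this toy model the intersected bounds miss document 1 even though it matches the query, which is why only one predicate's bounds are used and the other predicate is applied as a filter afterwards.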
| Comment by Scott Hernandez (Inactive) [ 11/Jul/14 ] |
|
Since you included a js test file the repro steps aren't really needed; a js file is almost always preferred, and if it can be run to verify behavior, even better (by including asserts to validate that, like we do in all jstests). |
| Comment by Andrew Ryder (Inactive) [ 11/Jul/14 ] |
Nothing. All ops are zero, before and after. The only thing running at all is the query. Effect survives restarts, switching back & forth between versions, and index rebuilds on both versions. Easily reproducible. Did you intend to delete the provided "steps to reproduce" by the way? Just so I know in future if this is something useful to provide or not. |
| Comment by Scott Hernandez (Inactive) [ 10/Jul/14 ] |
|
Did you notice the yields? What else is going on during these queries? It looks like this may just be that we use the $ne predicate instead of the equality for the index traversal, which causes us to scan the whole index. If I had to guess, this could be due to query canonicalization sorting the predicates, which makes the $ne the first one evaluated. |
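As a rough illustration of why traversing the index on the $ne predicate is so much more expensive than on the equality, the sketch below counts how many index entries fall inside each predicate's bounds. It is plain JavaScript over a hand-made toy index; the four sample documents and the `keysExamined` helper are hypothetical and far smaller than the 1 million documents in the report.

```javascript
// Hypothetical sample data: each document has a small tags array.
const docs = [
  { _id: 1, tags: ["A", "C"] },
  { _id: 2, tags: ["B", "D"] },
  { _id: 3, tags: ["A", "B"] },
  { _id: 4, tags: ["C", "E"] },
];

// Toy multikey index on tags: one entry per array element.
const indexEntries = docs.flatMap((d) =>
  d.tags.map((t) => ({ key: t, id: d._id }))
);

// Bounds modelled as predicates over index keys.
const equalityBounds = (k) => k === "A"; // ["A", "A"]
const neBounds = (k) => k !== "B";       // [MinKey, "B"), ("B", MaxKey]

// Count the index entries a scan over the given bounds would touch.
function keysExamined(inBounds) {
  return indexEntries.filter((e) => inBounds(e.key)).length;
}

const examinedWithEquality = keysExamined(equalityBounds);
const examinedWithNe = keysExamined(neBounds);

console.log("entries in index:", indexEntries.length);
console.log("examined with equality bounds:", examinedWithEquality);
console.log("examined with $ne bounds:", examinedWithNe);
```

With these sample documents the equality bounds touch 2 of the 8 index entries, while the $ne bounds touch 6, every entry except the two "B" keys. With 26 possible tag values, as in the report, the $ne bounds would cover nearly the whole index.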