[SERVER-14662] Positional projection queries (and positional update ops) should fail with error if multiple arrays encountered Created: 23/Jul/14 Updated: 03/Jun/19 Resolved: 28/Jul/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | João Acabado | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Won't Fix | Votes: | 5 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Sprint: | Query 10 (02/22/16) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
Running the following commands on 2.6 and 2.4 gives different output when running following commands:
If we remove the "$exists" : true criteria the output will be the same. Is this change documented anywhere? There could be compatibility issues when upgrading. I'm attaching a screen shot where I reproduced this behavior. |
| Comments |
| Comment by Ramon Fernandez Marina [ 28/Jul/18 ] | ||||||||||||||||||||||||||||
|
I'm updating the metadata on this ticket to close it as "Won't Fix", since the changes in it were later reverted and never made it into a stable MongoDB version. SERVER-18500 remains open to discuss and implement improvements in this area. See also the documentation on using arrayFilters for more information. Regards, | ||||||||||||||||||||||||||||
| Comment by Githook User [ 17/Nov/16 ] | ||||||||||||||||||||||||||||
|
Author: {u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}Message: Revert " This reverts commit 2eea3f09aea9c92df9aa0d4e47840869bf04d7b8. | ||||||||||||||||||||||||||||
| Comment by Githook User [ 17/Nov/16 ] | ||||||||||||||||||||||||||||
|
Author: {u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}Message: Revert " This reverts commit 0df1eabb730f54c72dd545fc2b5f5c7f630e7f3c. | ||||||||||||||||||||||||||||
| Comment by Githook User [ 17/Nov/16 ] | ||||||||||||||||||||||||||||
|
Author: {u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}Message: Revert " This reverts commit 2eea3f09aea9c92df9aa0d4e47840869bf04d7b8. | ||||||||||||||||||||||||||||
| Comment by Githook User [ 17/Nov/16 ] | ||||||||||||||||||||||||||||
|
Author: {u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}Message: Revert " This reverts commit 0df1eabb730f54c72dd545fc2b5f5c7f630e7f3c. | ||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 17/Nov/16 ] | ||||||||||||||||||||||||||||
|
This change produced undesirable effects, and will be reverted in | ||||||||||||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 10/Feb/16 ] | ||||||||||||||||||||||||||||
|
If a positional operator appears in a projection or update, and there were multiple array matches against the document, we will issue an error. Example:
Previously:
Now:
Example:
Previously:
Now:
| ||||||||||||||||||||||||||||
| Comment by Githook User [ 10/Feb/16 ] | ||||||||||||||||||||||||||||
|
Author: {u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}Message: | ||||||||||||||||||||||||||||
| Comment by Githook User [ 10/Feb/16 ] | ||||||||||||||||||||||||||||
|
Author: {u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}Message: | ||||||||||||||||||||||||||||
| Comment by J Rassi [ 15/May/15 ] | ||||||||||||||||||||||||||||
I'm not convinced that the positional operator can be made to have intuitive semantics for the example quoted above. | ||||||||||||||||||||||||||||
| Comment by NOVALUE Mitar [ 15/May/15 ] | ||||||||||||||||||||||||||||
|
But why then even do this, if we can simply fix this for all cases and have pretty simple and expected semantics? | ||||||||||||||||||||||||||||
| Comment by J Rassi [ 15/May/15 ] | ||||||||||||||||||||||||||||
|
mitar: improving semantics for these operations is being tracked at SERVER-18500. | ||||||||||||||||||||||||||||
| Comment by NOVALUE Mitar [ 15/May/15 ] | ||||||||||||||||||||||||||||
|
Why fail? There is clear semantics which could be done here? | ||||||||||||||||||||||||||||
| Comment by NOVALUE Mitar [ 11/Apr/15 ] | ||||||||||||||||||||||||||||
|
Moreover, it is not that multiple predicated contradict. But they specify their own conditions. It is like in any boolean logic AND clauses. A AND B does not mean that A and B contradict, but that both A and B has to be true. If multiple predicates would contradict, then there would be a no matching document to begin with. Even when A is stricter (matches less) than B, that is not a contradiction. They obviously do not contradict because there is a non-empty set of documents (with the first one in that set) of where they overlap. You maybe mean that there is multiple values they return as a position and those values are not all the same. But that is normal, because each sub-predicate returns the value for themselves. But for the predicate as a whole then those values have to be combined. And it is so that the stricter one is the one which is the result of an AND top-level predicate. In arrays it is easy to know which is the stricter, the latter. | ||||||||||||||||||||||||||||
| Comment by NOVALUE Mitar [ 11/Apr/15 ] | ||||||||||||||||||||||||||||
|
After more thought I think this is a bug, not a feature request. $ should be returning the position of an element which satisfied all conditions, not an arbitrary condition. Because also selecting is done based on all conditions (when there is AND between predicates). I think the fact that you internally split combined AND predicate into smaller predicates is not a reason to not then compute what is the position of an overall AND predicate. You want $ to represent the first match of all predicates. Currently you are not returning that. So this is a bug? And luckily for arrays it seems there is an easy way to compute that position. You just take max between all sub-predicates. Even with OR query there is a reasonable position to return if $or matches in order, like in most programming languages does. Then the first predicate with matches also has an associated position. But I think that in Mongo you do not have a defined order for $or and short-circuiting? | ||||||||||||||||||||||||||||
| Comment by NOVALUE Mitar [ 11/Apr/15 ] | ||||||||||||||||||||||||||||
|
As I reported in
The first nested.required._id is duplicated by one in $elemMatch. The issue is that if such queries are made automatically (like they are in my case) then it is hard to detect such things in advance. Wouldn't a simple semantic always be that the highest position should win? So the highest index number is the one $ uses? Because if you are traversal the array the later are you in the array the more specific you are in some way. If one predicate can determine the value by just looking at the first value of the array, and another by looking at 100 values, then the second one wins. As I understand, you already ask all predicates for their $ value. So you should just add code which picks the largest number between two. Not the second number. So semantics could be: $ corresponds to the first document which matched all the predicates, which mean the location with the highest index number, because if there was a match earlier, then some other highest index number would be the $ position. I think this is pretty like one expect and it does not surprise. Also, in | ||||||||||||||||||||||||||||
| Comment by J Rassi [ 16/Jan/15 ] | ||||||||||||||||||||||||||||
|
To reiterate Thomas above: when multiple expressions in the query predicate perform implicit array matches, the value of the positional operator is not specified. This is documented in the reference page for the positional operator, but users that have not read the reference page will understandably be surprised when the element they are intending to project/update does not match with the element actually projected/updated. The most straightforward way to address this issue in the server is to fail these operations with an error. As a more ambitious approach, we could actually define the semantics for these operations. For example, we could determine which match expression should be generating the positional operator's value by using prefix matching on the path containing the positional operator. It's not always possible to disambiguate in these cases, but we could generate an error for cases when the ambiguity can't be resolved. Example of a case where the ambiguity can be resolved:
Example of a case where the ambiguity can't be resolved:
| ||||||||||||||||||||||||||||
| Comment by João Acabado [ 20/Nov/14 ] | ||||||||||||||||||||||||||||
|
rc2.8 still has this behavior. Migrations from 2.4 to rc2.8 will suffer from this. | ||||||||||||||||||||||||||||
| Comment by David Storch [ 08/Aug/14 ] | ||||||||||||||||||||||||||||
|
As described in
One might expect the second update() above to unset the slug field from the second array element. However, there are contradictory predicates over "nested.a", causing the update to be applied to the first array element rather than the second. | ||||||||||||||||||||||||||||
| Comment by Thomas Rueckstiess [ 06/Aug/14 ] | ||||||||||||||||||||||||||||
|
Hi João, Thanks, we're following your suggestion and are going to update the documentation to make the interaction of $elemMatch and positional projection clearer. A good starting point in the source code for the implementation of projection is here: https://github.com/mongodb/mongo/blob/35f827aef4ddfcf9acb9e4b90cb200ff29183b7c/src/mongo/db/exec/projection_exec.cpp#L237 We're rephrasing this ticket to track the fact that the positional operator semantics are not well-defined and keep it open as a feature request. Regards, | ||||||||||||||||||||||||||||
| Comment by João Acabado [ 04/Aug/14 ] | ||||||||||||||||||||||||||||
|
Thanks a lot for your explanation, we went with the little change. From a user point of view I would suggest to clarify the documentation regarding the criteria on the find command. I would have avoided this had I known if the $elemMatch implies the $exists but from the little I read on the $ operator and the find I thought it would return me some element that matched all the criteria. For curiosity, could point me to the source code where this is implemented? I fiddled around your solution but didn't manage to get to it. | ||||||||||||||||||||||||||||
| Comment by Thomas Rueckstiess [ 01/Aug/14 ] | ||||||||||||||||||||||||||||
|
Hi João, Thanks for reporting this. We have re-written the query engine in 2.6 and there have been some changes. You are correct in that the two versions return different results; this is due to the fact that the projection language in such a case is not well-defined, because the two predicates provide contradicting behavior. This has been the case for both 2.4 and 2.6. They each return somewhat arbitrarily one of the array elements, they just happen to be different in both versions due to implementation changes. I'd expect there to be many other similar cases where the results in both versions happen to be the same. I'd like to explain some more what is going on (specifically in the new implementation in 2.6) When the query engine gets a projection query like the one you propose
it turns the query into a match expression tree, that would look like this:
The tree always gets sorted into a canonical form, and the EXISTS node happens to be at the end. After evaluating the match expression, the projection stage queries each of the leaves (the yellow nodes in the graph: EQ, EQ inside ELEMMATCH, EXISTS) for the position of the match (the value of $). It just happens that the EXISTS is asked at the end, and because the existence of "Items" can be determined after looking at the first element ($ = 0) of the array, it returns 0 and overwrites all previous $ values, including $=1 from the EQ inside the ELEMMATCH. We are discussing internally if this special case of providing contradicting information is something the query planner should be able to handle and if there is a "correct" solution. For now, I would suggest the workaround is to not provide conflicting predicates in terms of the position operator. You can for example just remove the {$exists: true} clause from your query, because the $elemMatch clause implies that the element exists. This would return the result that I think you're expecting:
We'll update this ticket when we made a decision whether the query planner should break the tie when two predicates contradict in their positional $ value. Regards, |