[SERVER-14832] Distinct command result set may include or exclude null/undefined depending on presence of index Created: 08/Aug/14 Updated: 24/Jan/24 |
|
| Status: | Needs Scheduling |
| Project: | Core Server |
| Component/s: | Index Maintenance, Querying |
| Affects Version/s: | 2.6.0, 2.6.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Michael Narayan | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 8 |
| Labels: | indexv3, open_todo_in_code, query-44-grooming, storch | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||||||||||
| Steps To Reproduce: |
And now with an additional document:
|
||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||
| Description |
|
When distinct is called on a field for which there exists a document with an empty array in said field, it will return null in the result if there is an index on the field. |
| Comments |
| Comment by Adrien Jarthon [ 14/May/20 ] | |||||||||||||||||||||||||||||
|
Ah ok, I didn't though this would be possible:
| |||||||||||||||||||||||||||||
| Comment by Asya Kamsky [ 14/May/20 ] | |||||||||||||||||||||||||||||
|
The same problem exists for undefined - if there is a document with field present and having value undefined we would fail to return it, which would be incorrect. | |||||||||||||||||||||||||||||
| Comment by Adrien Jarthon [ 14/May/20 ] | |||||||||||||||||||||||||||||
|
Well ok for the "null" but wouldn't it work with "undefined" then? (it looks like it's "undefined" that is returned in recent versions of mongo) | |||||||||||||||||||||||||||||
| Comment by Asya Kamsky [ 14/May/20 ] | |||||||||||||||||||||||||||||
|
> if a field is missing or an array is empty no value should end-up in the distinct result, right? If the field exists and has the value null we are supposed to return null as part of the result set. This is not currently possible. If we could omit null then your workaround would work. | |||||||||||||||||||||||||||||
| Comment by Adrien Jarthon [ 13/May/20 ] | |||||||||||||||||||||||||||||
|
Thanks for the confirmation! I'm sure I understand why it would require a new index format though, can't you filter out the null or undefined entries when scanning the index for a distinct query? or even after the fact before returning the response to the client? Because as far as I understand there's no cases where we expect the distinct query to return such values? if a field is missing or an array is empty no value should end-up in the distinct result, right? | |||||||||||||||||||||||||||||
| Comment by David Percy [ 13/May/20 ] | |||||||||||||||||||||||||||||
|
This is a bug: indexes shouldn't change the meaning of a query. Unfortunately it's not an easy fix, because our current index format has some ambiguities. (It uses null to represent missing fields, and multikey indexes use undefined to represent "zero items" for empty arrays.) We don't want to disable DISTINCT_SCAN plans, so fixing this will require us to introduce a new index format. | |||||||||||||||||||||||||||||
| Comment by Adrien Jarthon [ 06/May/20 ] | |||||||||||||||||||||||||||||
|
Just so we know, is this a bug or a feature? shall we expect this behavior to change or just patch our code to handle this case? So far I wrote a Mongoid (6.1.1) patch to circumvent this:
| |||||||||||||||||||||||||||||
| Comment by Robert Weissmann [ 20/Jul/18 ] | |||||||||||||||||||||||||||||
|
Please fix this. Don't get the difficulty. Thanks. | |||||||||||||||||||||||||||||
| Comment by Michael Smith [ 29/Jan/18 ] | |||||||||||||||||||||||||||||
|
This occurs for string fields, too, in 3.6.2:
In the second query, where an index is used, MongoDB returns a null for the distinct query. | |||||||||||||||||||||||||||||
| Comment by Tom Grossman [ 10/Oct/16 ] | |||||||||||||||||||||||||||||
|
latest version (3.2) returns `undefined`, the issue still exists. | |||||||||||||||||||||||||||||
| Comment by Asya Kamsky [ 19/Apr/16 ] | |||||||||||||||||||||||||||||
|
Current master now returns [ undefined ] rather than null when there's an index. | |||||||||||||||||||||||||||||
| Comment by Max Jacobson [ 04/Jan/16 ] | |||||||||||||||||||||||||||||
|
Hello! I'm observing exactly the issue reported by Michal Narayan. This is using MongoDB 3.0.7 Here's some output:
Kind of mysterious | |||||||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 08/Aug/14 ] | |||||||||||||||||||||||||||||
|
Hmmm, I can no longer find the session where I tried 2.7.4, but I can indeed reproduce this behavior in 2.7.4 as well so it must have been pilot error on my side – apologies for the confusion. | |||||||||||||||||||||||||||||
| Comment by Michael Narayan [ 08/Aug/14 ] | |||||||||||||||||||||||||||||
|
I should probably also add that the null that appears is not a "standard" one (I initially noticed the issue because it was breaking a BSON load in Ruby Moped):
| |||||||||||||||||||||||||||||
| Comment by Michael Narayan [ 08/Aug/14 ] | |||||||||||||||||||||||||||||
|
I'm actually still seeing this on 2.7.4:
as well as the latest nightly (9d1e5f95a2739a7672b697deb7d1da04ca8c4117). | |||||||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 08/Aug/14 ] | |||||||||||||||||||||||||||||
|
mnarayan01, thanks for reporting this. We're able to reproduce the behavior you describe in 2.6.3, but the issue has already been fixed in the latest development version (2.7.4). Regards, |