[SERVER-67378] Index null values and missing values differently Created: 18/Jun/22  Updated: 28/Jun/22  Resolved: 22/Jun/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sergei Kuzmin Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-12869 Index null values and missing values ... Backlog
Related
related to SERVER-54811 Bad query plans for $exists:true queries Backlog
is related to SERVER-12869 Index null values and missing values ... Backlog
Operating System: ALL
Participants:

 Description   

There is already a very old existing issue: SERVER-12869. The original reporter is already inactive. This issue is an attempt to reach the MongoDB developers and try to stress out that the issue effectively disallows use of index `{$exists: true}` operation. And as such it is a significant bug. Not being able to use index for $exists comes at surprise for developers and many don't even notice an issue until there is enough data, or until explicitly checking the explain() results. Even with explain() the cause of behavior was not understood until I made an assumption that the index doesn't distinguish between nulls and non-existing items and then confirmed that experimentally.
For some reason I cannot tag specific MongoDB developers in this Jira server so have to create the ticket to attract attention to my comments on the mentioned ticket. I've provided the potential (pretty obvious) solution there, and propose to discuss viability in the original ticket.



 Comments   
Comment by Sergei Kuzmin [ 28/Jun/22 ]

Ping on that.

Comment by Sergei Kuzmin [ 22/Jun/22 ]

Could we discuss approach viability and do some cost estimation instead? Preferably in the original ticket. Checking for field existence is a very common situation and I emphasize that opposing to what the original ticket says, {$exist: true} does not make use of the index at all. This should be at the very least documented.

Would it be feasible to reserve BSON value for non-existing items just for index purposes?
We would need an index file version bump. Is there an automatic index file migration for such cases? Is query planner intended to work with multiple index file versions at the same time?

Comment by Chris Kelly [ 22/Jun/22 ]

Hi Sergei,

From my understanding, SERVER-12869 is going to be a major undertaking requiring a new index version which has to be scheduled against all the other large projects that the query team has in the backlog. That being said, this has been reported pretty often as an issue.

I appreciate you taking the time to share more context around the issue in that ticket, and will follow up on it for more info. I'll go ahead and close this ticket though since it's a duplicate.

Christopher

 

Generated at Thu Feb 08 06:07:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.