[SERVER-65202] Undefined in local matches to empty array in foreign if INLJ is used in SBE Created: 01/Apr/22  Updated: 29/Oct/23  Resolved: 11/Apr/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Irina Yatsenko (Inactive) Assignee: Rui Liu
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: QE 2022-04-18
Participants:

 Description   

db.b.find()
{ "_id" : ObjectId("62477ced76646eaf6616e9e3"), "k" : undefined }
 
db.a.find()
{ "_id" : ObjectId("62477ce276646eaf6616e9e1"), "k" : [ ] }
 
// no index on collection "a" -- no matches on "undefined"
db.b.aggregate({$lookup:{from:"a", localField:"k", foreignField:"k", as:"matched"}})
{ "_id" : ObjectId("62477ced76646eaf6616e9e3"), "k" : undefined, "matched" : [ ] }
 
// after creating the index, "undefined" starts matching empty arrays
db.a.createIndex({k:1})
db.b.aggregate({$lookup:{from:"a", localField:"k", foreignField:"k", as:"matched"}})
{ "_id" : ObjectId("62477ced76646eaf6616e9e3"), "k" : undefined, "matched" : [ { "_id" : ObjectId("62477ce276646eaf6616e9e1"), "k" : [ ] } ] }

While we've been given the green light to do "whatever" for undefined, it's a little concerning that the results aren't consistent across the join types: the same $lookup returns no match without the index and a match once the index exists.



 Comments   
Comment by Githook User [ 11/Apr/22 ]

Author: Rui Liu <rui.liu@mongodb.com> (lriuui0x0)

Message: SERVER-65202 Fix inconsistency of matching undefined with different join algorithms
Branch: master
https://github.com/mongodb/mongo/commit/a3f6a5a1aa650fbb4b163bd47512469158b49a50

Comment by Rui Liu [ 07/Apr/22 ]

irina.yatsenko Yeah I also found that. We're trying to find `Nothing` in a set of only `bsonUndefined`. According to this function, they hash to the same value. I'm not sure if that's expected. Also the equality check shows they are the same as well.

Comment by Irina Yatsenko (Inactive) [ 07/Apr/22 ]

The problem seems to be in `ByteCode::genericIsMember()`. There we get:

+p values
$1 = (const mongo::sbe::value::DeepEqualityHashSet<std::pair<mongo::sbe::value::TypeTags, unsigned long>, mongo::sbe::value::ValueHash, mongo::sbe::value::ValueEq, std::allocator<std::pair<mongo::sbe::value::TypeTags, unsigned long> > > &) @0x5634db965400: {
  _values = absl::flat_hash_set<std::pair<mongo::sbe::value::TypeTags, unsigned long>> with 1 elems  = {{
      first = mongo::sbe::value::TypeTags::bsonUndefined,
      second = 0
    }}
}
+p lhsTag
$2 = mongo::sbe::value::TypeTags::Nothing

but

value::bitcastFrom<bool>(values.find({lhsTag, lhsVal}) != values.end())

evaluates to "true".
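
For illustration only, here is a minimal sketch of how a hash-set lookup for a `Nothing`-tagged value can hit an entry tagged `bsonUndefined`. It is not the server's `ValueHash`/`ValueEq` code. It assumes a hasher and an equality functor that put both tags in one equivalence class, which is what the comments in this ticket report observing. `LooseHash`, `LooseEq`, `StrictHash`, `StrictEq`, `isValueless` and the reduced `TypeTags` enum are hypothetical names, and `std::unordered_set` stands in for `absl::flat_hash_set`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <utility>

// Hypothetical, reduced stand-in for the SBE type tags seen in the debugger output.
enum class TypeTags : std::uint8_t { Nothing, bsonUndefined, NumberInt64 };

using TaggedValue = std::pair<TypeTags, std::uint64_t>;

// Assumption: both payload-free tags fall into the same equivalence class.
inline bool isValueless(TypeTags tag) {
    return tag == TypeTags::Nothing || tag == TypeTags::bsonUndefined;
}

// Tag-aware hash shared by both variants for ordinary values.
inline std::size_t hashTagged(const TaggedValue& v) {
    return std::hash<std::uint64_t>{}(v.second) ^
        (static_cast<std::size_t>(v.first) << 1);
}

// "Loose" functors: Nothing and bsonUndefined hash alike and compare equal.
struct LooseHash {
    std::size_t operator()(const TaggedValue& v) const {
        return isValueless(v.first) ? 0 : hashTagged(v);
    }
};
struct LooseEq {
    bool operator()(const TaggedValue& a, const TaggedValue& b) const {
        if (isValueless(a.first) && isValueless(b.first)) return true;
        return a.first == b.first && a.second == b.second;
    }
};

// "Strict" functors: the tag always takes part in hashing and equality.
struct StrictHash {
    std::size_t operator()(const TaggedValue& v) const { return hashTagged(v); }
};
struct StrictEq {
    bool operator()(const TaggedValue& a, const TaggedValue& b) const {
        return a.first == b.first && a.second == b.second;
    }
};

// Same shape as the membership expression quoted above: find() != end().
template <class Hash, class Eq>
bool isMember(const std::unordered_set<TaggedValue, Hash, Eq>& values,
              TaggedValue needle) {
    return values.find(needle) != values.end();
}

// Set holds only {bsonUndefined, 0}; the probe is {Nothing, 0}.
bool looseFindsNothing() {
    std::unordered_set<TaggedValue, LooseHash, LooseEq> values;
    values.emplace(TypeTags::bsonUndefined, 0);
    return isMember(values, {TypeTags::Nothing, 0});
}

bool strictFindsNothing() {
    std::unordered_set<TaggedValue, StrictHash, StrictEq> values;
    values.emplace(TypeTags::bsonUndefined, 0);
    return isMember(values, {TypeTags::Nothing, 0});
}

void demo() {
    assert(looseFindsNothing());    // Nothing "is a member" of {bsonUndefined}
    assert(!strictFindsNothing());  // tag-aware comparison rejects the probe
}
```

With the loose functors the probe matches, mirroring the `true` result above. With the strict ones it does not. Which of the two behaviours the server intends for these tags is not stated in this ticket.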
