Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-23229

Projection incorrectly returns null value instead of missing field to user if projection is covered

    • Query Optimization
    • ALL
    • v8.0
    • QO 2022-07-11
    • 200

      If a document is missing a certain field, then covered projections on that field will incorrectly report the field as having value null.

      Reproduce as follows:

      > db.foo.drop()
      true
      > db.foo.insert({a:1})
      WriteResult({ "nInserted" : 1 })
      > db.foo.find({a:1},{b:1,_id:0})
      {  } // Nothing returned: expected.
      > db.foo.ensureIndex({a:1,b:1})
      {
      	"createdCollectionAutomatically" : false,
      	"numIndexesBefore" : 1,
      	"numIndexesAfter" : 2,
      	"ok" : 1
      }
      > db.foo.find({a:1},{b:1,_id:0})
      { "b" : null } // Null returned: unexpected.
      

      Fixing this issue in such a way that is not backwards-breaking for existing indexes will be difficult. This issue is caused by the fact that the index key generation process generates identical keys for missing values and null values.

      This also causes unintuitive behavior for features that use covered projections, such as $group (see below for details).

      Original description:

      On 3.2.0-rc2 and master, a $group stage with an index does not distinguish between a null and missing value, likely because the index uses a null when a value is missing. Once SERVER-4507 and SERVER-23099 are implemented, this behavior may change.

      > db.foo.insert({a: 1, b: null})
      WriteResult({ "nInserted" : 1 })
      > db.foo.insert({a: null, b: 1})
      WriteResult({ "nInserted" : 1 })
      > db.foo.insert({b: 1})
      WriteResult({ "nInserted" : 1 })
      > db.foo.insert({a: 1})
      WriteResult({ "nInserted" : 1 })
      > db.foo.aggregate({$group: {_id: {a: "$a", b: "$b"}}})
      { "_id" : { "b" : 1 } }
      { "_id" : { "a" : null, "b" : 1 } }
      { "_id" : { "a" : 1 } }
      { "_id" : { "a" : 1, "b" : null } }
      > db.foo.createIndex({a: 1, b: 1})
      {
              "createdCollectionAutomatically" : false,
              "numIndexesBefore" : 1,
              "numIndexesAfter" : 2,
              "ok" : 1
      }
      > db.foo.aggregate({$sort: {a: 1, b: 1}}, {$group: {_id: {a: "$a", b: "$b"}}})
      { "_id" : { "a" : 1, "b" : null } }
      { "_id" : { "a" : null, "b" : 1 } }
      

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            benjamin.murphy Benjamin Murphy
            Votes:
            4 Vote for this issue
            Watchers:
            30 Start watching this issue

              Created:
              Updated: