Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-21992

Inconsistent results when grouping by possibly missing values

    XMLWordPrintable

    Details

    • Operating System:
      ALL
    • Sprint:
      Query 11 (03/14/16)

      Description

      My database contains the following documents:

      use test;
      db.test.insert(
      	{
      	  	'name' : 'first',
      	  	'color' : 'blue',
      	  	'type' : 'break'
      	}
      )
       
      db.test.insert(
      	{
      	  	'name' : 'second',
      	  	'color' : 'red',
      	  	//'type' : 'break'
      	}
      )
      

      Note that the second document does not have a type field, it's missing. The aggregation pipeline produces inconsistent results sometimes showing 'type' :null in the id field, and sometimes not.

      For example, the following aggregation does NOT return the missing type in the key:

      db.test.aggregate([
      	{
      	  	$group : {
      	  	  	'_id' : {
      	  	  	 	'name' : '$name',
      	  	  	 	'type' : '$type',
      	  	  	 	'color' : '$color' 	
      	  	  	},
      	  	  	'count' : {$sum : 1}
      	  	}
      	  	
      	},
      	{
      	  	$group : {
      	  	  	'_id' : {
      	  	  	 	'type' : '$_id.type',
      	  	  	 	'color' : '$_id.color' 	
      	  	  	},
      	  	  	'count' : {$sum : 1}
      	  	}
      	  	
      	}
      ])
      

      Results in:

      { "_id" : { "type" : "break", "color" : "blue" }, "count" : 1 }
      { "_id" : { "color" : "red" }, "count" : 1 }
      

      However, this aggregation, returns back a 'type' : null,

      db.test.aggregate([
      	{
      	  	$group : {
      	  	  	'_id' : {
      	  	  	 	'name' : '$name',
      	  	  	 	'type' : '$type',
      	  	  	 	'color' : '$color' 	
      	  	  	},
      	  	  	'count' : {$sum : 1}
      	  	}
      	  	
      	},
      	{
      	  	$group : {
      	  	  	'_id' : {
      	  	  	 	'type' : '$_id.type',
      	  	  	 	//'color' : '$_id.color' 	
      	  	  	},
      	  	  	'count' : {$sum : 1}
      	  	}
      	  	
      	}
      ])
      

      Results in:

      { "_id" : { "type" : "break" }, "count" : 1 }
      { "_id" : { "type" : null }, "count" : 1 }
      

      Note that the only difference was the comment out of the 'color' field in the second group.

      The results should be consistent between these two queries. Either the first results should look like:

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              backlog-query-optimization Backlog - Query Optimization
              Reporter:
              apara Alex Paransky
              Participants:
              Votes:
              1 Vote for this issue
              Watchers:
              17 Start watching this issue

                Dates

                Created:
                Updated: