Core Server / SERVER-54769

A pipeline with a $sort and $project against a time-series collection leaves new project in the pipeline

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Querying
    • Query Optimization
    • ALL

      To debug this, I stepped through the JavaScript commands in timeseries_bucket_limit_count.js up to line 51.

      This can also be reproduced by running an aggregation pipeline directly against the system.buckets.X collection:

       db.system.buckets.timeseries_bucket_limit_count_1.aggregate([ { "$_internalUnpackBucket" : { "timeField" : "time", "exclude" : [ ] } }, { "$sort" : { "_id" : 1 } }, { "$project" : { "x" : 1 } } ])
      

      The explain for the query above will help you see the redundant $project.


      Here's an example where two $project stages appear in the explain output of an aggregation pipeline against a time-series view.

      > coll.explain().aggregate([ { "$sort" : { "_id" : 1 } }, { "$project" : { "x" : 1 } } ])
      {
      	"explainVersion" : "1",
      	"stages" : [
      		{
      			"$cursor" : {
      				"queryPlanner" : {
      					"namespace" : "test.system.buckets.timeseries_bucket_limit_count_1",
      					"indexFilterSet" : false,
      					"parsedQuery" : {
      					},
      					"queryHash" : "8B3D4AB8",
      					"planCacheKey" : "8B3D4AB8",
      					"maxIndexedOrSolutionsReached" : false,
      					"maxIndexedAndSolutionsReached" : false,
      					"maxScansToExplodeReached" : false,
      					"winningPlan" : {
      						"stage" : "COLLSCAN",
      						"direction" : "forward"
      					},
      					"rejectedPlans" : [ ]
      				}
      			}
      		},
      		{
      			"$_internalUnpackBucket" : {
      				"include" : [
      					"_id",
      					"x"
      				],
      				"timeField" : "time"
      			}
      		},
      		{
      			"$project" : {
      				"_id" : true,
      				"x" : true
      			}
      		},
      		{
      			"$sort" : {
      				"sortKey" : {
      					"_id" : 1
      				}
      			}
      		},
      		{
      			"$project" : {
      				"_id" : true,
      				"x" : true
      			}
      		}
      	],
      	"serverInfo" : {
      		"host" : "cedar",
      		"port" : 27017,
      		"version" : "4.9.0-alpha4-383-g142e95a",
      		"gitVersion" : "142e95a1dc2a2ba9c90ea610c30ee697e880e967"
      	},
      	"command" : {
      		"aggregate" : "system.buckets.timeseries_bucket_limit_count_1",
      		"pipeline" : [
      			{
      				"$_internalUnpackBucket" : {
      					"timeField" : "time",
      					"exclude" : [ ]
      				}
      			},
      			{
      				"$sort" : {
      					"_id" : 1
      				}
      			},
      			{
      				"$project" : {
      					"x" : 1
      				}
      			}
      		],
      		"cursor" : {
      		},
      		"collation" : {
      			"locale" : "simple"
      		}
      	},
      	"ok" : 1
      }
      

      The new $project, which appears first (ahead of the $sort), is internalized into $_internalUnpackBucket as its "include" list but isn't removed from the pipeline.
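
      The check described above can be sketched as a small script. This is only an illustration: findRedundantProjects is a hypothetical helper, not a server or shell API, and it assumes the $project uses plain top-level inclusions (true or 1) with no nested paths or computed fields. It flags any $project that directly follows a $_internalUnpackBucket and keeps exactly the fields in that stage's "include" list.

```javascript
// Hypothetical helper (not part of the server or the shell): given the
// "stages" array of an aggregate explain, return the indexes of $project
// stages that directly follow $_internalUnpackBucket and keep exactly the
// fields already listed in its "include" array.
// Assumption: only simple top-level inclusion projections are considered.
function findRedundantProjects(stages) {
  const redundant = [];
  for (let i = 1; i < stages.length; i++) {
    const unpack = stages[i - 1]["$_internalUnpackBucket"];
    const project = stages[i]["$project"];
    if (!unpack || !project || !Array.isArray(unpack.include)) continue;

    // Fields the $project keeps (inclusion values are true or 1).
    const allFields = Object.keys(project);
    const kept = allFields
      .filter((f) => project[f] === true || project[f] === 1)
      .sort();
    const included = [...unpack.include].sort();

    const sameFields =
      kept.length === allFields.length &&
      kept.length === included.length &&
      kept.every((f, j) => f === included[j]);
    if (sameFields) redundant.push(i);
  }
  return redundant;
}

// Stage list trimmed from the explain output in this ticket.
const stages = [
  { $cursor: {} },
  { $_internalUnpackBucket: { include: ["_id", "x"], timeField: "time" } },
  { $project: { _id: true, x: true } },
  { $sort: { sortKey: { _id: 1 } } },
  { $project: { _id: true, x: true } },
];

console.log(findRedundantProjects(stages)); // prints [ 2 ]
```

      Only the stage at index 2 is reported; the trailing $project after the $sort is the one the user wrote and is not adjacent to the unpack stage.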

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            eric.cox@mongodb.com Eric Cox (Inactive)
            Votes:
            0
            Watchers:
            3

              Created:
              Updated:
              Resolved: