Core Server / SERVER-101113

$unionWith subpipeline on time-series collections performs collscan instead of using indexes

    • Query Integration

      Came from HELP-71156. Full repro in comments. $unionWith correctly uses an index when run against a standard (non-time-series) collection: 

      db.waterMeasurements.createIndex(
          { "metadata.systemId": 1, "metadata.dataType": 1, "metadata.index": 1, "timestamp": 1 });

       

      // pipeline
      {
          "$unionWith": {
              "coll": "waterMeasurements",
              "pipeline": [
                  {
                      "$match": {
                          "metadata.systemId": 5578,
                          "metadata.dataType": "DHW_TANK_TEMPERATURE",
      ...
      // explain("executionStats") for the normal collection.
      {
          "$unionWith": {
              "coll": "waterMeasurements",
              "pipeline": [
                  {
                      "$cursor": {
                          "queryPlanner": {
                              "namespace": "test.waterMeasurements",
                              ....
                              "winningPlan": {
                                  ...
                                      "inputStage": {
                                          "stage": "FETCH",
                                          "inputStage": {
                                              "stage": "IXSCAN",
                                              "keyPattern": {
                                                  "metadata.systemId": 1,
                                                  "metadata.dataType": 1,
                                                  "metadata.index": 1,
                                                  "timestamp": 1
      ...

      By contrast, the same $unionWith query against a time-series collection performs a COLLSCAN, even though an equivalent index exists:

      db.waterMeasurementsTS.createIndex(
          { "metadata.systemId": 1, "metadata.dataType": 1, "metadata.index": 1, "timestamp": 1 });

      // explain("executionStats") for a time-series collection.
      {
          "$unionWith": {
              "coll": "system.buckets.waterMeasurementsTS",
              "pipeline": [
                  {
                      "$cursor": {
                          "queryPlanner": {
                              "namespace": "test.system.buckets.waterMeasurementsTS",
                              ...
                              "winningPlan": {
                                      ...
                                      "inputStage": {
                                          "stage": "COLLSCAN",

      The COLLSCAN should not be necessary, since the index we created covers the $match predicates. Alternatively, the automatically created compound index on the metaField and timeField could have been used to skip ineligible buckets.

      We should change this so that a $unionWith subpipeline on a time-series collection uses an existing index, which avoids scanning and unpacking buckets that cannot match.
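
      For checking a fix, a small helper along these lines could report which plan stages each $unionWith subpipeline ends up using. This is only a sketch: the function names are hypothetical, and it assumes the explain output has the nested shape shown in the excerpts above ("$unionWith" containing "pipeline", with a "winningPlan" tree of "stage" / "inputStage" nodes). Other server versions or query engines may lay the output out differently.

```javascript
// Collect the "stage" names that appear under any "winningPlan" key.
// Rejected plans are ignored because they live under a different key.
function winningPlanStages(node, insideWinningPlan = false, found = []) {
  if (Array.isArray(node)) {
    for (const item of node) winningPlanStages(item, insideWinningPlan, found);
  } else if (node !== null && typeof node === "object") {
    if (insideWinningPlan && typeof node.stage === "string") {
      found.push(node.stage);
    }
    for (const [key, value] of Object.entries(node)) {
      winningPlanStages(value, insideWinningPlan || key === "winningPlan", found);
    }
  }
  return found;
}

// Find every $unionWith in an explain document and list the winning-plan
// stages of its subpipeline, together with the namespace it ran against.
function unionWithStages(explainOutput) {
  const results = [];
  (function walk(node) {
    if (Array.isArray(node)) {
      node.forEach(walk);
      return;
    }
    if (node === null || typeof node !== "object") return;
    for (const [key, value] of Object.entries(node)) {
      if (key === "$unionWith" && value !== null && typeof value === "object") {
        results.push({ coll: value.coll, stages: winningPlanStages(value.pipeline) });
      } else {
        walk(value);
      }
    }
  })(explainOutput);
  return results;
}

// True if any $unionWith subpipeline in the explain output does a COLLSCAN.
function unionWithUsesCollscan(explainOutput) {
  return unionWithStages(explainOutput).some((u) => u.stages.includes("COLLSCAN"));
}
```

      In mongosh, the explain document could be obtained with something like db.someOuterColl.explain("executionStats").aggregate(pipeline), where someOuterColl is a placeholder for whichever collection the outer aggregation runs on. With the current behavior, unionWithUsesCollscan would be expected to return true for the time-series case and false for the standard collection.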

       

       

       

            Assignee:
            Unassigned
            Reporter:
            erin.liang@mongodb.com Erin Liang