Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-18940

Optimise sharded aggregations that are targeted to a single shard

    XMLWordPrintable

    Details

    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Steps To Reproduce:
      Hide

      sh.enableSharding("test")
      sh.shardCollection("test.test", { a: 1, b: 1, _id: 1 } )
      sh.splitAt("test.test", { a: 0, b: 0, _id: MinKey } )
      

      Then observe that the following aggs are each targeted to just one shard, but still have an unnecessarily split pipeline:

      > db.test.aggregate([ { $match: { a: -1, b: -1 } } ], {explain: true})
      {
              "splitPipeline" : {
                      "shardsPart" : [
                              {
                                      "$match" : {
                                              "a" : -1,
                                              "b" : -1
                                      }
                              }
                      ],
                      "mergerPart" : [ ]
              },
              "shards" : {
                      "shard02" : {
                              "host" : "genique:20085",
                              "stages" : [
                                      {
                                              "$cursor" : {
                                                      "query" : {
                                                              "a" : -1,
                                                              "b" : -1
                                                      },
                                                      "queryPlanner" : {
                                                              "plannerVersion" : 1,
                                                              "namespace" : "test.test",
                                                              "indexFilterSet" : false,
                                                              "parsedQuery" : {
                                                                      "$and" : [
                                                                              {
                                                                                      "a" : {
                                                                                              "$eq" : -1
                                                                                      }
                                                                              },
                                                                              {
                                                                                      "b" : {
                                                                                              "$eq" : -1
                                                                                      }
                                                                              }
                                                                      ]
                                                              },
                                                              "winningPlan" : {
                                                                      "stage" : "FETCH",
                                                                      "inputStage" : {
                                                                              "stage" : "SHARDING_FILTER",
                                                                              "inputStage" : {
                                                                                      "stage" : "IXSCAN",
                                                                                      "keyPattern" : {
                                                                                              "a" : 1,
                                                                                              "b" : 1,
                                                                                              "_id" : 1
                                                                                      },
                                                                                      "indexName" : "a_1_b_1__id_1",
                                                                                      "isMultiKey" : false,
                                                                                      "indexVersion" : 1,
                                                                                      "direction" : "forward",
                                                                                      "indexBounds" : {
                                                                                              "a" : [
                                                                                                      "[-1.0, -1.0]"
                                                                                              ],
                                                                                              "b" : [
                                                                                                      "[-1.0, -1.0]"
                                                                                              ],
                                                                                              "_id" : [
                                                                                                      "[MinKey, MaxKey]"
                                                                                              ]
                                                                                      }
                                                                              }
                                                                      }
                                                              },
                                                              "rejectedPlans" : [ ]
                                                      }
                                              }
                                      }
                              ]
                      }
              },
              "ok" : 1
      }
      > db.test.aggregate([ { $match: { a: 1, b: 1 } } ], {explain: true})
      {
              "splitPipeline" : {
                      "shardsPart" : [
                              {
                                      "$match" : {
                                              "a" : 1,
                                              "b" : 1
                                      }
                              }
                      ],
                      "mergerPart" : [ ]
              },
              "shards" : {
                      "shard01" : {
                              "host" : "genique:20084",
                              "stages" : [
                                      {
                                              "$cursor" : {
                                                      "query" : {
                                                              "a" : 1,
                                                              "b" : 1
                                                      },
                                                      "queryPlanner" : {
                                                              "plannerVersion" : 1,
                                                              "namespace" : "test.test",
                                                              "indexFilterSet" : false,
                                                              "parsedQuery" : {
                                                                      "$and" : [
                                                                              {
                                                                                      "a" : {
                                                                                              "$eq" : 1
                                                                                      }
                                                                              },
                                                                              {
                                                                                      "b" : {
                                                                                              "$eq" : 1
                                                                                      }
                                                                              }
                                                                      ]
                                                              },
                                                              "winningPlan" : {
                                                                      "stage" : "FETCH",
                                                                      "inputStage" : {
                                                                              "stage" : "SHARDING_FILTER",
                                                                              "inputStage" : {
                                                                                      "stage" : "IXSCAN",
                                                                                      "keyPattern" : {
                                                                                              "a" : 1,
                                                                                              "b" : 1,
                                                                                              "_id" : 1
                                                                                      },
                                                                                      "indexName" : "a_1_b_1__id_1",
                                                                                      "isMultiKey" : false,
                                                                                      "indexVersion" : 1,
                                                                                      "direction" : "forward",
                                                                                      "indexBounds" : {
                                                                                              "a" : [
                                                                                                      "[1.0, 1.0]"
                                                                                              ],
                                                                                              "b" : [
                                                                                                      "[1.0, 1.0]"
                                                                                              ],
                                                                                              "_id" : [
                                                                                                      "[MinKey, MaxKey]"
                                                                                              ]
                                                                                      }
                                                                              }
                                                                      }
                                                              },
                                                              "rejectedPlans" : [ ]
                                                      }
                                              }
                                      }
                              ]
                      }
              },
              "ok" : 1
      }
      

      and indeed, if the aggs are run without explain, $mergeCursors aggs can be observed in verbose logs on the primary shard.

      Show
      sh.enableSharding("test") sh.shardCollection("test.test", { a: 1, b: 1, _id: 1 } ) sh.splitAt("test.test", { a: 0, b: 0, _id: MinKey } ) Then observe that the following aggs are each targeted to just one shard, but still have an unnecessarily split pipeline: > db.test.aggregate([ { $match: { a: -1, b: -1 } } ], {explain: true}) { "splitPipeline" : { "shardsPart" : [ { "$match" : { "a" : -1, "b" : -1 } } ], "mergerPart" : [ ] }, "shards" : { "shard02" : { "host" : "genique:20085", "stages" : [ { "$cursor" : { "query" : { "a" : -1, "b" : -1 }, "queryPlanner" : { "plannerVersion" : 1, "namespace" : "test.test", "indexFilterSet" : false, "parsedQuery" : { "$and" : [ { "a" : { "$eq" : -1 } }, { "b" : { "$eq" : -1 } } ] }, "winningPlan" : { "stage" : "FETCH", "inputStage" : { "stage" : "SHARDING_FILTER", "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "a" : 1, "b" : 1, "_id" : 1 }, "indexName" : "a_1_b_1__id_1", "isMultiKey" : false, "indexVersion" : 1, "direction" : "forward", "indexBounds" : { "a" : [ "[-1.0, -1.0]" ], "b" : [ "[-1.0, -1.0]" ], "_id" : [ "[MinKey, MaxKey]" ] } } } }, "rejectedPlans" : [ ] } } } ] } }, "ok" : 1 } > db.test.aggregate([ { $match: { a: 1, b: 1 } } ], {explain: true}) { "splitPipeline" : { "shardsPart" : [ { "$match" : { "a" : 1, "b" : 1 } } ], "mergerPart" : [ ] }, "shards" : { "shard01" : { "host" : "genique:20084", "stages" : [ { "$cursor" : { "query" : { "a" : 1, "b" : 1 }, "queryPlanner" : { "plannerVersion" : 1, "namespace" : "test.test", "indexFilterSet" : false, "parsedQuery" : { "$and" : [ { "a" : { "$eq" : 1 } }, { "b" : { "$eq" : 1 } } ] }, "winningPlan" : { "stage" : "FETCH", "inputStage" : { "stage" : "SHARDING_FILTER", "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "a" : 1, "b" : 1, "_id" : 1 }, "indexName" : "a_1_b_1__id_1", "isMultiKey" : false, "indexVersion" : 1, "direction" : "forward", "indexBounds" : { "a" : [ "[1.0, 1.0]" ], "b" : [ "[1.0, 1.0]" ], "_id" : [ "[MinKey, MaxKey]" ] } } } }, "rejectedPlans" : [ ] } } } ] } }, "ok" : 1 } and indeed, if the aggs are run without explain, $mergeCursors aggs can be observed in verbose logs on the primary shard.
    • Epic Link:
    • Sprint:
      Query 2017-07-10, Query 2017-07-31, Query 2017-08-21

      Description

      The fix implemented on SERVER-7656 doesn't fully address the issue, because it only works with an exact $match on the full shard key. However, it is possible for an aggregation to be directed to a single shard even without this. For example:

      1. a range query that is wholly contained within a single chunk
      2. a prefix match on the shard key, where the trailing fields are wholly contained within a single chunk

      This means that an aggregation may be perfectly targeted to a single shard, but will not be passed through to that shard.

      Case #2 is particularly problematic when _id is appended to the shard key (to guarantee granularity), but is never actually used in queries/aggs (and the rest of the shard key is usually/often enough to target just one shard).

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              bernard.gorman Bernard Gorman
              Reporter:
              kevin.pulo Kevin Pulo
              Participants:
              Votes:
              0 Vote for this issue
              Watchers:
              15 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: