Mongo DB aggregation Query Performance

XMLWordPrintableJSON

    • Server Triage
    • None
    • 3
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      Dear Team,

      I have document structure some thing like below

      // Some comments here
      { 
          "_id" : ObjectId("5ac05dc58ff6cd3054d5654c"), 
          "origin" : {
              "satellite" : [
                  "HuJpO", 
                  "rBQXN"
              ], 
              "code" : "0000", 
              "lat" : 0.10894239962556962, 
              "long" : 0.3182917328356123, 
              "state" : "7MJA8", 
              "mother" : "RNzTT"
          }, 
          "destination" : {
              "satellite" : [
                  "SFrLs", 
                  "PB22x"
              ], 
              "code" : "0001", 
              "lat" : 0.9059857126884281, 
              "long" : 0.47366986861781046, 
              "state" : "iGPdL", 
              "mother" : "rmL2i"
          }, 
          "currentOutput" : {
              "originprops" : {
                  "terminalType" : "7Vsdd", 
                  "region" : "Wq13p"
              }, 
              "destprops" : {
                  "terminalType" : "Lj9MU", 
                  "region" : "FsveA"
              }, 
              "metricsOnRoute" : {
                  "wtOnRoute" : 0.12848451356908464, 
                  "cubeOnRoute" : 0.36253653173815015, 
                  "packagesOnRoute" : 0.7116535335079031
              }, 
              "ftlMetricsOnRoute" : {
                  "wtOnRoute" : 0.2867262438925031, 
                  "cubeOnRoute" : 0.041862427079534026, 
                  "packagesOnRoute" : 0.1922009517509271
              }, 
              "loadPlan" : [
                  {
                      "origin" : {
                          "satellite" : [
                              "Eeu64", 
                              "UoBZg"
                          ], 
                          "code" : "0895", 
                          "lat" : 0.886633114334241, 
                          "long" : 0.7648130191577616, 
                          "state" : "e6YVK", 
                          "mother" : "seZRD", 
                          "region" : "YH3l0", 
                          "terminalType" : "6eRPl"
                      }, 
                      "destination" : {
                          "satellite" : [
                              "HsXLj", 
                              "3owI3"
                          ], 
                          "code" : "0582", 
                          "lat" : 0.2404112136392642, 
                          "long" : 0.4453750293428518, 
                          "state" : "24y3J", 
                          "mother" : "Al228", 
                          "region" : "BnOho", 
                          "terminalType" : "J5bzc"
                      }, 
                      "_id" : ObjectId("5ac05dc58ff6cd3054d56550")
                  }, 
                  {
                      "origin" : {
                          "satellite" : [
                              "dsP96", 
                              "kxB4x"
                          ], 
                          "code" : "0889", 
                          "lat" : 0.7451116916804306, 
                          "long" : 0.5827170158371271, 
                          "state" : "rlab1", 
                          "mother" : "kAdCk", 
                          "region" : "K4Fmv", 
                          "terminalType" : "1Rmq9"
                      }, 
                      "destination" : {
                          "satellite" : [
                              "f9Dsw", 
                              "nWtR1"
                          ], 
                          "code" : "0143", 
                          "lat" : 0.6918117740443803, 
                          "long" : 0.6534881444120195, 
                          "state" : "M8rJ9", 
                          "mother" : "SAIay", 
                          "region" : "gj2GK", 
                          "terminalType" : "WxFMb"
                      }, 
                      "_id" : ObjectId("5ac05dc58ff6cd3054d5654f")
                  }, 
                  {
                      "origin" : {
                          "satellite" : [
                              "OEMn4", 
                              "L06I7"
                          ], 
                          "code" : "0019", 
                          "lat" : 0.2928144478088912, 
                          "long" : 0.48588766114875814, 
                          "state" : "CiZfK", 
                          "mother" : "JhK2x", 
                          "region" : "30LxB", 
                          "terminalType" : "2T5h2"
                      }, 
                      "destination" : {
                          "satellite" : [
                              "MYlrI", 
                              "chBxa"
                          ], 
                          "code" : "0127", 
                          "lat" : 0.1475212251052762, 
                          "long" : 0.5700208781264593, 
                          "state" : "eq4Tz", 
                          "mother" : "VWGka", 
                          "region" : "4haP1", 
                          "terminalType" : "hPoJN"
                      }, 
                      "_id" : ObjectId("5ac05dc58ff6cd3054d5654e")
                  }, 
                  {
                      "origin" : {
                          "satellite" : [
                              "Dn5gk", 
                              "E8ioC"
                          ], 
                          "code" : "0261", 
                          "lat" : 0.5357177690737176, 
                          "long" : 0.24077439863344718, 
                          "state" : "ifZma", 
                          "mother" : "Qw3JQ", 
                          "region" : "535wA", 
                          "terminalType" : "OWDMq"
                      }, 
                      "destination" : {
                          "satellite" : [
                              "KMC4g", 
                              "TRM2g"
                          ], 
                          "code" : "0364", 
                          "lat" : 0.0853362441968799, 
                          "long" : 0.17485240248316103, 
                          "state" : "OT3OB", 
                          "mother" : "4ziG5", 
                          "region" : "r5QXn", 
                          "terminalType" : "BZr3o"
                      }, 
                      "_id" : ObjectId("5ac05dc58ff6cd3054d5654d")
                  }
              ], 
              "avgMiles" : 0.15093020854848138, 
              "avgDelay" : 0.8286008015547, 
              "avgEarlyArrival" : 0.49055776886519387, 
              "delayPenalityCost" : 0.36795929885604606, 
              "percentWrDelay" : 0.17503869241793435, 
              "percentWtOnRail" : 0.06687894944310369, 
              "numberOfRehandlings" : 0.7399530304098465, 
              "numberOfRelay" : 0.6613467970393607, 
              "loadPlanString" : "0895-0889-0019-0261-0364", 
              "dispatchPlanString" : "0895-0889-0019-0261-0364", 
              "loadPlanStringWithType" : "0895-0889-0019-0261-0364"
          }, 
          "scenarioId" : NumberInt(0), 
          "serviceType" : "ECON", 
          "__v" : NumberInt(0)
      }
      

      This collection could potentially hold millions of data per scenario.
      In my aggregation pipeline post i apply scenario filter potentially i can pass millions of data to next stage of the pipeline.

      currently we have a limit of 100MB and i need to allowdiskuse, and i get the result back in 100+ secs.

      My query looks like below.

      Unable to find source-code formatter for language: shell. Available languages are: actionscript, ada, applescript, bash, c, c#, c++, cpp, css, erlang, go, groovy, haskell, html, java, javascript, js, json, lua, none, nyan, objc, perl, php, python, r, rainbow, ruby, scala, sh, sql, swift, visualbasic, xml, yaml
      // Some comments here
      db.servicestats.aggregate([{
          $match: {
              scenarioId: { $in: [ 0, 1 ] }
          }
      }, 
      {
          $sort: { 
              'origin.code': 1,
              'destination.code': 1,
              serviceType: 1
          }
      },
       {
          $group: { 
              _id: {
                  originCode: '$origin.code',
                  destinationCode: '$destination.code',
                  serviceType: '$serviceType'
              },
              baseScenarioAvgMiles: {
                  $max: {
                      $cond: {
                          if: { $eq: [ '$scenarioId', 1 ] },
                          then: '$currentOutput.avgMiles',
                          else: 0
                      }
                  }
              },
              compareScenarioAvgMiles: {
                  $max: {
                      $cond: {
                          if: { $eq: [ '$scenarioId', 0 ] },
                          then: '$currentOutput.avgMiles',
                          else: 0
                      }
                  }
              }
          },
      },
       {
          $addFields: { // compute the difference
              diff: {
                  $subtract :[ '$baseScenarioAvgMiles', '$compareScenarioAvgMiles']
              }
          }
      }, 
      {
          $match: {
              'diff': { $gt: 0.5 }
          }
      }, 
      {
          $limit:100
      }],{allowDiskUse:false})
      

      Any suggestions so that i can improve my query performance? Currently i have indexes on my group condition and also on scenarioId.

      In the document i would need all fields as i can potentially show all columns on UI and filter on any them.

            Assignee:
            [HELP ONLY] Backlog - Triage Team
            Reporter:
            srivani kambhampati
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: