Core Server / SERVER-36403

Cluster aggregation error message should indicate which shard(s) raised an error

    • Backwards Compatibility: Fully Compatible
    • Sprint: Query 2018-12-17, Query 2018-12-31

      When a sharded aggregation throws, the error returned by mongos does not say where in the cluster it was generated. To test this, I wrote a simple $assert stage that always throws.

      mongos> db.runCommand({aggregate: "coll", cursor: {}, pipeline: [{$assert: 1}, {$match: {x: 1}}, {$group: {_id: "$x"}}]})
      {
              "ok" : 0,
              "errmsg" : "throwing from $assert",
              "code" : 50893,
              "codeName" : "Location50893",
              "operationTime" : Timestamp(1533156181, 222),
              "$clusterTime" : {
                      "clusterTime" : Timestamp(1533156243, 3),
                      "signature" : {
                              "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                              "keyId" : NumberLong(0)
                      }
              }
      }
      

      The error response has the same format if I force the assertion in the merging part of the pipeline instead:

      mongos> db.runCommand({aggregate: "coll", cursor: {}, pipeline: [{$match: {x: 1}}, {$group: {_id: "$x"}}, {$assert: 1}]})
      {
              "ok" : 0,
              "errmsg" : "throwing from $assert",
              "code" : 50893,
              "codeName" : "Location50893",
              "operationTime" : Timestamp(1533156181, 222),
              "$clusterTime" : {
                      "clusterTime" : Timestamp(1533156243, 3),
                      "signature" : {
                              "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                              "keyId" : NumberLong(0)
                      }
              }
      }
      

      We suspect the AsyncResultsMerger converts the AsyncRequestsSender::Response objects from each shard into a status and immediately throws if it's non-OK. However, this loses important information: the error could indicate which shard it came from. Throwing on the first failure also hides any other errors that might have been collected.
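
      The difference between throwing on the first non-OK status and reporting every failing shard can be sketched as follows. This is only an illustration in plain JavaScript, not the server's C++ code: the response shape ({shardId, status}), the function names, the shard names, and the message format are all hypothetical.

```javascript
// Hypothetical sketch; none of these names come from the MongoDB code base.
// Each response is modeled as:
//   {shardId: string, status: {ok: boolean, code?: number, errmsg?: string}}

// Suspected current behavior: throw the first non-OK status as-is,
// dropping the shard id and any later failures.
function firstErrorOnly(responses) {
    for (const response of responses) {
        if (!response.status.ok) {
            const err = new Error(response.status.errmsg);
            err.code = response.status.code;
            throw err;
        }
    }
}

// Alternative: gather every non-OK response and name each shard in the message.
// Returns null when all shards succeeded.
function collectShardErrors(responses) {
    const failures = responses.filter((r) => !r.status.ok);
    if (failures.length === 0) {
        return null;
    }
    const details = failures
        .map((f) => `${f.shardId}: ${f.status.errmsg} (code ${f.status.code})`)
        .join("; ");
    const err = new Error(`Error on shard(s) [${details}]`);
    // Assumption: reuse the first failure's code so existing callers still match on it.
    err.code = failures[0].status.code;
    err.shardIds = failures.map((f) => f.shardId);
    return err;
}

// Example input with made-up shard names; two of three shards fail.
const responses = [
    {shardId: "shard0", status: {ok: true}},
    {shardId: "shard1", status: {ok: false, code: 50893, errmsg: "throwing from $assert"}},
    {shardId: "shard2", status: {ok: false, code: 50893, errmsg: "throwing from $assert"}},
];

const err = collectShardErrors(responses);
console.log(err.message);
console.log(err.shardIds);
```

      With the first function, the caller sees only "throwing from $assert", as in the responses above. With the second, the message names shard1 and shard2, and the second failure is no longer hidden.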

      This has implications for the improved $out project: a failing sharded $out would not indicate where the failures occurred, making diagnosis harder.

            Assignee:
            vlad.rachev@mongodb.com Vlad Rachev (Inactive)
            Reporter:
            kyle.suarez@mongodb.com Kyle Suarez
            Votes:
            0
            Watchers:
            6
