  1. Core Server
  2. SERVER-82928

Reconsider mongos approach of only incrementing global counters for stages when aggregation is successful

    • Query Integration
    • Fully Compatible

      The call to LiteParsedPipeline::tickGlobalStageCounters() depends on the status from running the pipeline being Status::OK(). This means the aggStageCounters won't be incremented if the aggregation pipeline errors. Not incrementing the aggStageCounters in mongos has made it challenging to identify which mongos a problematic aggregation (SERVER-82410) is being run on, which we had hoped would help identify the particular application client triggering it (although we suspect Compass). Additionally, it is odd for mongos and mongod to have different behaviors when it comes to system observability.

      There may be downsides to having mongos increment its aggStageCounters even when the aggregation pipeline errors, so it is worth the Query team discussing this internally and coming to a decision again.

      if (status.isOK()) {
          updateHostsTargetedMetrics(opCtx,
                                      namespaces.executionNss,
                                      cri ? boost::make_optional(cri->cm) : boost::none,
                                      involvedNamespaces);
          // Report usage statistics for each stage in the pipeline.
          liteParsedPipeline.tickGlobalStageCounters();
          // Add 'command' object to explain output.
          if (expCtx->explain) {
              explain_common::appendIfRoom(
                  aggregation_request_helper::serializeToCommandObj(request), "command", result);
          }
      }
      

      https://github.com/mongodb/mongo/blob/60a2f259be89aba194a2537037e464c96245b378/src/mongo/s/query/cluster_aggregate.cpp#L633
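
      To make the trade-off concrete, here is a minimal, self-contained sketch of the two policies under discussion. It does not use the real server code: Status, StageCounters, TickPolicy, tickStageCounters() and recordStageUsage() are all hypothetical stand-ins invented for illustration, and the sketch assumes one increment per stage occurrence in the pipeline.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for mongo::Status; it only models success vs. failure.
struct Status {
    bool ok;
    bool isOK() const { return ok; }
};

// Hypothetical stand-in for the global aggStageCounters.
using StageCounters = std::map<std::string, long long>;

// Hypothetical policy switch; this is not a real server parameter.
enum class TickPolicy { kOnlyOnSuccess, kAlways };

// Increments one counter per stage occurrence (an assumption of this sketch).
void tickStageCounters(const std::vector<std::string>& stages, StageCounters& counters) {
    for (const auto& stage : stages) {
        ++counters[stage];
    }
}

// Applies the chosen policy after the pipeline has run.
void recordStageUsage(const Status& status,
                      const std::vector<std::string>& stages,
                      TickPolicy policy,
                      StageCounters& counters) {
    if (policy == TickPolicy::kAlways || status.isOK()) {
        tickStageCounters(stages, counters);
    }
}

// Usage example: a failing pipeline leaves no trace under the current policy
// but is counted under the alternative.
inline void demo() {
    const std::vector<std::string> stages{"$match", "$group"};
    StageCounters current, proposed;
    recordStageUsage(Status{false}, stages, TickPolicy::kOnlyOnSuccess, current);
    recordStageUsage(Status{false}, stages, TickPolicy::kAlways, proposed);
    assert(current.empty());
    assert(proposed.at("$match") == 1);
}
```

      Under kOnlyOnSuccess, a pipeline that errors on every run never shows up in the counters, which is the observability gap described above. Under kAlways, the counters reflect attempted stage usage rather than successful usage, which may be one of the downsides worth discussing.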

            Assignee:
            vamsy.annabattula@mongodb.com Vamsy Annabattula
            Reporter:
            max.hirschhorn@mongodb.com Max Hirschhorn