COMPASS-6739

Investigate changes in SERVER-45032: Allow $planCacheStats to target every shardsvr node in a sharded cluster

    • Type: Investigation
    • Resolution: Done
    • Priority: Major - P3
    • No version
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Not Needed

      Original Downstream Change Summary

      The $planCacheStats aggregation stage has a new parameter, allHosts, which defaults to false.

      When allHosts is false, $planCacheStats behaves as before: it follows the read preference and retrieves plan cache data only from the targeted replica set node.
      This behavior is documented here: https://www.mongodb.com/docs/manual/reference/operator/aggregation/planCacheStats/#read-preference

      Setting allHosts to true is supported only in sharded clusters. If {$planCacheStats: {allHosts: true}} is called on a standalone or a bare replica set, an error is returned during pipeline parsing:
      https://github.com/mongodb/mongo/blob/master/src/mongo/db/pipeline/document_source_plan_cache_stats.cpp#L64

      When allHosts is true, mongos broadcasts $planCacheStats to all nodes, primaries and secondaries, of each affected shard (every shard that has at least one chunk of the target collection).
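As a rough illustration of the two modes described above, the pipelines might look like the sketch below. Only $planCacheStats and allHosts come from this ticket; the variable names and the $group stage are illustrative assumptions, and the grouping relies on the host field that $planCacheStats reports for each cache entry.

```javascript
// Default behavior: allHosts is omitted (false), so the stage follows the
// read preference and reads from a single targeted node.
const defaultPipeline = [
  { $planCacheStats: {} },
];

// Opt-in behavior (sharded clusters only, run through mongos): ask every
// node of each affected shard, then count cache entries per reporting host.
const allHostsPipeline = [
  { $planCacheStats: { allHosts: true } },
  { $group: { _id: "$host", entries: { $sum: 1 } } },
];
```

In mongosh connected to a mongos, either array could be passed to db.<collection>.aggregate(). Per the summary above, sending the second pipeline to a standalone or bare replica set is expected to fail at pipeline parsing.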

      It works only on sharded clusters because a bare replica set has no simple way to broadcast queries and then merge the cursors. This should change in 8.0.

      For mongosh, it may make sense to add an options argument to the PlanCache.list() function.
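One possible shape for such an options argument is sketched below. This is not the mongosh implementation: listPlanCache and its options parameter are hypothetical, and the only assumption about the collection object is that it exposes aggregate(pipeline).

```javascript
// Hypothetical helper mirroring what PlanCache.list(pipeline, options) could do.
// `options.allHosts` is forwarded to the $planCacheStats stage only when true,
// so older servers that do not know the parameter still see an empty stage.
function listPlanCache(collection, pipeline = [], options = {}) {
  const stageArgs = options.allHosts === true ? { allHosts: true } : {};
  return collection.aggregate([{ $planCacheStats: stageArgs }, ...pipeline]);
}
```

Omitting the parameter when it is false keeps the default call identical to what is sent today, which seems the safer choice for compatibility with servers that predate SERVER-45032.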

      Description of Linked Ticket

      In SERVER-44823, we improved $planCacheStats to be able to gather plan cache metadata from all of the shards in a sharded cluster. However, this operation uses the normal host targeting rules for selecting a single node in each shard. Unlike regular data, the plan cache metadata is not replicated and is local to each node. Rather than choose a single node from each shard, $planCacheStats should be capable of targeting every data-bearing node in the cluster—that is, every node in every shard, excluding the config servers.
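The difference between the two targeting modes can be pictured with a toy model. The sketch below is not how mongos selects hosts; the shard objects, the hasChunks flag, and the "first node" stand-in for read-preference selection are all assumptions made for illustration.

```javascript
// Toy model of host targeting for $planCacheStats.
// Each shard is { name, hasChunks, nodes: [host, ...] } (hypothetical shape).
function targetHosts(shards, allHosts = false) {
  const targets = [];
  for (const shard of shards) {
    if (!shard.hasChunks) continue; // only shards owning chunks are affected
    if (allHosts) {
      targets.push(...shard.nodes); // every node: primary and secondaries
    } else {
      targets.push(shard.nodes[0]); // stand-in for read-preference selection
    }
  }
  return targets;
}
```

Because plan cache metadata is local to each node, the single-node mode returns only a partial view per shard, while the all-hosts mode scales with the total number of data-bearing nodes.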

      Achieving this behavior may require some work in the underlying sharding infrastructure, since I'm not aware of any other pre-existing sharded operation that targets every node in the cluster. Also, this could be a very expensive operation for large sharded clusters, so we should consider having users opt into this behavior explicitly, perhaps with a new readPreference setting or with an explicit flag on the $planCacheStats operation.

      Note that SERVER-34633 tracks a very similar improvement for the $currentOp agg stage.

            Assignee: Unassigned
            Reporter: backlog-server-pm (Backlog - Core Eng Program Management Team)
            Votes: 0
            Watchers: 2