[SERVER-45032] Allow $planCacheStats to target every shardsvr node in a sharded cluster
Created: 09/Dec/19  Updated: 29/Oct/23  Resolved: 19/Apr/23

Status: Closed
Project: Core Server
Component/s: Aggregation Framework, Sharding
Affects Version/s: None
Fix Version/s: 7.1.0-rc0

Type: Improvement
Priority: Major - P3
Reporter: David Storch
Assignee: Ivan Fefer
Resolution: Fixed
Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-73557 Add ability for mongos to broadcast t... Closed
is depended on by COMPASS-6739 Investigate changes in SERVER-45032: ... Closed
Documented
is documented by DOCS-16051 Investigate changes in SERVER-45032: ... Closed
Related
related to SERVER-51117 report index information on a per-sha... Closed
is related to SERVER-45658 Allow $planCacheStats to target a spe... Backlog
is related to SERVER-34633 Allow $currentOp to retrieve operatio... Closed
is related to SERVER-44823 Sharding support for $planCacheStats Closed
Assigned Teams:
Query Execution
Backwards Compatibility: Fully Compatible
Participants:

 Description   

In SERVER-44823, we improved $planCacheStats to be able to gather plan cache metadata from all of the shards in a sharded cluster. However, this operation uses the normal host targeting rules for selecting a single node in each shard. Unlike regular data, the plan cache metadata is not replicated and is local to each node. Rather than choose a single node from each shard, $planCacheStats should be capable of targeting every data-bearing node in the cluster—that is, every node in every shard, excluding the config servers.
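Because plan cache metadata is local to each node, results gathered from more than one node are easiest to read when grouped by their origin. The sketch below is illustrative only: it assumes each result document carries `shard` and `host` fields (as $planCacheStats output does when run through mongos), and the sample documents are hand-written placeholders, not real server output.

```python
from collections import defaultdict


def group_by_node(entries):
    """Group $planCacheStats result documents by their (shard, host) origin."""
    grouped = defaultdict(list)
    for entry in entries:
        # Assumption: every entry reports the shard and host it came from.
        key = (entry.get("shard"), entry.get("host"))
        grouped[key].append(entry)
    return dict(grouped)


# Hypothetical result documents; host names and keys are made up.
sample = [
    {"shard": "shard0", "host": "node-a:27018", "planCacheKey": "A1"},
    {"shard": "shard0", "host": "node-b:27018", "planCacheKey": "B7"},
    {"shard": "shard0", "host": "node-a:27018", "planCacheKey": "A2"},
]

per_node = group_by_node(sample)
for (shard, host), docs in sorted(per_node.items()):
    print(shard, host, len(docs))
```

Two nodes of the same shard can legitimately report different entries, which is why targeting a single node per shard gives an incomplete picture.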

Achieving this behavior may require some work in the underlying sharding infrastructure, since I'm not aware of any other pre-existing sharded operation that targets every node in the cluster. Also, this could be a very expensive operation for large sharded clusters, so we should consider having users opt into this behavior explicitly, perhaps with a new readPreference setting or with an explicit flag on the $planCacheStats operation.

Note that SERVER-34633 tracks a very similar improvement for the $currentOp agg stage.



 Comments   
Comment by Githook User [ 19/Apr/23 ]

Author: Ivan Fefer <ivan.fefer@mongodb.com> (username: Fefer-Ivan)

Message: SERVER-45032 Add allHosts option to $planCacheStats to get info from all hosts in a shard
Branch: master
https://github.com/mongodb/mongo/commit/6fba90220e41744b65979950c01dea69fe39bb7c
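
For illustration, the opt-in described in the commit message could be exercised with a pipeline like the one below. This is a minimal sketch, not taken from the commit: it assumes the option is passed as `allHosts: true` inside the stage document, and the helper name `plan_cache_stats_pipeline` is hypothetical.

```python
def plan_cache_stats_pipeline(all_hosts=False):
    """Build an aggregation pipeline whose only stage is $planCacheStats.

    Assumption: the new option is spelled `allHosts` and takes a boolean.
    When all_hosts is False the stage document is left empty, which keeps
    the pre-existing behaviour of targeting one node per shard.
    """
    stage = {"allHosts": True} if all_hosts else {}
    return [{"$planCacheStats": stage}]


default_pipeline = plan_cache_stats_pipeline()
all_hosts_pipeline = plan_cache_stats_pipeline(all_hosts=True)
print(all_hosts_pipeline)

# With a driver such as PyMongo, the pipeline would be passed to
# Collection.aggregate(), for example (collection name is hypothetical):
#   results = list(db.orders.aggregate(all_hosts_pipeline))
```

Keeping the flag off by default matches the concern in the description that contacting every node may be expensive on large clusters.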

Comment by Eric Milkie [ 14/Apr/23 ]

And yes, I think the only way to implement this for every node in a sharded cluster would be to implement it for replica sets. This will become moot once we make all replica sets single shards, anyway.

Comment by Eric Milkie [ 14/Apr/23 ]

Actually I think that section of the docs is a disclaimer, warning users about an unexpected behavior.
In practice, it would be difficult and cumbersome to fetch the cache stats from every node just using read preference and replica set tags.
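
To make the "cumbersome" point concrete, here is a rough sketch of what the read-preference workaround would involve. It assumes an operator has already given every replica set member a unique tag (the tag key `nodeId` and the values below are hypothetical), and it only builds the read preference documents; a separate $planCacheStats aggregation would still have to be issued with each one.

```python
def read_preferences_for_members(member_tags, tag_key="nodeId"):
    """Return one read preference document per uniquely tagged member.

    Assumption: each member carries a distinct value under `tag_key`, so a
    single-entry tag set matches exactly one node.
    """
    return [
        {"mode": "nearest", "tags": [{tag_key: value}]}
        for value in member_tags
    ]


# Hypothetical tag values for a three-member replica set.
prefs = read_preferences_for_members(["rs0-a", "rs0-b", "rs0-c"])
for pref in prefs:
    print(pref)
```

The caller must know the full member list up front, keep the tags in sync with the topology, and run one query per member, which is the overhead a built-in option avoids.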

Comment by Ivan Fefer [ 14/Apr/23 ]

In the docs, we state that users should use readPreference to get the plan cache from different members of a replica set: https://www.mongodb.com/docs/manual/reference/operator/aggregation/planCacheStats/#read-preference

Comment by Ivan Fefer [ 14/Apr/23 ]

Should this ticket also affect $planCacheStats for replica sets?

Comment by Davis Haupt (Inactive) [ 10/Feb/23 ]

Flagging for scheduling: now that SERVER-73557 is complete, this will likely be a good quick-win nomination.

Comment by Eric Milkie [ 09/Aug/22 ]

Note that $indexStats has the same issue and could benefit from this work as well.

Generated at Thu Feb 08 05:07:41 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.