[SERVER-44823] Sharding support for $planCacheStats

| Created: | 25/Nov/19 | Updated: | 29/Oct/23 | Resolved: | 09/Dec/19 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.3 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | David Storch | Assignee: | David Storch |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | qexec-team |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Query 2019-12-16 |
| Participants: | |
| Description |
Currently, an aggregate operation which reads the plan cache as a "virtual collection" using $planCacheStats is not supported when connected to mongos:
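As an illustrative sketch (using the testDb.source namespace from the examples below), this is the kind of command that a mongos rejects today:

```js
// Connected to a mongos. Before this change, the $planCacheStats stage
// is rejected with an error when the aggregate is issued via mongos.
db.getSiblingDB("testDb").source.aggregate([{$planCacheStats: {}}])
```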
Instead, users are expected to connect directly to the mongod of interest in order to examine that node's plan cache. In some environments (e.g. Atlas), it may be difficult or discouraged to connect directly to a shardsvr. Furthermore, some users may wish to examine the plan caches on all nodes before drilling down into particular nodes of interest. Therefore, we should add support for $planCacheStats issued via a mongos.

The most sensible behavior for such an operation would be to return the union of the plan cache entries from every shardsvr node in the cluster (as opposed to obeying the read preference and returning the plan caches for a particular node in each shard). This may require some work in the sharding infrastructure to allow an aggregate operation to target every node, since the current infrastructure typically assumes that at most one host in each shard is targeted.

Finally, in order to allow users to filter, sort, group, etc. based on the host, we should augment each plan cache entry document in the result set with host:port information in the case of a sharded $planCacheStats, which enables post-processing like the sketch below.
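As an illustration of the post-processing this enables (a sketch only, assuming the proposed "host" field is present on each result document), entries could be counted per data-bearing node:

```js
// Sketch: group plan cache entries by the proposed "host" field
// to see how many cached plans each node holds.
db.getSiblingDB("testDb").source.aggregate([
  {$planCacheStats: {}},
  {$group: {_id: "$host", numEntries: {$sum: 1}}},
  {$sort: {_id: 1}}
])
```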
| Comments |
| Comment by Githook User [ 09/Dec/19 ] |
Author: David Storch <david.storch@mongodb.com> (dstorch)

Message: SERVER-44823 When a $planCacheStats pipeline is delivered to a mongos, it now targets a single host in each shard, selected according to the normal read preference rules, and returns the union of the shards' plan cache entries, allowing clients to collect plan cache information from the whole cluster through a single mongos connection.
| Comment by David Storch [ 09/Dec/19 ] |
The changes planned for this ticket will allow $planCacheStats to be issued against a mongos, but will not implement behavior in which the mongos targets every data-bearing node in the cluster. Instead, mongos will target a single host from every shard using the normal read preference rules for host selection. I have filed a separate ticket to track the every-node targeting behavior.

This change adds a new "host" field to every document returned by $planCacheStats, which will contain the "host:port" string of the mongod from which the cache entry document originated. This makes it easy for users to understand which node's cache they are reading when connected to a replica set, and it also disambiguates cache entries which may have come from different hosts when connected to a mongos.

When the $planCacheStats operation is run through a mongos, each cache entry document will additionally contain a "shard" field which contains the name of the shard from which the document originated. This can be used in a similar fashion to "host". For example, $planCacheStats queries can use MQL to sort, group, or filter the results by shard name.

The following example demonstrates what $planCacheStats output looks like in a sharded scenario. First, I created a collection testDb.source which has chunks on two shards (a sketch of the setup follows):
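A minimal sketch of such a setup; the shard key, split point, and shard names here are hypothetical stand-ins for the original session:

```js
// Shard the collection and place one chunk on each of two shards.
// Shard names and the split point are illustrative.
sh.enableSharding("testDb")
sh.shardCollection("testDb.source", {_id: 1})
sh.splitAt("testDb.source", {_id: 0})
sh.moveChunk("testDb.source", {_id: -1}, "shard01")
sh.moveChunk("testDb.source", {_id: 1}, "shard02")
```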
Next, I created indexes and ran a query in order to produce a plan cache entry on the primary node of each shard:
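A sketch of what those steps might look like; the index keys and query predicate are hypothetical:

```js
// Two candidate indexes force multi-planning, which creates a plan cache entry.
db.getSiblingDB("testDb").source.createIndex({a: 1})
db.getSiblingDB("testDb").source.createIndex({a: 1, b: 1})
// A query spanning both chunks runs on each shard's primary,
// producing a cache entry on both shards.
db.getSiblingDB("testDb").source.find({a: {$gte: -5, $lte: 5}}).itcount()
```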
Finally, I ran a query that returns the plan cache entries from the primary node of both shards. In particular, note the values of the "host" and "shard" fields for each of the two result documents:
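An abridged, illustrative sketch of that query and its output; host names, ports, and shard names are hypothetical, and most per-entry fields are elided:

```js
db.getSiblingDB("testDb").source.aggregate([
  {$planCacheStats: {}},
  {$project: {host: 1, shard: 1, isActive: 1, works: 1}}
])
// Illustrative output: one cache entry from each shard's primary,
// distinguished by the new "host" and "shard" fields.
// { "host" : "node1.example.com:27018", "shard" : "shard01", "isActive" : true, "works" : NumberLong(5) }
// { "host" : "node2.example.com:27018", "shard" : "shard02", "isActive" : true, "works" : NumberLong(5) }
```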
| Comment by Kevin Pulo [ 26/Nov/19 ] |
$currentOp could also benefit from being able to target all data-bearing members (see the linked issue).