[SERVER-43617] Add metrics on the mongos to indicate the number of shards targeted for the commands (find, aggregate, etc) Created: 25/Sep/19  Updated: 08/Jan/24  Resolved: 17/Dec/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.3.3, 4.0.25, 4.2.15

Type: Improvement Priority: Major - P3
Reporter: Linda Qin Assignee: Janna Golden
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Documented
is documented by DOCS-14455 Investigate changes in SERVER-43617: ... Closed
Duplicate
is duplicated by SERVER-37782 Would like server stats in mongos on ... Closed
Problem/Incident
Related
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.2, v4.0
Sprint: Sharding 2019-11-18, Sharding 2019-12-02, Sharding 2019-12-16, Sharding 2019-12-30
Participants:
Case:
Linked BF Score: 47

 Description   

Currently the mongos collects metrics such as the opcounters, which give us an idea of the operations coming in to the mongos. The mongos then dispatches those operations to the shards:

  • If the operation only targets one shard, then the mongos only needs to dispatch it to one shard.
  • However, if the operation is a scatter-gather operation or targets a subset of shards, then the mongos needs to dispatch the operation to multiple shards. In this case, the mongos does more work than it would for a single-shard operation, so there is more load on the mongos.

We have already added a metric for the number of sharded updates whose query contains only _id (SERVER-41184). It would be nice to also expose the number of shards targeted by each operation (find, aggregate, etc.), so that we have some insight into the number of outgoing operations from the mongos and can understand its outgoing workload.
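
To make the request concrete, here is a minimal sketch of the kind of bookkeeping described above: each dispatch is classified by how many shards it targeted, and a counter is kept per command. It is written in Python purely for illustration and is not the server implementation. The bucket names (oneShard, manyShards, allShards) and the TargetingMetrics class are assumptions made for this sketch.

```python
from collections import Counter

# Hypothetical bucket names, chosen for this sketch only.
ONE_SHARD = "oneShard"
MANY_SHARDS = "manyShards"
ALL_SHARDS = "allShards"


def classify_targeting(num_targeted: int, num_shards_total: int) -> str:
    """Map the number of shards an operation was sent to onto a bucket."""
    if num_targeted < 1 or num_targeted > num_shards_total:
        raise ValueError("num_targeted must be between 1 and num_shards_total")
    # Checked first, so a single-shard cluster always counts as oneShard.
    if num_targeted == 1:
        return ONE_SHARD
    if num_targeted == num_shards_total:
        return ALL_SHARDS
    return MANY_SHARDS


class TargetingMetrics:
    """Per-command counters of how many shards each dispatch targeted."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, command: str, num_targeted: int, num_shards_total: int) -> None:
        bucket = classify_targeting(num_targeted, num_shards_total)
        self._counts[(command, bucket)] += 1

    def report(self) -> dict:
        """Return the counters as a nested {command: {bucket: count}} dict."""
        out: dict = {}
        for (command, bucket), n in self._counts.items():
            out.setdefault(command, {})[bucket] = n
        return out
```

With counters like these, a high allShards or manyShards count for a command would point to scatter-gather traffic as a source of outgoing load on the mongos.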



 Comments   
Comment by Githook User [ 11/Jun/21 ]

Author:

{'name': 'Janna Golden', 'email': 'janna.golden@mongodb.com', 'username': 'jannaerin'}

Message: SERVER-43617 Add metrics on the mongos to indicate the number of shards targeted for CRUD and agg commands

(cherry picked from commit 1fe9dfb1ade548488831bf29cdcc57636e5e3b8a)
Branch: v4.2
https://github.com/mongodb/mongo/commit/c6533b269898f6cd077ffca7b2cc9545863313e9

Comment by Githook User [ 13/May/21 ]

Author:

{'name': 'Janna Golden', 'email': 'janna.golden@mongodb.com', 'username': 'jannaerin'}

Message: SERVER-43617 Add metrics on the mongos to indicate the number of shards targeted for CRUD and agg commands

(cherry picked from commit 1fe9dfb1ade548488831bf29cdcc57636e5e3b8a)
Branch: v4.0
https://github.com/mongodb/mongo/commit/9aef163066b2d4197d328fb112307c40f840a541

Comment by Githook User [ 19/Dec/19 ]

Author:

{'name': 'Janna Golden', 'email': 'janna.golden@mongodb.com', 'username': 'jannaerin'}

Message: SERVER-43617 Add tag to num_hosts_targeted_metrics.js
Branch: master
https://github.com/mongodb/mongo/commit/54ed086026aba498ae9d51a0fd5094cc5acf1852

Comment by Githook User [ 17/Dec/19 ]

Author:

{'name': 'Janna Golden', 'email': 'janna.golden@mongodb.com', 'username': 'jannaerin'}

Message: SERVER-43617 Add metrics on the mongos to indicate the number of shards targeted for CRUD and agg commands
Branch: master
https://github.com/mongodb/mongo/commit/1fe9dfb1ade548488831bf29cdcc57636e5e3b8a

Comment by Bruce Lucas (Inactive) [ 24/Oct/19 ]

I agree that it's good to have more data to pinpoint the offending queries, but we also have to think about the impact on ftdc. This takes two forms: volume of data, and schema changes (which drastically impact compression). We should take a look at any design from that perspective.
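
One way to limit the schema-change concern is to report every counter from the start, including the ones that are still zero, so that the set of keys never changes between samples. The following is a minimal Python sketch of that idea, not the actual design. The command and bucket names are assumptions made for illustration.

```python
# Hypothetical command and bucket names, fixed up front for this sketch.
COMMANDS = ("find", "insert", "update", "delete", "aggregate")
BUCKETS = ("oneShard", "manyShards", "allShards")


def empty_metrics() -> dict:
    """Build the full metrics document with every counter present at zero."""
    return {cmd: {bucket: 0 for bucket in BUCKETS} for cmd in COMMANDS}


def schema_of(doc: dict) -> tuple:
    """Return the ordered, flattened key paths of a metrics document."""
    return tuple(
        f"{cmd}.{bucket}" for cmd, buckets in doc.items() for bucket in buckets
    )


before = empty_metrics()
after = empty_metrics()
after["find"]["allShards"] += 1  # values change, keys do not

same_schema = schema_of(before) == schema_of(after)
```

Here same_schema stays true no matter which counters are incremented, whereas a design that adds keys lazily (for example, on the first scatter-gather find) would change the document shape at that point.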

Comment by Sheeri Cabral (Inactive) [ 24/Oct/19 ]

SGTM: linda.qin what do you think?

Comment by Sheeri Cabral (Inactive) [ 23/Oct/19 ]

In order to be meaningful, there would need to be some correlation between the number of shards targeted and the type of query, e.g. in the case specified, correlating "distinct" with multiple shard targets (and, in this specific comment on HELP-11396, correlating aggregate, distinct and find (actually "each operation") with the target # of shards).

I feel like if we just had "# shards vs. # of requests", customers (and we) would be able to deduce that there's a bunch of expensive queries targeting multiple shards, but we wouldn't be able to pinpoint which ones. Having "# shards targeted vs. # of requests for operation X" would be able to show us if operation X often targets one shard or many shards. It still doesn't pinpoint a specific query, but it seems better than just total requests on the mongos vs. total requests on the shards (so if shard requests are 2x mongos requests, we know that on average a query goes to 2 shards).
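
The difference between the two options can be shown with a little arithmetic. The sketch below is in Python and uses made-up numbers. The operation names and counts are illustrative assumptions, not measurements. It computes the average number of shards per request overall and per operation.

```python
def average_fanout(requests_on_mongos: int, requests_to_shards: int) -> float:
    """Average number of shard requests issued per request the mongos received."""
    if requests_on_mongos <= 0:
        raise ValueError("requests_on_mongos must be positive")
    return requests_to_shards / requests_on_mongos


def fanout_by_operation(counters: dict) -> dict:
    """counters maps an operation name to (received by mongos, sent to shards)."""
    return {
        op: average_fanout(received, sent)
        for op, (received, sent) in counters.items()
    }


# Illustrative numbers only: 900 single-shard finds and
# 100 distincts that each go to 11 shards.
sample = {"find": (900, 900), "distinct": (100, 1100)}

total_received = sum(received for received, _ in sample.values())
total_sent = sum(sent for _, sent in sample.values())

overall = average_fanout(total_received, total_sent)
per_operation = fanout_by_operation(sample)
```

In this example the overall ratio is 2.0, which only says that an average request goes to two shards. The per-operation view shows that find stays at 1.0 while distinct is at 11.0, which narrows the problem down to one operation type.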

Comment by Bruce Lucas (Inactive) [ 25/Sep/19 ]

Would it suffice to report the total number of requests dispatched to shards? This can be compared to the total number of requests processed to get an idea how much the node is being impacted by multi-target queries in the aggregate.

Generated at Thu Feb 08 05:03:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.