[SERVER-66495] Aggregations should be counted as Query ops, not Command ops Created: 16/May/22  Updated: 29/Oct/23  Resolved: 25/Jul/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Eric Milkie Assignee: Foteini Alvanaki
Resolution: Fixed Votes: 0
Labels: RDY, neweng, notify-analytics
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by TOOLS-3121 Investigate changes in SERVER-66495: ... Closed
is depended on by TOOLS-3135 Investigate changes in SERVER-66495: ... Closed
Documented
is documented by DOCS-15346 [Server] Investigate changes in SERVE... Closed
Problem/Incident
Related
related to SERVER-71281 opcounters.command on mongos counts i... Closed
related to SERVER-79297 Count and distinct should be counted ... Backlog
Assigned Teams:
Query Execution
Backwards Compatibility: Fully Compatible
Sprint: QE 2023-05-15, QE 2023-05-29, QE 2023-06-12, QE 2023-06-26, QE 2023-08-07
Participants:

 Description   

Currently, only the official Find Command increments the "Query" OpCounter from db/stats/counters.h. Any use of the aggregation pipeline command is only counted as a "Command", lumped together with all other operations that aren't Query, writes, or getmores.
Since it would be more useful to have all commands that perform query-like operations be counted as queries, we should start counting aggregations as queries.
(Note that we already count get-mores issued on cursors sourced either by aggregation or by finds, without distinction.)
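The accounting change described above can be sketched as a small model. This is an illustration only, not the server's code: `opcounter_bucket`, `tally`, and the `count_aggregate_as_query` flag are hypothetical names, and the real counters live in db/stats/counters.h.

```python
from collections import Counter

# Hypothetical model of opcounter classification; not the server's actual code.
WRITE_BUCKETS = {"insert": "insert", "update": "update", "delete": "delete"}


def opcounter_bucket(command_name: str, count_aggregate_as_query: bool) -> str:
    """Return the opcounter bucket that a command would increment."""
    if command_name == "find":
        return "query"
    if command_name == "getMore":
        # getMores are counted the same way for find and aggregate cursors.
        return "getmore"
    if command_name in WRITE_BUCKETS:
        return WRITE_BUCKETS[command_name]
    if command_name == "aggregate" and count_aggregate_as_query:
        # The behavior proposed in this ticket.
        return "query"
    # Everything else is lumped into "command".
    return "command"


def tally(commands, count_aggregate_as_query):
    """Count a workload into opcounter buckets."""
    counters = Counter()
    for name in commands:
        counters[opcounter_bucket(name, count_aggregate_as_query)] += 1
    return dict(counters)


workload = ["find", "aggregate", "aggregate", "getMore", "insert", "ping"]
before = tally(workload, count_aggregate_as_query=False)
after = tally(workload, count_aggregate_as_query=True)
```

In this model the two aggregations move from the "command" bucket to the "query" bucket, while the getMore and insert counts are unchanged.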



 Comments   
Comment by Foteini Alvanaki [ 26/Jul/23 ]

jonathan.balsano@mongodb.com that is correct. The only change is in the opCounter accounting. 

Comment by Githook User [ 25/Jul/23 ]

Author: Foteini Alvanaki <foteini.alvanaki@mongodb.com>

Message: SERVER-66495 Count aggregate command as query
Branch: master
https://github.com/mongodb/mongo/commit/33e62d960811768d042ae806e6d018b6c4fd830e

Comment by Eric Milkie [ 05/Jul/23 ]

Should we include count and distinct here along with aggregate? I think it makes sense to change all three commands to be counted as "query" instead of "command" (if they aren't already).

Comment by Rushan Chen [ 18/Oct/22 ]

The new hire won't be here until December. Will look then.

Comment by Kyle Suarez [ 07/Oct/22 ]

rushan.chen@mongodb.com, you said this would be for a new hire, so I am assigning this to you in the meantime.

Comment by Rushan Chen [ 30/Aug/22 ]

Keeping this for our new hire.

Comment by Asya Kamsky [ 19/Jul/22 ]

> Will an aggregation with a $out or $merge be counted as a write or a "query"?

Currently it updates the command opcounter (the proposal will change that to query), plus the update counter for $merge (because that's how it's implemented) and the insert counter for $out. No changes are proposed to the write side of things.
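A rough sketch of that accounting, under stated assumptions: `count_aggregate` and its parameters are hypothetical, and incrementing the write counter once per written document is an assumption made for illustration, not something this ticket confirms.

```python
from collections import Counter


def count_aggregate(pipeline, docs_written, aggregate_as_query=True):
    """Model which opcounters one aggregate command would touch.

    Assumption: the write counter grows by one per document written.
    """
    counters = Counter()
    # The aggregate command itself: "command" before the change, "query" after.
    counters["query" if aggregate_as_query else "command"] += 1

    # Each stage is a single-key dict such as {"$merge": {...}}.
    last_stage = next(iter(pipeline[-1])) if pipeline else None
    if last_stage == "$merge":
        counters["update"] += docs_written
    elif last_stage == "$out":
        counters["insert"] += docs_written
    return dict(counters)


merge_counts = count_aggregate([{"$match": {}}, {"$merge": {"into": "t"}}], 3)
out_counts = count_aggregate([{"$out": "t"}], 2, aggregate_as_query=False)
```

The write-side counters are the same with either setting of `aggregate_as_query`; only the bucket for the aggregate command itself differs.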

Comment by Kyle Suarez [ 19/Jul/22 ]

Personally, I would think that it should be counted as both a write and a query. But I'll defer to our venerable product team: CC christopher.harris@mongodb.com, kateryna.kamenieva@mongodb.com, joe.sack@mongodb.com

Comment by Kevin Arhelger [ 19/Jul/22 ]

I don't know of any current users of, or automated reporting on, these metrics in Technical Services. However, I did have one question.

Will an aggregation with a $out or $merge be counted as a write or a "query"?

Comment by Eric Milkie [ 17/May/22 ]

I just checked and I was incorrect about the histograms – they record latencies into "reads", "writes", and "commands" buckets, and the Aggregation pipeline command is hardcoded to go into the "reads" bucket (even if you use it to perform writes, apparently!)

Comment by Kyle Suarez [ 17/May/22 ]

Marked downstream changes as "needed" to request comments from other teams that might be affected.

Comment by Eric Milkie [ 17/May/22 ]

The Top command stats do indeed include readLock and writeLock counters in addition to the queries, getmore, insert, update, remove, and commands counters.
The OpCounters structure includes all the same counters except for readLock and writeLock. Interestingly, the Top stats are kept separately, per namespace, whereas the (duplicative) OpCounters are kept only "globally" and for "repl", and these are only used to power the respective Server Status sections. The Top stats are also used to calculate the Latency Histograms per namespace, and the histograms do not incorporate the readLock / writeLock counters. We are planning on exposing the histogram data soon, so I want to make sure we won't have to document the charts as "aggregation commands are in the commands chart and find commands are in the queries chart".
If we make any changes here, we should change both the Top stats and the OpCounters counters in kind.
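One way to picture "in kind" is a single classification function feeding both structures. The sketch below is a simplified model, not the server's data structures: `Stats`, `classify`, and `record` are hypothetical names, and the counter names follow the Top list quoted above.

```python
from collections import Counter, defaultdict


def classify(command_name: str) -> str:
    """Hypothetical shared classifier, with aggregate counted as a query."""
    if command_name in ("find", "aggregate"):
        return "queries"
    if command_name == "getMore":
        return "getmore"
    if command_name in ("insert", "update", "remove"):
        return command_name
    return "commands"


class Stats:
    def __init__(self):
        self.opcounters = Counter()      # kept globally
        self.top = defaultdict(Counter)  # kept per namespace

    def record(self, namespace: str, command_name: str) -> None:
        # Using one classifier for both keeps the two views consistent.
        bucket = classify(command_name)
        self.opcounters[bucket] += 1
        self.top[namespace][bucket] += 1


stats = Stats()
stats.record("db.orders", "aggregate")
stats.record("db.orders", "find")
stats.record("db.users", "aggregate")
stats.record("db.users", "ping")
```

Summing any bucket across all namespaces in `top` gives the same number as the global `opcounters` entry, which is the property a split classification would break.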

Comment by Bruce Lucas (Inactive) [ 17/May/22 ]

This seems logical, but it would be a substantial change in meaning for that counter, affecting for example Atlas users and diagnostic engineers. I think we shouldn't do this lightly.

Note by the way that there is an existing counter "reads" which I think counts both queries and aggregations equally.

Generated at Thu Feb 08 06:05:36 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.