[SERVER-21070] Add option to gather collection stats
Created: 22/Oct/15  Updated: 05/Feb/22  Resolved: 13/Dec/21

Status: Closed
Project: Core Server
Component/s: Diagnostics
Affects Version/s: 3.2.0
Fix Version/s: 5.3.0, 5.0.7

Type: New Feature
Priority: Major - P3
Reporter: Bruce Lucas (Inactive)
Assignee: Mark Benvenuto
Resolution: Done
Votes: 0
Labels: FTDC, SWDI, move-sec
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Backport Requested: v5.0
Sprint: Security 2021-09-06, Security 2021-09-20, Security 2021-10-04, Security 2021-10-18, Security 2021-11-01, Security 2021-11-15, Security 2021-11-29, Security 2021-12-13
Participants:

 Description   

To debug certain kinds of issues, it would be useful to gather stats() from arbitrary collections, along with serverStatus() etc., using the "Full Time Data Capture" system (see: SERVER-19585). This would be enabled and disabled by a run-time option.



 Comments   
Comment by Githook User [ 04/Feb/22 ]

Author: Mark Benvenuto <mark.benvenuto@mongodb.com> (markbenvenuto)

Message: SERVER-21070 Add option to FTDC to gather collection stats

(cherry picked from commit d2b21149224344ac9ebbed560caeecfa96eeb613)
Branch: v5.0
https://github.com/mongodb/mongo/commit/ee29d6bace72daae188b931668af2f7416c5c5e7

Comment by Githook User [ 13/Dec/21 ]

Author: Mark Benvenuto <mark.benvenuto@mongodb.com> (markbenvenuto)

Message: SERVER-21070 Add option to FTDC to gather collection stats
Branch: master
https://github.com/mongodb/mongo/commit/d2b21149224344ac9ebbed560caeecfa96eeb613

Comment by Judah Schvimer [ 25/Aug/21 ]

In talking with bruce.lucas about the mongos vs. mongod experience, we decided that we will error if a user tries to set the server parameter on mongos, and only allow it on mongods. For a sharded cluster the collStats data will exist in each mongod's FTDC for the data that mongod stores. If in the future we want to support collStats data in mongos FTDC we'll have to decide if and how to filter out the data, since the output of collStats on mongos can get quite large.

Comment by Judah Schvimer [ 24/Aug/21 ]

We'll model the command line parameter off of authenticationMechanisms, which accepts an array supplied as a comma-separated list. We'll have to test how a comma in a namespace behaves and decide what the acceptable behavior is.
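
The comma question can be made concrete with a minimal sketch. The helper name parseNamespaceList is hypothetical and this is not the server's parser; it only assumes a plain split on commas, which is one possible behavior for the command-line form.

```javascript
// Hypothetical helper, not the server's implementation: split a
// command-line value on commas to obtain a list of namespaces.
function parseNamespaceList(value) {
  return value
    .split(",")
    .map((ns) => ns.trim())
    .filter((ns) => ns.length > 0);
}

// Two ordinary namespaces parse as expected.
const simple = parseNamespaceList("db1.coll1,db2.coll2");

// A collection name that itself contains a comma is split in two,
// so the second piece is no longer a "db.coll" namespace.
const ambiguous = parseNamespaceList("db1.a,b");

console.log(simple, ambiguous);
```

Under this assumption a namespace containing a comma cannot be expressed on the command line, so it would have to be set through the array form instead.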

Comment by Bruce Lucas (Inactive) [ 24/Aug/21 ]

Consensus of the Triage team was that the setParameter version above seems more useful as 1) it interacts better with server management systems (e.g. Atlas, DSI, customer systems) and 2) often we are just investigating one collection anyway. It would be good if it were flexible and allowed an array for the setParameter and config file versions; for the command-line version something like a comma-separated list as you suggested seems fine.

Comment by Bruce Lucas (Inactive) [ 17/Aug/21 ]

Yes I think that would be ok. Not sure about the syntax for a list in command line (and yaml) either - is comma a permissible character in a collection name?

Another possibility might be a separate command:

db.adminCommand({enableFtdcCollectionStats: "db.coll"})
db.adminCommand({disableFtdcCollectionStats: "db.coll"})

This would make it a bit easier to enable or disable individual collections and would sidestep the issue of specifying lists, but the disadvantage is that you couldn't set it in the config file.

Comment by Judah Schvimer [ 17/Aug/21 ]

Would something like the following work (not sure of the exact array syntax in a command line setParameter)?

mongod --setParameter ftdcCollectionStatsNamespaces="db1.coll1","db2.coll2",...,"dbN.collN"
and
db.adminCommand( { setParameter: 1, ftdcCollectionStatsNamespaces: ["db1.coll1","db2.coll2",...,"dbN.collN"]  } )
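
As a rough sketch of the two forms above, the following builds both from a single list of namespaces. The parameter name is the one proposed in this comment and may differ from what eventually shipped; the helper names are hypothetical, and the comma-joined command-line form is an assumption rather than a confirmed syntax.

```javascript
// Parameter name as proposed in this comment; the shipped name may differ.
const PARAM = "ftdcCollectionStatsNamespaces";

// Split "db.coll" at the first dot. Database names cannot contain a dot,
// while collection names can (for example "oplog.rs").
function splitNamespace(ns) {
  const dot = ns.indexOf(".");
  if (dot <= 0 || dot === ns.length - 1) {
    throw new Error(`not a db.coll namespace: ${ns}`);
  }
  return { db: ns.slice(0, dot), coll: ns.slice(dot + 1) };
}

// Hypothetical helper: the document that would be passed to db.adminCommand().
function buildSetParameterCommand(namespaces) {
  namespaces.forEach(splitNamespace); // validate every entry first
  return { setParameter: 1, [PARAM]: namespaces };
}

// Hypothetical helper: the startup flag, assuming a comma-separated list.
function buildCommandLineFlag(namespaces) {
  namespaces.forEach(splitNamespace);
  return `--setParameter ${PARAM}=${namespaces.join(",")}`;
}

const namespaces = ["db1.coll1", "db2.coll2"];
console.log(buildSetParameterCommand(namespaces));
console.log(buildCommandLineFlag(namespaces));
```

Validating each entry before building either form keeps a malformed namespace from reaching the server in this sketch.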

Comment by Bruce Lucas (Inactive) [ 17/Aug/21 ]

The general idea is for FTDC to optionally run collStats on a (likely small) list of collections and include it in FTDC output, in a manner similar to the way it currently runs collStats on the oplog collection.

I think the machinery is already in place to do this since it's already done for the oplog, so it's mostly a matter of plumbing through the options. Ideally it would be runtime configurable because it will considerably inflate FTDC and could have some performance impact so most likely would only be enabled for the duration of a test or an issue under investigation. Conceptually this is a (probably short) list of namespaces to enable collection for; not sure if that list would be set by a single setParameter, or maybe better if collection for each namespace could be enabled or disabled by an individual setParameter.
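
Conceptually, the per-sample work described above might look like the following sketch. It is not the FTDC collector, which lives inside mongod; collectCollectionStats and the runCommand callback are hypothetical stand-ins used only to show one collStats command being issued per configured namespace.

```javascript
// Conceptual sketch only. `runCommand(db, cmd)` is a caller-supplied
// stand-in for executing a command against the named database.
function collectCollectionStats(namespaces, runCommand) {
  const sample = {};
  for (const ns of namespaces) {
    // Split at the first dot: database names cannot contain one.
    const dot = ns.indexOf(".");
    const db = ns.slice(0, dot);
    const coll = ns.slice(dot + 1);
    sample[ns] = runCommand(db, { collStats: coll });
  }
  return sample;
}

// Usage with a fake runner that echoes its inputs instead of contacting a server.
const fakeRunner = (db, cmd) => ({ db, coll: cmd.collStats, ok: 1 });
const sample = collectCollectionStats(["local.oplog.rs", "test.orders"], fakeRunner);
console.log(sample);
```

Because every extra namespace adds one more stats document per sample, a short list keeps the growth of the captured data modest.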

Is that enough detail for now?

Comment by Judah Schvimer [ 17/Aug/21 ]

bruce.lucas, can you please specify what this feature should look like, how it should be configured, what it should include, etc.?

Comment by Mark Benvenuto [ 22/Oct/15 ]

While there is definitely an opportunity to evolve this into a more general-purpose metric logging component, we do not have time to do this until a later release. We also have to decide what fits into the core of the product and what belongs external to it. If it goes into mongod, then we need to consider how to expose configuration of the system, etc. There are also unanswered questions around who and what consumes this information for users.

Comment by Bruce Lucas (Inactive) [ 22/Oct/15 ]

The idea is to collect, compress, and store this along with other ftdc data in $dbpath/diagnostics.data (sorry I wasn't clearer about that) for easy and space-efficient operation at a customer site, and an external tool wouldn't accomplish that. Also it seems like an easy add-on, leveraging the existing ftdc infrastructure. I would imagine that it would be manually configured separately for each node of a replica set and each shard; we could consider automatically propagating this across a cluster, but my initial thought is that's an unnecessary complication. It would be analyzed and visualized by the tooling we are developing for ftdc data. dan@10gen.com and mark.benvenuto I think will understand how this request relates to our new ftdc capabilities in 3.2.

Comment by Scott Hernandez (Inactive) [ 22/Oct/15 ]

It seems like anything that isn't system wide, and automatic, should really be an external tool, since manual configuration is needed either way.

Bruce, can you provide an example of how you expect this feature to surface and to be used? Also, how would it work with anything more than a single server, like in the context of replication and sharding?
