[SERVER-73524] Report a histogram of error codes rather than just an error counter in serverStatus Created: 01/Feb/23 Updated: 16/Jan/24 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Charlie Swanson | Assignee: | Backlog - Service Architecture |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Service Arch |
| Participants: | |
| Description |
|
The idea is to replace something like
with something more like
We could do that both for this top level "asserts" section:
But also for the "commands" section, or anywhere else we accumulate errors:
This would give more insight into what kinds of things are going wrong. For example, we could tell if a particular error code becomes much more frequent after upgrading to a new version. |
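As a rough illustration of the upgrade scenario above, the sketch below compares two per-code error histograms captured at different times and reports which codes increased. It is not server code: the function name `histogram_delta`, the flat name-to-count shape of the histograms, and all the counts are assumptions made up for this example.

```python
from collections import Counter


def histogram_delta(before, after):
    """Return the per-code increase between two error histograms.

    Both arguments are assumed to be plain dicts mapping an error
    name to a cumulative count (a hypothetical shape, for illustration).
    Codes whose count did not grow are left out of the result.
    """
    delta = Counter(after)
    delta.subtract(before)  # codes missing from `after` end up negative
    return {code: n for code, n in delta.items() if n > 0}


# Made-up snapshots taken before and after an upgrade.
before_upgrade = {"NamespaceNotFound": 12, "DuplicateKey": 40}
after_upgrade = {"NamespaceNotFound": 13, "DuplicateKey": 40, "WriteConflict": 250}

increase = histogram_delta(before_upgrade, after_upgrade)
for code, n in sorted(increase.items(), key=lambda item: item[1], reverse=True):
    print(f"{code}: +{n}")
```

With a single overall error counter, both snapshots would only show that the total went up; the per-code breakdown is what points at the specific error that started appearing.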
| Comments |
| Comment by Shameek Ray [ 28/Feb/23 ] |
|
Thanks louis.williams@mongodb.com. I'm not sure this needs to be a high-priority addition to our backlog; perhaps medium priority in the quick-wins mix. I defer to jason.chan@mongodb.com / blake.oler@mongodb.com for further thoughts. |
| Comment by Louis Williams [ 09/Feb/23 ] |
|
shameek.ray@mongodb.com, I'm not sure, really. We plan on adding specific counters for specific index build errors, which is a trivial amount of work. Index builds are also very special in the code, and I'm not even sure if this proposal would account for "internal" operations like index builds, or if it would cover only user operations. |
| Comment by Shameek Ray [ 08/Feb/23 ] |
|
louis.williams@mongodb.com - does the current Graceful Handling of Index Builds project add any new metrics as described in this ticket? If not, would such a histogram of error codes be even more useful upon the completion of Graceful Handling of Index Builds? blake.oler@mongodb.com / jason.chan@mongodb.com - this seems like a good ticket to include in the quick-wins mix. |
| Comment by Eric Sedor [ 02/Feb/23 ] |
|
We discussed this with Bruce and would add: please only add itemized errors when they occur. While this does result in a variable-size document, the total number of error codes is low enough, and the number of unique errors on a given deployment is low enough, that we are not concerned about schema changes affecting retention. |
| Comment by Bruce Lucas (Inactive) [ 02/Feb/23 ] |
|
This looks useful, but from a diagnostic perspective I wonder if it might be better to keep the existing overall counters and add a new section alongside them, e.g. errorTypes, failureTypes, or assertTypes, containing a document like the one you describe above with the counts broken out by type. Also, for usability, why not use the readable names as the keys, e.g. "NamespaceNotFound", instead of the numbers? |
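A minimal sketch of that shape, assuming the overall counter is kept and a separate section keyed by readable error names is reported next to it. The class `ErrorMetrics`, its methods, and the `errors` key are hypothetical names for illustration only; `errorTypes` is one of the section names floated above. An entry is created only once an error has actually been recorded, so the section is omitted while nothing has failed.

```python
from collections import Counter


class ErrorMetrics:
    """Hypothetical accumulator: one overall counter plus a per-name breakdown."""

    def __init__(self):
        self.total = 0            # the existing-style overall counter
        self.by_type = Counter()  # populated lazily, one key per error name seen

    def record(self, error_name):
        """Count one occurrence of the error with the given readable name."""
        self.total += 1
        self.by_type[error_name] += 1

    def to_document(self):
        """Build a status-style document; key names here are assumptions."""
        doc = {"errors": self.total}
        if self.by_type:
            # Only names that have occurred appear, so the document stays small.
            doc["errorTypes"] = dict(self.by_type)
        return doc


metrics = ErrorMetrics()
empty_doc = metrics.to_document()

metrics.record("NamespaceNotFound")
metrics.record("NamespaceNotFound")
metrics.record("DuplicateKey")
full_doc = metrics.to_document()
print(full_doc)
```

Keeping the overall counter means existing consumers of the total are unaffected, while the breakdown can be read when it is present.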
| Comment by Charlie Swanson [ 01/Feb/23 ] |
|
I don't know whose backlog to put this on; I guessed Service Architecture to start. |