[DOCS-10425] Ops manager: Backup job is too busy alert, clarifications needed Created: 23/Jun/17  Updated: 23/Dec/18  Resolved: 23/Dec/18

Status: Closed
Project: Documentation
Component/s: Ops Manager
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Emilio Scalise Assignee: Anthony Sansone (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Participants:
Epic Link: DOCSP-3127

 Description   

The current docs about the alert "Backup job is too busy" may not be clear enough:

https://docs.opsmanager.mongodb.com/current/reference/alerts/#Backup-job-is-busy-for...

Backup job is busy for...
Available only as a global alert.

Sends an alert when a backup job has taken longer than the time specified. This could occur if you have an overloaded Backup Daemon or blockstore. Check the corresponding job log for error messages. Contact MongoDB Support if you need help interpreting the error message.

steve.briskin clarified it with this comment:

The alert is meant to notify admins when a job is more active than expected. They are then responsible for investigating whether this is normal (e.g. a temporary spike in activity or known hardware degradation) or not (e.g. an unexpected increase in activity or underprovisioned hardware).
The alert computes the amount of time the daemon spent working on applyOps and snapshot jobs over the last 24 hours and compares that total to the alert threshold. This alert is always based on a 24-hour period and is independent of the snapshot schedule.
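
To illustrate the computation described in the comment above, here is a minimal Python sketch. It is not Ops Manager's implementation: the function names (`busy_time`, `is_too_busy`), the representation of daemon work as (start, end) intervals, and the sample threshold are all assumptions made for illustration. It also assumes the intervals do not overlap.

```python
from datetime import datetime, timedelta

# The alert always looks back over a fixed 24-hour window,
# regardless of the snapshot schedule.
WINDOW = timedelta(hours=24)


def busy_time(intervals, now):
    """Sum the time spent on applyOps and snapshot work within the last 24 hours.

    `intervals` is a hypothetical list of (start, end) datetimes, one per
    period during which the daemon was working on the job. Intervals are
    assumed not to overlap. Work outside the window is ignored, and work
    that straddles the window boundary is counted only partially.
    """
    window_start = now - WINDOW
    total = timedelta(0)
    for start, end in intervals:
        overlap_start = max(start, window_start)
        overlap_end = min(end, now)
        if overlap_end > overlap_start:
            total += overlap_end - overlap_start
    return total


def is_too_busy(intervals, now, threshold):
    """Return True when the busy time in the window exceeds the alert threshold."""
    return busy_time(intervals, now) > threshold


if __name__ == "__main__":
    now = datetime(2017, 6, 23, 12, 0)
    work = [
        # Started 30 hours ago, so only the last 2 hours fall inside the window.
        (now - timedelta(hours=30), now - timedelta(hours=22)),
        # Entirely inside the window: 6 hours.
        (now - timedelta(hours=10), now - timedelta(hours=4)),
    ]
    print(busy_time(work, now))
    print(is_too_busy(work, now, threshold=timedelta(hours=6)))
```

In this sketch the job was busy for 8 of the last 24 hours, so a hypothetical 6-hour threshold would trigger the alert even if no single snapshot took unusually long.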

I'd suggest starting a discussion on how to improve the alert description in the docs.

As a side note, the alert text in current Ops Manager versions is "Backup job is too busy", not "Backup job is busy for...". The wording in the docs may be a leftover from previous Ops Manager versions.

Thanks,
Emilio



 Comments   
Comment by Anthony Sansone (Inactive) [ 23/Dec/18 ]

This was fixed in DOCS-10198.

Comment by Tomer Yakir [ 13/Nov/18 ]

Done!

Comment by Anthony Sansone (Inactive) [ 13/Nov/18 ]

tomer.yakir Reopen and assign to me if it is not already. Thanks.

Comment by Tomer Yakir [ 13/Nov/18 ]

Hey tony.sansone,

I see this ticket has been closed as dup of DOCS-10198, but I think we should explicitly mention that the alert computes the amount of time the daemon spent working on applyOps and snapshot jobs over the last 24 hours.
