[DOCS-9502] Docs for SERVER-21818: Capture system metrics in FTDC Created: 05/Dec/16  Updated: 30/Oct/23

Status: Closed
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: Server_Docs_20231030

Type: Task Priority: Major - P3
Reporter: Emily Hall Assignee: Kay Kim (Inactive)
Resolution: Won't Do Votes: 0
Labels: monitoring
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-21818 Capture system metrics in FTDC Closed
Related
Participants:
Days since reply: 1 year, 14 weeks, 2 days ago
Epic Link: DOCSP-1769

 Description   

Engineering Ticket Description:

Currently, full-time data capture (FTDC) only includes internal metrics (with a small number of exceptions). It would be useful to also capture system metrics related to CPU, memory, and storage. For illustrative purposes, attached is a POC data capture tool, sysmon.py, that captures such information on Linux from /proc/stat, /proc/meminfo, and /sys/block/*/stat, and that has proven useful for problem diagnosis. Captured information includes the following:

/proc/stat
cpu_user
cpu_nice
cpu_system
cpu_idle
cpu_iowait
cpu_irq
cpu_softirq
cpu_steal
cpu_guest
cpu_guest_nice
ctxt
btime
processes
procs_running
procs_blocked
cpus

/proc/meminfo
memtotal
memfree
buffers
cached
swapcached
active
inactive
active anon
inactive anon
active file
inactive file
dirty

/sys/block/*/stat
sd*.reads
sd*.reads_merged
sd*.read_sectors
sd*.read_time_ms
sd*.writes
sd*.writes_merged
sd*.write_sectors
sd*.write_time_ms
sd*.io_in_progress
sd*.io_time_ms
sd*.io_queued_ms
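
The attached sysmon.py is not reproduced in this ticket, so the following is only a minimal sketch of how the three sources listed above could be parsed. All function names (parse_proc_stat, parse_meminfo, parse_block_stat, collect) are hypothetical, and the normalization of /proc/meminfo names such as "Active(anon)" to "active anon" is an assumption made to match the names in the list. The field order follows the Linux proc(5) and block-layer stat documentation.

```python
import glob
import os
from typing import Dict

# Field order of the aggregate "cpu" line in /proc/stat.
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest", "guest_nice"]

# Scalar lines in /proc/stat that the ticket lists.
STAT_SCALARS = ("ctxt", "btime", "processes", "procs_running", "procs_blocked")

# First 11 fields of /sys/block/<dev>/stat (newer kernels append more).
DISK_FIELDS = ["reads", "reads_merged", "read_sectors", "read_time_ms",
               "writes", "writes_merged", "write_sectors", "write_time_ms",
               "io_in_progress", "io_time_ms", "io_queued_ms"]


def parse_proc_stat(text: str) -> Dict[str, int]:
    """Parse the contents of /proc/stat into flat integer metrics."""
    out: Dict[str, int] = {}
    cpus = 0
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        key = parts[0]
        if key == "cpu":
            # Aggregate line: cumulative time per state, in clock ticks.
            for name, value in zip(CPU_FIELDS, parts[1:]):
                out["cpu_" + name] = int(value)
        elif key.startswith("cpu") and key[3:].isdigit():
            cpus += 1  # per-CPU lines (cpu0, cpu1, ...) are only counted
        elif key in STAT_SCALARS:
            out[key] = int(parts[1])
    out["cpus"] = cpus
    return out


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo; keeps every field, not only those listed above."""
    out: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            # Assumed normalization: "Active(anon)" -> "active anon".
            key = name.strip().lower().replace("(", " ").replace(")", "")
            out[key] = int(fields[0])
    return out


def parse_block_stat(device: str, text: str) -> Dict[str, int]:
    """Parse one /sys/block/<device>/stat line into '<device>.<field>' keys."""
    return {f"{device}.{name}": int(value)
            for name, value in zip(DISK_FIELDS, text.split())}


def collect() -> Dict[str, int]:
    """Take one sample from all three sources (Linux only)."""
    sample: Dict[str, int] = {}
    with open("/proc/stat") as f:
        sample.update(parse_proc_stat(f.read()))
    with open("/proc/meminfo") as f:
        sample.update(parse_meminfo(f.read()))
    for path in glob.glob("/sys/block/*/stat"):
        device = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            sample.update(parse_block_stat(device, f.read()))
    return sample
```

The parsers take text rather than file paths so that they can be exercised with captured samples on any platform; only collect() touches the Linux pseudo-files.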

Similar metrics are available through Windows APIs. Where applicable, cumulative counters are preferred over instantaneous values because cumulative counters can be sampled at arbitrary time intervals. In general, raw system-specific metrics with a minimum of processing are preferred, leaving it to tooling to subsample as needed and compute useful values for display. (One possible exception: sectors could be converted to bytes, because a sector may be a system- or device-specific unit.)
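
To illustrate why cumulative counters tolerate arbitrary sampling intervals, here is a small sketch of how display tooling might turn two raw samples into rates. The function names counter_rates and sectors_to_bytes are hypothetical and are not part of FTDC or sysmon.py. The 512-byte constant reflects how Linux reports sectors in /sys/block/*/stat; other platforms may differ, which is the reason the ticket suggests converting to bytes.

```python
from typing import Dict, Iterable

# Linux reports /sys/block/*/stat sectors in 512-byte units,
# independent of the device's physical sector size.
LINUX_STAT_SECTOR_BYTES = 512


def counter_rates(prev: Dict[str, int], curr: Dict[str, int],
                  elapsed_s: float, counters: Iterable[str]) -> Dict[str, float]:
    """Per-second rates for the named cumulative counters.

    Only the two endpoint samples are needed, so the result is the average
    rate over the interval no matter how long the interval is.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    rates: Dict[str, float] = {}
    for name in counters:
        if name not in prev or name not in curr:
            continue
        delta = curr[name] - prev[name]
        if delta < 0:
            continue  # counter went backwards (for example after a reboot)
        rates[name] = delta / elapsed_s
    return rates


def sectors_to_bytes(sectors: float,
                     sector_bytes: int = LINUX_STAT_SECTOR_BYTES) -> float:
    """Convert a sector count (or sectors/s rate) to bytes (or bytes/s)."""
    return sectors * sector_bytes


# Example: two samples taken 10 seconds apart.
prev = {"sda.read_sectors": 1000, "sda.io_in_progress": 3}
curr = {"sda.read_sectors": 3000, "sda.io_in_progress": 1}
rates = counter_rates(prev, curr, 10.0, ["sda.read_sectors"])
read_bytes_per_s = sectors_to_bytes(rates["sda.read_sectors"])
```

Instantaneous values such as io_in_progress or procs_running are deliberately left out of the counters argument: differencing them is meaningless, and a sample only reflects the moment it was taken.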



 Comments   
Comment by Education Bot [ 31/Oct/22 ]

Hello! This ticket has been closed due to inactivity. If you believe this ticket is still important, please reopen it and leave a comment to explain why. Thank you!
