-
Type:
Improvement
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Storage Engines
-
StorEng - Defined Pipeline
As part of CLOUDP-300154, to improve backup restore observability, the Atlas Backups team will need performance statistics on WT startup/shutdown.
The current plan to retrieve these statistics is to parse the mongodb logs for log ID:
22430
The changes we'd like are:
- Add a structured "Statistics" log with key stats
- The keys should be in camel case and include the unit of measurement
- The values should be numerical
- Have a separate logId for WT shutdown or have a separate field in the structured log indicating shutdown
Change Examples
startup:
Before:
{ "t": { "$date": "2025-02-11T18:14:17.325+00:00" }, "s": "I", "c": "WTRECOV", "id": 22430, "ctx": "initandlisten", "msg": "WiredTiger message", "attr": { "message": { "ts_sec": 1739297657, "ts_usec": 325447, "thread": "114997:0xffffadaf6010", "session_name": "txn-recover", "category": "WT_VERB_RECOVERY_PROGRESS", "category_id": 30, "verbose_level": "DEBUG_1", "verbose_level_id": 1, "msg": "recovery was completed successfully and took 359ms, including 350ms for the log replay, 0ms for the rollback to stable, and 8ms for the checkpoint." } } }
After:
{ "t": { "$date": "2025-02-11T18:14:17.325+00:00" }, "s": "I", "c": "WTRECOV", "id": 22430, "ctx": "initandlisten", "msg": "WiredTiger message", "attr": { "message": { "ts_sec": 1739297657, "ts_usec": 325447, "thread": "114997:0xffffadaf6010", "session_name": "txn-recover", "category": "WT_VERB_RECOVERY_PROGRESS", "category_id": 30, "verbose_level": "DEBUG_1", "verbose_level_id": 1, "successful" : true, "msg": "recovery was completed successfully and took 359ms, including 350ms for the log replay, 0ms for the rollback to stable, and 8ms for the checkpoint." "Statistics" : { "totalWtRecoveryDurationMilliseconds" : 359, "oplogReplayDurationMilliseconds" : 350, "rollbackToStableDurationMilliseconds" : 0, "checkpointDurationMilliseconds": 8 } } } }
Shutdown
before:
{ "t": { "$date": "2025-02-27T20:20:30.248+00:00" }, "s": "I", "c": "WTRECOV", "id": 22430, "ctx": "thread46", "msg": "WiredTiger message", "attr": { "message": { "ts_sec": 1740687630, "ts_usec": 248227, "thread": "2344:0xffff74332780", "session_name": "WT_CONNECTION.close", "category": "WT_VERB_RECOVERY_PROGRESS", "category_id": 34, "verbose_level": "DEBUG_1", "verbose_level_id": 1, "msg": "shutdown was completed successfully and took 124ms, including 20ms for the rollback to stable, and 96ms for the checkpoint." } } }
after:
{ "t": { "$date": "2025-02-27T20:20:30.248+00:00" }, "s": "I", "c": "WTRECOV", "id": 22431, "ctx": "thread46", "msg": "WiredTiger message", "attr": { "message": { "ts_sec": 1740687630, "ts_usec": 248227, "thread": "2344:0xffff74332780", "session_name": "WT_CONNECTION.close", "category": "WT_VERB_RECOVERY_PROGRESS", "category_id": 34, "verbose_level": "DEBUG_1", "verbose_level_id": 1, "successful" : true, "msg": "shutdown was completed successfully and took 124ms, including 20ms for the rollback to stable, and 96ms for the checkpoint." "Statistics" : { "totalWtShutdownDurationMilliseconds" : 124, "rollbackToStableDurationMilliseconds" : 20, "checkpointDurationMilliseconds": 96 } } } }
- is depended on by
-
SERVER-101958 Add fields for improved startup/shutdown performance logs
-
- Backlog
-