[SERVER-25042] Start diagnostic data collection as early as possible Created: 13/Jul/16 Updated: 07/Feb/23 Resolved: 11/Oct/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Diagnostics |
| Affects Version/s: | None |
| Fix Version/s: | 5.1.0-rc1 |
| Type: | Improvement | Priority: | Critical - P2 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Sara Golemon |
| Resolution: | Done | Votes: | 6 |
| Labels: | SWDI, move-sec, platforms-re-triaged |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Minor Change |
| Backport Requested: | v5.0 |
| Sprint: | Security 2021-09-06, Security 2021-09-20, Security 2021-10-04, Security 2021-10-18 |
| Case: | (copied to CRM) |
| Linked BF Score: | 133 |
| Description |
|
Currently, diagnostic data collection does not begin until fairly late in the startup sequence, after some potentially significant activity, such as oplog stones and interrupted index builds, has already taken place, so it can't be used to diagnose problems early in the startup sequence. Ideally it should be started as early as possible. One dependency is storage engine initialization: diagnostic data collection gathers WiredTiger (WT) internal statistics, which are not available until the storage engine is initialized, so it could perhaps start right after storage engine initialization. |
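The ordering concern in the description can be illustrated with a small simulation. This is only a sketch under stated assumptions: the step names, their order, and the `StartupTrace` and `simulate` helpers are hypothetical and do not correspond to actual server code. It shows which startup steps would be visible to diagnostics depending on where collection is started.

```python
from dataclasses import dataclass, field


@dataclass
class StartupTrace:
    """Records which startup steps ran while diagnostics were active."""
    collecting: bool = False
    observed: list = field(default_factory=list)
    missed: list = field(default_factory=list)

    def run_step(self, name: str) -> None:
        # A step is visible to diagnostics only if collection already started.
        (self.observed if self.collecting else self.missed).append(name)


# Hypothetical step names and order, assumed purely for illustration.
STEPS = [
    "init_storage_engine",
    "oplog_stones",
    "rebuild_interrupted_indexes",
    "start_listener",
]


def simulate(start_after: str) -> StartupTrace:
    """Run the steps in order, enabling collection once `start_after` completes."""
    trace = StartupTrace()
    for step in STEPS:
        trace.run_step(step)
        if step == start_after:
            trace.collecting = True
    return trace


# Collection started late: the early activity is invisible to diagnostics.
late = simulate("rebuild_interrupted_indexes")
# Collection started right after storage engine initialization, as proposed.
early = simulate("init_storage_engine")

print("late start observes: ", late.observed)
print("early start observes:", early.observed)
```

In this toy model, storage engine initialization itself is never observed, which mirrors the stated dependency: the statistics source does not exist until that step has finished.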
| Comments |
| Comment by Yujin Kang Park [ 12/Jul/22 ] |
|
Requesting a backport to v5.0, given that we have TSAN failures (BF-25790) due to a race condition when accessing flow control stats. If backporting all changes is risky, only the exclusion from TSAN should be backported. |
| Comment by Eric Milkie [ 08/Nov/21 ] |
I would first determine whether we have sufficient FTDC stats already, while RTS is running. Sara, is there example FTDC output when server startup takes a long time due to a lengthy RTS phase? |
| Comment by Githook User [ 12/Oct/21 ] |
|
Author: Sara Golemon <sara.golemon@mongodb.com> (sgolemon) Message: Revert " Partial revert of mongos portions of |
| Comment by Githook User [ 11/Oct/21 ] |
|
Author: Sara Golemon <sara.golemon@mongodb.com> (sgolemon) Message: |
| Comment by Githook User [ 13/Sep/21 ] |
|
Author: Sara Golemon <sara.golemon@mongodb.com> (sgolemon) Message: Revert " This reverts commit b7d29c204f0a4b62fc1d9bccc3ec341bbeed330c. |
| Comment by Sara Golemon [ 13/Sep/21 ] |
|
Reopening due to revert. |
| Comment by Githook User [ 10/Sep/21 ] |
|
Author: Sara Golemon <sara.golemon@mongodb.com> (sgolemon) Message: |
| Comment by Ian Whalen (Inactive) [ 31/Aug/20 ] |
|
this also is something we need to come back to after we resolve |