
Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 5.0.6, 5.1.0-rc0
Affects Version/s: None
Component/s: Testing Infrastructure
Labels:

Backwards Compatibility: Fully Compatible
Backport Requested: v5.0
Sprint: STM 2021-06-14
Linked BF Score: 164
Story Points: 2
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Attaching gdb and collecting diagnostics for all processes in a sharded cluster continues to time out after 15 minutes. BF-20581 is a recent example in which gdb attached to only 6 of the 9 mongod processes. As a result, server engineers may have to rely on luck, or on having multiple occurrences of the same failure, to determine the cause of a hang.

We should consider either (1) reordering the steps in the hang analyzer so that a core dump is captured for every mongod process, even if diagnostics against the live process cannot be collected, or (2) sending SIGABRT to any process that gcore has not been run on before the 15 minutes expire.
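
The two options can be combined. The following is a minimal sketch of that ordering, not the hang analyzer's actual code: the function name `analyze_hung_processes`, its parameters, and the 30-second reserve are all hypothetical, and the `dump_core` and `collect_live_diagnostics` callables stand in for whatever wraps gcore and gdb. It assumes each core dump runs to completion once started, and that the host is configured to write a core file when a process receives SIGABRT.

```python
import os
import signal
import time
from typing import Callable, Iterable, List, Set


def analyze_hung_processes(
    pids: Iterable[int],
    dump_core: Callable[[int], None],                 # hypothetical wrapper around gcore
    collect_live_diagnostics: Callable[[int], None],  # hypothetical wrapper around gdb
    budget_secs: float = 15 * 60,                     # the 15-minute limit from this ticket
    abort_reserve_secs: float = 30,                   # assumed time kept back for SIGABRT
    clock: Callable[[], float] = time.monotonic,
    send_signal: Callable[[int, int], None] = os.kill,
) -> Set[int]:
    """Return the pids that were sent SIGABRT because no core dump was taken."""
    pids = list(pids)
    deadline = clock() + budget_secs
    dumped: List[int] = []

    # Option 1: take every core dump first, before the slower live diagnostics.
    for pid in pids:
        if clock() >= deadline - abort_reserve_secs:
            break
        dump_core(pid)
        dumped.append(pid)

    # Option 2: any process that gcore never reached gets SIGABRT while there
    # is still time left in the budget.
    aborted = {pid for pid in pids if pid not in dumped}
    for pid in aborted:
        send_signal(pid, signal.SIGABRT)

    # Live diagnostics are best effort and use only the remaining time.
    for pid in dumped:
        if clock() >= deadline:
            break
        collect_live_diagnostics(pid)

    return aborted
```

With this ordering, running out of time costs the live diagnostics for some processes rather than their core dumps. The clock and signal sender are injectable only so the ordering can be exercised without real processes.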

Related to: SERVER-72613 Speed up taking core dumps with the hang analyzer (Closed)

Assignee: Mikhail Shchatko
Reporter: Robert Guo (Inactive)
Participants: Githook User, Mikhail Shchatko, Robert Guo, Vivian Ge
Votes: 2
Watchers: 6

Created: Apr 19 2021 03:25:57 PM UTC
Updated: Oct 29 2023 09:54:48 PM UTC
Resolved: Jun 07 2021 02:15:35 PM UTC
Confidence Status Last Update: 02/Jun/21 8:03 AM
