Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: 4.1 Desired
Affects Version/s: None
Component/s: Replication, Testing Infrastructure
Labels:

Assigned Teams: Server Tooling & Methods
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Some replication hangs happen because nodes have stopped replicating, not because of an actual deadlock. It would be helpful if the hang analyzer called replSetGetStatus on every node in the cluster while the processes are still alive.

If the replSetGetStatus call itself hangs because of a deadlock on the ReplicationCoordinator mutex, then the replication progress is probably not important anyway, so it is not a problem to just kill that command.
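
A minimal sketch of what this could look like, not the hang analyzer's actual code. It assumes PyMongo is available and that the caller already knows each node's host:port. The names fetch_repl_status and collect_repl_status are hypothetical. Each node is queried on its own thread with a shared deadline, so one node stuck on the ReplicationCoordinator mutex cannot block collection from the others. Note that the client-side timeout only abandons the call; it does not kill the operation on the server.

```python
import concurrent.futures
import time
from typing import Any, Callable, Dict, Iterable

# Hypothetical type alias: (host, timeout_secs) -> replSetGetStatus reply.
StatusFetcher = Callable[[str, float], Dict[str, Any]]


def fetch_repl_status(host: str, timeout_secs: float) -> Dict[str, Any]:
    """Run replSetGetStatus against a single node (assumes PyMongo is installed)."""
    # Imported lazily so the rest of the sketch loads without PyMongo.
    from pymongo import MongoClient

    timeout_ms = int(timeout_secs * 1000)
    client = MongoClient(
        host,
        directConnection=True,  # talk to this node only, no replica set discovery
        serverSelectionTimeoutMS=timeout_ms,
        socketTimeoutMS=timeout_ms,  # give up client-side if the command hangs
    )
    try:
        return client.admin.command("replSetGetStatus")
    finally:
        client.close()


def collect_repl_status(
    hosts: Iterable[str],
    timeout_secs: float = 10.0,
    fetch: StatusFetcher = fetch_repl_status,
) -> Dict[str, Dict[str, Any]]:
    """Query every node in parallel; record an error entry instead of blocking."""
    host_list = list(hosts)
    results: Dict[str, Dict[str, Any]] = {}
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(host_list)))
    try:
        futures = {host: pool.submit(fetch, host, timeout_secs) for host in host_list}
        # One shared deadline so the total wait is bounded by timeout_secs.
        deadline = time.monotonic() + timeout_secs
        for host, future in futures.items():
            remaining = max(0.0, deadline - time.monotonic())
            try:
                results[host] = future.result(timeout=remaining)
            except concurrent.futures.TimeoutError:
                results[host] = {"error": "timed out"}
            except Exception as exc:  # connection refused, not in a replica set, etc.
                results[host] = {"error": repr(exc)}
    finally:
        # Do not wait for stuck workers; the socket timeout bounds their lifetime.
        pool.shutdown(wait=False)
    return results
```

For example, collect_repl_status(["localhost:20000", "localhost:20001"], timeout_secs=5) would return one entry per node (the ports here are placeholders). The fetch parameter is injectable so the collection logic can be exercised without a running cluster.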

related to

SERVER-40857 Remove write concern wtimeouts that are expected to succeed throughout tests

Backlog

Assignee: Backlog - Server Tooling and Methods (STM) (Inactive)
Reporter: Judah Schvimer
Participants: Backlog - Server Tooling and Methods (STM), Judah Schvimer, Lingzhi Deng
Votes: 0
Watchers: 3

Created: Apr 26 2019 06:15:09 PM UTC
Updated: Dec 06 2022 03:00:42 AM UTC
