- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication, Testing Infrastructure
- Server Tooling & Methods
Some replication hangs are caused by nodes not replicating rather than by an actual deadlock. It would be helpful if the hang analyzer called replSetGetStatus on every node in the cluster while the processes are still alive.
If the replSetGetStatus call itself hangs because of a deadlock on the ReplicationCoordinator mutex, then the replication progress is probably not important anyway, so it is not a problem to just kill that command.
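A rough sketch of what this could look like, not the hang analyzer's actual code. The function names (`collect_repl_status`, `pymongo_get_status`), the host list, and the timeout values are all hypothetical. It assumes PyMongo is available for the real call. Note that this sketch abandons a hung call after a deadline rather than killing the command on the server.

```python
import threading
import time


def pymongo_get_status(host, timeout_ms=5000):
    """Run replSetGetStatus against one node (assumes PyMongo is installed)."""
    from pymongo import MongoClient  # third-party; imported lazily on purpose

    # directConnection talks to this node only, without replica set discovery.
    client = MongoClient(
        host,
        directConnection=True,
        serverSelectionTimeoutMS=timeout_ms,
        socketTimeoutMS=timeout_ms,
    )
    try:
        return client.admin.command("replSetGetStatus")
    finally:
        client.close()


def collect_repl_status(hosts, get_status, timeout_secs=5.0):
    """Call get_status(host) for every host, waiting at most timeout_secs overall.

    A call that has not returned by the deadline is reported as timed out and
    left behind in a daemon thread, so a deadlocked node cannot stall the caller.
    """
    results = {}

    def worker(host):
        try:
            results[host] = {"ok": True, "status": get_status(host)}
        except Exception as exc:  # record the failure instead of aborting the sweep
            results[host] = {"ok": False, "error": repr(exc)}

    threads = []
    for host in hosts:
        thread = threading.Thread(target=worker, args=(host,), daemon=True)
        thread.start()
        threads.append((host, thread))

    deadline = time.monotonic() + timeout_secs
    for _, thread in threads:
        thread.join(max(0.0, deadline - time.monotonic()))

    # Snapshot the results; anything still missing did not answer in time.
    return {
        host: results.get(host, {"ok": False, "error": "timed out"})
        for host, _ in threads
    }
```

Usage would be something like `collect_repl_status(["localhost:20000", "localhost:20001"], pymongo_get_status)`, run before the hang analyzer kills the processes. The ports here are placeholders.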
- related to: SERVER-40857 Remove write concern wtimeouts that are expected to succeed throughout tests
- Backlog