[SERVER-43945] Expose out of order latch acquisitions in serverStatus Created: 10/Oct/19  Updated: 08/Jan/24  Resolved: 16/Dec/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.3.3

Type: Task
Priority: Major - P3
Reporter: Ratika Gandhi
Assignee: Benjamin Caimano (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Documented
is documented by DOCS-13424 Investigate changes in SERVER-43945: ... Closed
Problem/Incident
Related
related to SERVER-45691 Change Mutex::LockListeners to use a ... Closed
related to SERVER-45168 Create add-only synchronized list type Closed
related to SERVER-45027 Use more descriptive names for MONGO_... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.2
Sprint: Service Arch 2019-10-21, Service Arch 2019-11-04, Service Arch 2019-11-18, Service Arch 2019-12-02, Service Arch 2019-12-16, Service Arch 2019-12-30
Participants:
Linked BF Score: 0

 Description   

Ideally, we would like to serialize Hierarchical Acquisition violations in a way that allows us to investigate and reproduce the violation. To that end, we want to know the Latch name and the ordering of violations. I propose the following opt-in section in serverStatus:

{
    ...,
    "latchAnalysis": {
        "hierarchicalAcquitionViolations": {
            "<latchNameHere>": {
                "onAcquire": 2,
                "onRelease": 1
            }
        }
    }
}

The top-level "latchAnalysis" field lets us extend this section with future statistics. Keeping "onAcquire" and "onRelease" as separate, monotonically increasing counters lets us read them as a pair to determine the period in which a potential deadlock happened, as well as the total number of violations. This scheme deliberately leaves out the client name.
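Because both counters only ever grow, two serverStatus samples can be compared to see which latches picked up new violations in between. The following is a minimal Python sketch of that comparison. It assumes the section has exactly the shape proposed above, with the field name spelled as in the proposal; the helper names and the latch names in the sample data are hypothetical and not part of any MongoDB API.

```python
from typing import Dict

Counters = Dict[str, Dict[str, int]]

SECTION = "latchAnalysis"
# Field name copied verbatim from the proposal above (assumption: the
# shipped name may differ).
FIELD = "hierarchicalAcquitionViolations"
KINDS = ("onAcquire", "onRelease")


def violations(status: dict) -> Counters:
    """Return the per-latch counters, or an empty dict if the opt-in
    section is absent from this serverStatus document."""
    return status.get(SECTION, {}).get(FIELD, {})


def violation_deltas(before: dict, after: dict) -> Counters:
    """Per-latch growth of each counter between two samples.

    Latches whose counters did not change are omitted, so a non-empty
    result means a violation happened within the sampling window.
    """
    old, new = violations(before), violations(after)
    deltas: Counters = {}
    for latch, counts in new.items():
        prev = old.get(latch, {})
        delta = {k: counts.get(k, 0) - prev.get(k, 0) for k in KINDS}
        if any(delta.values()):
            deltas[latch] = delta
    return deltas


# Hypothetical samples taken at two points in time.
before = {SECTION: {FIELD: {"exampleLatch": {"onAcquire": 2, "onRelease": 1}}}}
after = {SECTION: {FIELD: {
    "exampleLatch": {"onAcquire": 3, "onRelease": 1},
    "quietLatch": {"onAcquire": 0, "onRelease": 0},
}}}

print(violation_deltas(before, after))
# {'exampleLatch': {'onAcquire': 1, 'onRelease': 0}}
```

Summing "onAcquire" and "onRelease" for a latch in a single sample gives its total violations, while the deltas narrow down when they occurred.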



 Comments   
Comment by Githook User [ 03/Mar/20 ]

Author:

{'name': 'Ben Caimano', 'email': 'ben.caimano@10gen.com'}

Message: SERVER-43945 Expose out of order latch acquisitions in serverStatus

This commit also backports:

SERVER-42897 Validate base-level latches
SERVER-44746 Fix LatchAnalyzerTest
SERVER-44155 Validate a subset of latches of all levels
SERVER-45691 Change Mutex::LockListeners to use a std::vector again
SERVER-45793 Improve mongo::Mutex contract
SERVER-45424 Track local latch::Identities when getTestCommandsEnabled()
SERVER-46041 Add DiagnosticListener/WaitListener LSAN suppressions
SERVER-46461 Make static in getDiagnosticListenerState() immortal to fix destruction order issues during shutdown
SERVER-46197 Make build flag to disable diagnostic latches
SERVER-45276 Release lock before destroying DBClientBases
Branch: v4.2
https://github.com/mongodb/mongo/commit/24b6ee2dd48e3f1cfde1c4d7e2b01bd73921fbad

Comment by Githook User [ 16/Dec/19 ]

Author:

{'name': 'Ben Caimano', 'email': 'ben.caimano@mongodb.com', 'username': 'bcaimano'}

Message: SERVER-43945 Expose out of order latch acquisitions in serverStatus

This review does several related things:
