[SERVER-51068] HierarchicalAcquisitionLevelViolation messages are cryptic Created: 19/Sep/20  Updated: 29/Oct/23  Resolved: 06/Nov/20

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Bug Priority: Major - P3
Reporter: Billy Donahue Assignee: Benjamin Caimano (Inactive)
Resolution: Fixed Votes: 0
Labels: servicearch-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-13974 Investigate changes in SERVER-51068: ... Closed
Related
is related to SERVER-52660 Memoize demanagled type names Closed
is related to SERVER-52662 Introduce type-associated mutexes Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Service arch 2020-11-02, Service arch 2020-11-16
Participants:

 Description   

What actually happened here? I can't figure it out. I don't think the code being reported on is incorrect, but maybe it is. I can't tell from this message. The diagnostic feature should be able to explain what its expectations were and how they were unmet.

Clearly some deadlock concern was triggered. To be useful, I think this message should try to answer some basic questions. Which locks were involved? In what order were they acquired? Why is that a concern? Right now I have no idea what's wrong.

[js_test:agg_expr_fuzzer-1278c-1600474428419-0] 2020-09-19T00:13:59.732+0000 d20020| 2020-09-19T00:13:59.731+00:00 F - 23093 [initandlisten] "Fatal assertion","attr": { msgid: 31360, error: "HierarchicalAcquisitionLevelViolation: Theoretical deadlock alert - InvalidWasPresent latch acquisition at src/mongo/util/synchronized_value.h:149 on latch synchronized_value::_mutex", file: "src/mongo/util/latch_analyzer.cpp", line: 231 }

https://logkeeper.mongodb.org/lobster/build/1fe9647b6f97d7b1cbc3f12477d979cd/test/5f654d43c2ab684a40168b65#bookmarks=0%2C39%2C71

This is stated in terms of implementation details of the latch analyzer. As someone who's just writing lock and unlock statements, I really shouldn't have to drill into the enums to figure out how it happened. Much richer information is available at the site that threw the assertion, but it wasn't surfaced in the log statement.
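To illustrate the kind of message I have in mind, here is a minimal sketch. It is not the latch analyzer's actual code: `HeldLatch`, `describeViolation`, and the rule that levels must strictly decrease are all assumptions made for the example. The point is only that the check site already knows both latches, their levels, and their acquisition sites, and could print them.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of one held latch; not the latch analyzer's real types.
struct HeldLatch {
    std::string name;  // e.g. "synchronized_value::_mutex"
    int level;         // hierarchy level (ordering rule below is an assumption)
    std::string site;  // "file:line" of the acquisition
};

// Returns an empty string if acquiring `next` respects the assumed hierarchy.
// Otherwise returns a message naming both latches, their levels, and where
// each one was acquired.
std::string describeViolation(const std::vector<HeldLatch>& held, const HeldLatch& next) {
    for (const auto& h : held) {
        // Assumed rule for this sketch: each new latch must have a strictly
        // lower level than every latch already held.
        if (next.level >= h.level) {
            std::ostringstream os;
            os << "Latch '" << next.name << "' (level " << next.level << ") acquired at "
               << next.site << " while holding latch '" << h.name << "' (level " << h.level
               << ") acquired at " << h.site
               << "; a thread taking them in the opposite order could deadlock";
            return os.str();
        }
    }
    return {};
}
```

A message in this shape answers which locks were involved, in what order they were taken, and why the analyzer objects, without requiring the reader to know the analyzer's internal enums.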

It looks like all the synchronized_value objects in the codebase will have the same identity string. In this case it would have been helpful to know what T was.
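One possible way to get T into the identity string is sketched below. This is only an illustration under stated assumptions: `typeNameOf` and `latchNameFor` are hypothetical helpers, not existing server functions, and the demangling path relies on `abi::__cxa_demangle`, which is available with GCC and Clang. On other compilers the sketch falls back to whatever `typeid(T).name()` returns.

```cpp
#include <cstdlib>
#include <memory>
#include <string>
#include <typeinfo>
#if defined(__GNUC__)
#include <cxxabi.h>
#endif

// Hypothetical helper: best-effort readable name for T.
// Falls back to the raw (possibly mangled) typeid name.
template <typename T>
std::string typeNameOf() {
    const char* raw = typeid(T).name();
#if defined(__GNUC__)
    int status = 0;
    // __cxa_demangle returns a malloc'd buffer, so release it with std::free.
    std::unique_ptr<char, void (*)(void*)> demangled(
        abi::__cxa_demangle(raw, nullptr, nullptr, &status), std::free);
    if (status == 0 && demangled) {
        return demangled.get();
    }
#endif
    return raw;
}

// Hypothetical helper: a per-type latch name, so that two synchronized_value
// instantiations no longer share one identity string.
template <typename T>
std::string latchNameFor() {
    return "synchronized_value<" + typeNameOf<T>() + ">::_mutex";
}
```

Demangling on every call could be costly, so a real version would probably want to compute each name once per type.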



 Comments   
Comment by Benjamin Caimano (Inactive) [ 06/Nov/20 ]

I'm closing this since the current form of latch violation logging addresses the majority of the concerns. SERVER-52662 and SERVER-52660 have been filed to attempt to capture type information in latch names.

Comment by Githook User [ 05/Nov/20 ]

Author: Ben Caimano <ben.caimano@10gen.com>

Message: SERVER-51068 Provide unique errors for each variety of Latch violation
Branch: master
https://github.com/mongodb/mongo/commit/0e400a736bacfa291dbda2971f7381ad392000c9
