[SERVER-39860] Separate reporting of RSTL and PBWM locks metrics in serverStatus and currentOp Created: 27/Feb/19  Updated: 29/Oct/23  Resolved: 15/May/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.1.12

Type: Task Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Dianna Hohensee (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-12716 Docs for SERVER-39860: Separate repor... Closed
Duplicate
is duplicated by SERVER-40780 LockManager dump should show the name... Closed
Related
related to SERVER-43910 include Client/OpCtx information in L... Closed
is related to SERVER-41105 Rename parallel batch writer 'mutex' ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Storage NYC 2019-04-22, Storage NYC 2019-05-06, Storage NYC 2019-05-20
Participants:

 Description   

The introduction of the RSTL lock has considerably changed the behavior of the locks.Global section of serverStatus, because the RSTL metrics are currently lumped in with the global lock metrics. Diagnosability will be improved if these lock metrics are reported separately.

In addition, the inclusion of the PBWM metrics in the locks.Global section has hampered diagnosability in the past, so we should also separate those metrics out.

Specific strawman proposal: introduce new locks.RSTL and locks.PBWM alongside locks.Global, which will now include only global lock metrics and not RSTL or PBWM metrics:

        "locks" : {
                "Global" : {
                        "acquireCount" : ...
                        "acquireWaitCount" : ...
                        "timeAcquiringMicros" : ...
                },
                "PBWM" : ...
                "RSTL" : ...
                "Database" : ...
                "Collection" : ...
                "oplog" : ...
        },
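
A minimal sketch of how a diagnostic script might consume the proposed layout, assuming each section carries per-mode counters the way locks.Global does today. The helper name `wait_ratio_by_section` and all sample values are hypothetical, and the section names PBWM and RSTL follow the strawman above; the names in the shipped output may differ.

```python
def wait_ratio_by_section(locks):
    """Return {section: waits / acquisitions}, summed over all lock modes."""
    ratios = {}
    for section, metrics in locks.items():
        # Each counter is assumed to be a mapping of lock mode -> count.
        acquired = sum(metrics.get("acquireCount", {}).values())
        waited = sum(metrics.get("acquireWaitCount", {}).values())
        ratios[section] = waited / acquired if acquired else 0.0
    return ratios


# Hypothetical "locks" document in the proposed shape (values are made up).
sample_locks = {
    "Global": {"acquireCount": {"r": 1000, "w": 200}, "acquireWaitCount": {"w": 2}},
    "PBWM": {"acquireCount": {"r": 400}, "acquireWaitCount": {"r": 100}},
    "RSTL": {"acquireCount": {"w": 50}},
}

ratios = wait_ratio_by_section(sample_locks)
for section, ratio in ratios.items():
    print(f"{section}: {ratio:.4f}")
```

With the sections reported separately, contention on PBWM or RSTL shows up under its own key instead of inflating the Global figures.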

As part of this work we should also verify that clients waiting on the RSTL and PBWM locks will be reported as queued in the globalLock.currentQueue section, as the reported queues are an important diagnostic metric.
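
One way to spot-check this is to read the queue counters from a serverStatus result while an operation is known to be blocked on the RSTL or PBWM lock. The sketch below assumes the result has already been fetched as a plain dictionary; the helper name `queued_clients` and the sample document are hypothetical.

```python
def queued_clients(server_status):
    """Extract the globalLock.currentQueue counters, defaulting to zero."""
    queue = server_status.get("globalLock", {}).get("currentQueue", {})
    return {
        "total": queue.get("total", 0),
        "readers": queue.get("readers", 0),
        "writers": queue.get("writers", 0),
    }


# Hypothetical serverStatus fragment with one writer queued (values are made up).
sample_status = {"globalLock": {"currentQueue": {"total": 1, "readers": 0, "writers": 1}}}

queue = queued_clients(sample_status)
print(queue)
```

If a client is blocked on RSTL or PBWM and these counters stay at zero, the waiter is not being reported as queued.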



 Comments   
Comment by Dianna Hohensee (Inactive) [ 18/Jun/19 ]

FYI, I changed the "ParallelBatchWriter" lock name output in the lock sections of serverStatus and currentOp to "ParallelBatchWriteMode" in SERVER-41105 in v4.3.1, which will be backported to 4.2.0 shortly.

Comment by Githook User [ 14/May/19 ]

Author: Dianna (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: SERVER-39860 Separate reporting of RSTL and PBWM locks metrics in serverStatus and currentOp
Branch: master
https://github.com/mongodb/mongo/commit/09db7023065f42ccc39dd3309536726814379c86

Comment by Brian Lane [ 02/Apr/19 ]

I set it as 4.1 required, but don't think it needs to be part of the epic.

Someone outside of the flow control project work could pick this up in a future sprint.

-Brian

Comment by Geert Bosch [ 01/Apr/19 ]

I don't see how this ticket is required for the repl set flow control project at all. It's also not the case that that project changes use of RSTL or PBWM. So, no, I don't think it makes sense to add it to the project.

Comment by Bruce Lucas (Inactive) [ 27/Feb/19 ]

Lock status in currentOp should also report global, RSTL, and PBWM separately for locks acquired and locks waiting. Not sure whether this falls out of preceding changes automatically or requires additional tweaks.

Comment by Judah Schvimer [ 27/Feb/19 ]

I will reassign to storage since they generally own the lock manager. If they would like us to do it and have suggestions on the best design, we could schedule it. No one on the repl team has knowledge about lock reporting though, that I know of. I would like to make the solution general purpose enough so we can add future "global locks" without running into this issue again.
