[SERVER-45012] Add transaction information to lockInfo Created: 07/Dec/19  Updated: 06/Dec/22  Resolved: 17/Jan/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: David Bartley
Assignee: Backlog - Storage Execution Team
Resolution: Won't Do
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-43910 include Client/OpCtx information in L... Closed
Assigned Teams:
Storage Execution
Participants:

 Description   

We're debugging a recent deadlock around replSetReconfig. Our hypothesis is that we have active transactions, which are holding a global IX lock, and that's blocking replSetReconfig's X lock (which seems consistent with SERVER-43242 and SERVER-32685). We have a dump of both lockInfo and currentOp while replSetReconfig was blocked, so we can confirm that replSetReconfig is waiting for a global X lock (on 4.0 – SERVER-37945 would seemingly fix this on 4.2, but that's a separate issue). However, we only see the following (repeated quite a few times) on the "granted" list:

      {
        "compatibleFirst": false,
        "convertMode": "NONE",
        "enqueueAtFront": false,
        "mode": "IX"
      },

Our best guess is that these are active transactions, but it's difficult to prove that. On other locks we see "opid", "connectionId", and "desc", so we assume these aren't associated with an active op. It'd be nice if "lsid" and "txnNumber" were included as well.
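
For anyone doing the same triage, here is a minimal Python sketch of how the unattributed entries on the "granted" list could be counted from a saved lockInfo dump. It assumes the dump has been parsed into a dict with a top-level "lockInfo" array whose elements carry "resourceId", "granted", and "pending"; that layout, the `unattributed_granted` helper, and the sample values (resource name, opids, connection ids) are illustrative assumptions, not taken from our actual dump.

```python
from collections import Counter

# Fields we see on lock entries that belong to an active op.
ATTRIBUTION_FIELDS = ("opid", "connectionId", "desc")


def unattributed_granted(lock_info):
    """Return (resourceId, mode) pairs for granted entries with no op identifiers."""
    found = []
    for resource in lock_info.get("lockInfo", []):
        for entry in resource.get("granted", []):
            if not any(field in entry for field in ATTRIBUTION_FIELDS):
                found.append((resource.get("resourceId"), entry.get("mode")))
    return found


# Hypothetical sample shaped like the output described above (values are made up).
sample = {
    "lockInfo": [
        {
            "resourceId": "{Global}",  # placeholder resource name
            "granted": [
                {"compatibleFirst": False, "convertMode": "NONE",
                 "enqueueAtFront": False, "mode": "IX"},
                {"compatibleFirst": False, "convertMode": "NONE",
                 "enqueueAtFront": False, "mode": "IX"},
                {"compatibleFirst": False, "convertMode": "NONE",
                 "enqueueAtFront": False, "mode": "IX",
                 "opid": 101, "connectionId": 7, "desc": "conn7"},
            ],
            "pending": [
                {"compatibleFirst": False, "convertMode": "NONE",
                 "enqueueAtFront": False, "mode": "X",
                 "opid": 202, "connectionId": 9, "desc": "conn9"},
            ],
        }
    ]
}

counts = Counter(unattributed_granted(sample))
for (resource_id, mode), n in counts.items():
    print(f"{resource_id}: {n} granted {mode} lock(s) with no opid/connectionId/desc")
```

This only counts the entries that cannot be tied to an op. It does not prove they are transactions, which is why "lsid" and "txnNumber" on each entry would help.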



 Comments   
Comment by David Bartley [ 17/Jan/20 ]

Is there any documentation on that effort?

Comment by Connie Chen [ 17/Jan/20 ]

The effort to do this is outweighed by our efforts to remove locking, which will make this obsolete in 4.4.

Comment by Brian Lane [ 12/Dec/19 ]

Great! Thanks for the quick response.

Comment by David Bartley [ 12/Dec/19 ]

No, that's fine. We backported some of the commits in SERVER-43910 to 4.0 and get at least "lsid" now, which is sufficient.

Comment by Brian Lane [ 12/Dec/19 ]

Hi bartle,

We have slotted this issue into our plan for next quarter. Does that timing work for you, or do you need something sooner, since you mentioned you are currently debugging a deadlock?

-Brian

Comment by Danny Hatcher (Inactive) [ 10/Dec/19 ]

As mentioned in SERVER-43910, moving this to the Execution team to verify whether or not the transaction information is contained in the updated lock output.

Comment by David Bartley [ 07/Dec/19 ]

I just found SERVER-43910, which possibly would have the same effect?

Generated at Thu Feb 08 05:07:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.