[SERVER-53232] dropDatabase took so long due to can't acquire global lock
Created: 04/Dec/20  Updated: 31/Dec/20  Resolved: 07/Dec/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question
Priority: Minor - P4
Reporter: phoenix Liu
Assignee: Kelsey Schubert
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: diagnostic.data.tar.gz, mongod.log
Participants:

 Description   

Hi, I recently encountered a problem with dropDatabase on my MongoDB 4.0.3 replica set.

Here is the relevant mongod log. I can't find the cause in it:

Fri Dec  4 17:43:50.219 I NETWORK  [conn6054599] received client metadata from 100.67.165.240:47353 conn6054599: { driver: { name: "mongo-go-driver", version: "v1.0.0" }, os: { type: "linux", architecture: "amd64" }, platform: "go1.10.2" }
...
...
Fri Dec  4 17:44:59.669 I COMMAND  [conn5561779] command admin.$cmd appName: "mongoimport" command: isMaster { isMaster: 1, $clusterTime: { clusterTime: Timestamp(1607075023, 1), signature: { hash: BinData(0, 1D71E9754104E92D9DE02EDF5BA5AAAC5D64B195), keyId: 6848505140803010590 } }, $db: "admin", $readPreference: { mode: "primaryPreferred" } } numYields:0 reslen:866 locks:{ Global: { acquireCount: { r: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 65714573 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_msg 65716ms
Fri Dec  4 17:44:59.667 I COMMAND  [conn6054599] command tnt-dnio46v0e command: dropDatabase { dropDatabase: 1, lsid: { id: UUID("949b9c3c-c138-46cc-92d4-e57d7d1e87ba") }, $clusterTime: { clusterTime: Timestamp(1607075029, 1), signature: { hash: BinData(0, 9EC708C8A07F2B59BD858A86F1E4653F29CFF207), keyId: 6848505140803010590 } }, $db: "tnt-dnio46v0e" } numYields:0 reslen:333 locks:{ Global: { acquireCount: { r: 2, w: 1, W: 1 }, acquireWaitCount: { W: 1 }, timeAcquiringMicros: { W: 69436501 } }, Database: { acquireCount: { W: 1 } } } protocol:op_msg 69436ms
Fri Dec  4 17:44:59.669 I NETWORK  [conn6054599] end connection 100.67.165.240:47353 (140 connections now open)
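
To compare the global lock wait against the total command time across many such lines, a small parser can help. The following is a minimal sketch, assuming the 4.0-style plain-text log format shown above, with the `locks:{ Global: { ... timeAcquiringMicros: { ... } } }` section and a trailing `<n>ms` duration. The function name `global_lock_wait` is hypothetical and not part of any MongoDB tool.

```python
import re

# Assumption: the Global section lists acquireCount, acquireWaitCount and
# timeAcquiringMicros in this order, as in the log lines quoted in this ticket.
GLOBAL_WAIT_RE = re.compile(
    r"Global: \{ acquireCount: \{[^}]*\}, acquireWaitCount: \{[^}]*\}, "
    r"timeAcquiringMicros: \{([^}]*)\}"
)
MODE_RE = re.compile(r"(\w+): (\d+)")      # e.g. "W: 69436501"
DURATION_RE = re.compile(r"(\d+)ms\s*$")   # trailing total duration, e.g. "69436ms"


def global_lock_wait(line):
    """Return (wait_micros_by_lock_mode, total_ms) for a COMMAND log line.

    Returns None when the line has no global lock wait or no duration.
    """
    wait = GLOBAL_WAIT_RE.search(line)
    duration = DURATION_RE.search(line)
    if wait is None or duration is None:
        return None
    micros = {mode: int(value) for mode, value in MODE_RE.findall(wait.group(1))}
    return micros, int(duration.group(1))
```

Applied to the dropDatabase line above, this returns the `W` wait in microseconds together with the total duration in milliseconds. The two are nearly equal, which shows that almost the entire command time was spent waiting for the global lock rather than doing work.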

  1. This appears to be closely tied to the 'dropDatabase' command.
  2. The stall affects all commands for its duration, and the application gets timeout errors (expected, since this is a global lock issue).
  3. It is not related to a resource (CPU, memory) bottleneck.
  4. Not every call to 'dropDatabase' triggers a stall.
  5. I can't run `db.currentOp()` at the moment the stall occurs to get more information, because I don't know when it will happen.
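
Because the stall cannot be predicted, one option is to sample `currentOp` on a schedule and keep only the operations that are waiting for a lock or have been running for a long time. The following is a rough sketch of the filtering step only. It assumes the caller has already fetched the `inprog` list (for example with PyMongo's `client.admin.command("currentOp")["inprog"]`). The function name `blocked_or_slow_ops` and the `min_secs` threshold are hypothetical.

```python
# Fields of a currentOp entry that are worth recording for later analysis.
KEPT_FIELDS = ("opid", "op", "ns", "secs_running", "waitingForLock")


def blocked_or_slow_ops(inprog, min_secs=5):
    """Select currentOp entries that wait for a lock or run for at least min_secs."""
    picked = []
    for op in inprog:
        waiting = bool(op.get("waitingForLock"))
        slow = op.get("secs_running", 0) >= min_secs
        if waiting or slow:
            # Keep a compact summary instead of the full document.
            picked.append({field: op.get(field) for field in KEPT_FIELDS})
    return picked
```

Logging the result of each sample with a timestamp would leave a record of which operation held the lock when the next stall occurs, although a sampler that itself needs a lock may also be blocked during the stall.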

My questions:

  1. Which operation holds the global lock for such a long time?
  2. Why does this happen?

The mongod.log file and diagnostic.data are attached. By the way, the time zone is GMT+8.

Thank you!



 Comments   
Comment by phoenix Liu [ 08/Dec/20 ]

Hi Kelsey,

Thanks for the reply.
Could you please send me a link to the related Jira issue or git commit?
I searched for it earlier but couldn't find it.

Comment by Kelsey Schubert [ 07/Dec/20 ]

Hi phoenixxliu@tencent.com, this issue has been resolved in MongoDB 4.4, which is licensed under SSPL.
