[SERVER-24129] Mongod hang on auth request because of deadlock Created: 11/May/16  Updated: 15/Nov/21  Resolved: 19/Nov/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Zhang Youdong Assignee: Kelsey Schubert
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File stack    
Operating System: ALL
Steps To Reproduce:

It is not easy to reproduce, but in my production environment it has appeared twice in a month.

Participants:

 Description   

I cannot authenticate to mongod. After many retries, all connections are consumed and I can no longer connect. Because I needed to restore service first, I kept the pstack output (see attachment) and restarted mongod.



 Comments   
Comment by Ramon Fernandez Marina [ 03/Oct/16 ]

zyd_com, we haven't heard back from you for a while, so we're going to close this ticket. If this is still an issue for you, please provide the information requested by Thomas above and we'll reopen to investigate.

Thanks,
Ramón.

Comment by Kelsey Schubert [ 21/Jun/16 ]

Hi zyd_com,

We still need information about your MongoDB binary to diagnose the problem. If this is still an issue for you, can you please answer the questions in my previous comment?

Thank you,
Thomas

Comment by Kelsey Schubert [ 03/Jun/16 ]

Hi zyd_com,

We are continuing to investigate the root cause of this behavior. To help us with our investigation, would you please provide additional information about the binary of MongoDB 3.2.3 that generated the attached pstack?

  • Where did you grab the binary or how was it compiled?
  • What system are you running it on?

Thank you,
Thomas

Comment by Zhang Youdong [ 14/May/16 ]

Hi Thomas, I have been busy these days and only just saw your reply. Thanks a lot.

I noticed nothing unusual in the log file when this happened (logLevel = 0), so I didn't keep the log file.

Based on the pstack output, this looks similar to WT-2283 (retry in txn_update_oldest results in a hang), because many threads were retrying in txn_update_oldest at that time.

Comment by Kelsey Schubert [ 11/May/16 ]

Hi zyd_com,

Thank you for reporting this behavior. We are still examining the pstack attached to the ticket. To help us in our investigation, would you please upload the logs of the affected node when this issue occurred?

Kind regards,
Thomas

Comment by Zhang Youdong [ 11/May/16 ]

Is this issue related to WT-2283 or SERVER-23778?

Generated at Thu Feb 08 04:05:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.