[SERVER-44058] Spontaneous deadlock on mongos Created: 17/Oct/19 Updated: 24/Nov/19 Resolved: 24/Nov/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Sergey Zagursky | Assignee: | Dmitry Agranat |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Participants: |
| Description |
|
We've hit an issue with spontaneous mongos deadlocks. Unfortunately, I can't provide any hints for reproducing it, but I've managed to take a core dump from the affected mongos process. See the attached file with stack traces from the core dump. |
| Comments |
| Comment by Dmitry Agranat [ 24/Nov/19 ] | ||||||||||||||||||||
|
Hi sz, Thank you for the update. If this is still an issue for you, please provide additional information and we will reopen the ticket. Regards, | ||||||||||||||||||||
| Comment by Sergey Zagursky [ 20/Nov/19 ] | ||||||||||||||||||||
|
Hi @Dmitry Agranat! We haven't encountered this issue since then. I can't attach the logs from diagnostic.data now because they were rolled over. | ||||||||||||||||||||
| Comment by Dmitry Agranat [ 11/Nov/19 ] | ||||||||||||||||||||
|
Hi sz, If this is still an issue for you, please upload the requested information. Thanks, | ||||||||||||||||||||
| Comment by Dmitry Agranat [ 20/Oct/19 ] | ||||||||||||||||||||
|
Thanks Sergey, Looking at the provided mongoS log, we can see a cluster of such errors shortly after the time you've mentioned:
This suggests that elections took place on shard product6 during this time, which would also explain the mentioned "waiting for some event". To validate this theory, please upload the mongoD logs and the $dbpath/diagnostic.data directory (the contents are described here), covering the time of this event, from all members of this shard. Thanks, | ||||||||||||||||||||
| Comment by Sergey Zagursky [ 17/Oct/19 ] | ||||||||||||||||||||
|
Uploaded mongos.log.20191017. The incident started at approximately 09:25 UTC. Timestamps in the log file are also in UTC. | ||||||||||||||||||||
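For readers triaging a similar report, a minimal sketch of narrowing a mongos log to the minutes around a reported incident time could look like the following. It assumes each log line begins with an ISO-8601 timestamp with a numeric UTC offset (the default text log format in MongoDB 4.0); the function names `parse_ts`, `lines_in_window`, and `extract`, and the 15-minute window, are illustrative and not part of any MongoDB tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: lines start with a timestamp such as 2019-10-17T09:25:03.123+0000.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_ts(line):
    """Return the line's leading timestamp, or None if the line has none."""
    token = line.split(" ", 1)[0]
    try:
        return datetime.strptime(token, TS_FORMAT)
    except ValueError:
        return None


def lines_in_window(lines, start, end):
    """Yield lines stamped within [start, end].

    Lines without a timestamp (for example, wrapped continuation lines)
    follow the decision made for the last timestamped line before them.
    """
    keep = False
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            keep = start <= ts <= end
        if keep:
            yield line


def extract(path, center, minutes=15):
    """Read a log file and return the lines within +/- `minutes` of `center`."""
    delta = timedelta(minutes=minutes)
    with open(path, encoding="utf-8", errors="replace") as f:
        return list(lines_in_window(f, center - delta, center + delta))
```

Under these assumptions, calling `extract("mongos.log.20191017", datetime(2019, 10, 17, 9, 25, tzinfo=timezone.utc))` would return the log lines from 09:10 to 09:40 UTC on the day of the incident.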
| Comment by Dmitry Agranat [ 17/Oct/19 ] | ||||||||||||||||||||
|
Hi Sergey, here is the uploader link. Please also mention the time and the timezone of the reported event. | ||||||||||||||||||||
| Comment by Sergey Zagursky [ 17/Oct/19 ] | ||||||||||||||||||||
|
As for the logs, I need to upload them using the support uploader. Can you provide a link to it? | ||||||||||||||||||||
| Comment by Sergey Zagursky [ 17/Oct/19 ] | ||||||||||||||||||||
|
Sure. Sorry for not reporting full info. We use 4.0.10.
| ||||||||||||||||||||
| Comment by Dmitry Agranat [ 17/Oct/19 ] | ||||||||||||||||||||
|
Hi sz, Could you provide the full mongoS log covering the time of the event, as well as the MongoDB version? Thanks, |