[SERVER-44490] The mongo command connecting mongodb instance hangs forever. Created: 08/Nov/19 Updated: 29/Mar/20 Resolved: 29/Mar/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Chengcheng Ma | Assignee: | Dmitry Agranat |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: | [^diagnostic.data.tar.gz] |
| Participants: |
| Description |
|
We are facing an issue in which connecting to mongod through the mongo client hangs forever. There is no response until we press Ctrl+C to stop the attempt. The output after stopping is: MongoDB shell version v3.6.8
Another strange thing is that the dbpath of this node is much larger than that of the other replica set members. The Primary node's data dir is about 20 GB, while this node's is about 181 GB. When I list all the files in the dbpath, the WiredTigerLAS.wt file is very large, at 171 GB. This is weird: we cannot connect to the node any more, so why does the WT cache keep growing?
We also tried to dump the stack info while trying to connect, and here is the log:
We searched but did not find any clue. MongoDB is running in a Docker container, and the MongoDB version is 3.6.8. We have also attached the diagnostic.data.
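The dbpath size check described above can be repeated on each replica set member with a short script. This is a minimal sketch, not part of the original report: the helper names `largest_files` and `human` and the example path `/data/db` are hypothetical, and it assumes the data files sit directly under the dbpath (it does not descend into subdirectories, so it would miss files if a per-database directory layout is in use).

```python
import os


def largest_files(dbpath, top=5):
    """Return up to `top` (size_in_bytes, name) pairs for the regular
    files directly under dbpath, largest first."""
    entries = []
    with os.scandir(dbpath) as it:
        for entry in it:
            # Skip directories and symlinks; only count regular files.
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                entries.append((size, entry.name))
    entries.sort(reverse=True)
    return entries[:top]


def human(n):
    """Format a byte count using binary units (KiB, MiB, GiB, TiB)."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.1f} {unit}"
        n /= 1024


def report(dbpath):
    """Print the largest files in dbpath, one per line."""
    for size, name in largest_files(dbpath):
        print(f"{human(size):>12}  {name}")


# Example call (hypothetical path, adjust to the actual dbpath):
# report("/data/db")
```

Running `report` against the dbpath of each member makes it easy to see whether a single file, such as WiredTigerLAS.wt in this report, accounts for the difference in directory size.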
|
| Comments |
| Comment by Dmitry Agranat [ 29/Mar/20 ] |
|
Hi cora_ma, We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket. Regards, |
| Comment by Dmitry Agranat [ 19/Nov/19 ] |
|
Hi cora_ma, thank you for providing the requested information. The cluster in question is underprovisioned and most of the time cannot sustain the applied load. This is the main reason some operations/connections time out. In addition to upgrading your hardware, based on your workload, I also recommend upgrading MongoDB to 4.x to benefit from [Non-Blocking Secondary Reads|https://www.mongodb.com/blog/post/mongodb-40-nonblocking-secondary-reads]. |
| Comment by Chengcheng Ma [ 13/Nov/19 ] |
|
@Dmitry Agranat, thank you so much for your reply.
All the information you requested is attached. |
| Comment by Dmitry Agranat [ 12/Nov/19 ] |
|
Hi cora_ma, Thank you for the additional information and the call stack output. It shows that the shell is just waiting for a SASL auth command to return from the server. The server might simply be having trouble processing the command. The fact that the WiredTigerLAS.wt file was 171 GB during this issue could also indicate that the server had performance issues at that time. Could you try connecting to mongod again and, if it hangs again, provide the following data and upload it to our secure portal?
Thanks, |
| Comment by Chengcheng Ma [ 11/Nov/19 ] |
|
Hi @Dmitry Agranat, I'm not sure whether the problem we are facing is a duplicate of I also tried to use gdb to attach to the hung connecting process, and here is the output:
# gdb mongo 12710
So if you are sure that the issue we are encountering is exactly the case you posted, I will follow that one for updates.
Thanks a lot. |
| Comment by Dmitry Agranat [ 10/Nov/19 ] |
|
Hi cora_ma, Thank you for the report. This issue seems to be a duplicate of Please follow Thanks, |