[SERVER-26702] Config Server connection refused Created: 19/Oct/16 Updated: 03/Aug/17 Resolved: 11/Jul/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.2.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Darshan Shah | Assignee: | Kelsey Schubert |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
Of the 3-node CSRS, one node is consistently having connection refused problems, as seen in its own log as well as in the logs of the other config servers and all mongos instances:
This is the 3rd node of the CSRS and has its dbpath pointing to a NetApp volume. Note that the limits are set quite high and there is minimal load on the cluster, so the problem should not be caused by a lack of available sockets or file descriptors. The sharded cluster is running MongoDB 3.2.5 with WiredTiger. |
| Comments |
| Comment by Kelsey Schubert [ 11/Jul/17 ] |
|
Hi darshan.shah@interactivedata.com, Sorry for the delay getting back to you. Unfortunately, we cannot reproduce this issue, and it's likely that the root cause of this behavior is outside of MongoDB. Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-user group. Kind regards, |
| Comment by Darshan Shah [ 20/Oct/16 ] |
|
Log file for the node reboot just after the problem occurred. |
| Comment by Darshan Shah [ 20/Oct/16 ] |
|
This is the output from rs.status() as of now:
This issue is intermittent; the config server works just fine before and after the block of time when the issue occurs. I will attach a log file I have from a restart of the node just after this problem occurred yesterday. |
| Comment by Ramon Fernandez Marina [ 20/Oct/16 ] |
|
My apologies, I should have also asked for the output of rs.status(), which should tell us more about the unreachability of one of its members – can you please send that as well? One thing you can try is to reboot the node that's having problems and capture the log. I'm looking for useful startup warnings/errors that may tell us more. You can also try to resync the failing node. |
| Comment by Darshan Shah [ 20/Oct/16 ] |
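As an illustration of what to look for in rs.status() output, here is a minimal Python sketch that picks out members the reporting node considers unhealthy. It assumes the output has already been loaded into a dict; the helper name `unhealthy_members`, the set name, and the hostnames are hypothetical and the sample data is not taken from this ticket. Only the `members`, `name`, `health`, `stateStr`, and `lastHeartbeatMessage` fields of the rs.status() document are relied on.

```python
from typing import Any, Dict, List


def unhealthy_members(status: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return members reported as down or in a state other than PRIMARY/SECONDARY."""
    problems = []
    for member in status.get("members", []):
        healthy = member.get("health") == 1  # health is 1 when the member is reachable
        state = member.get("stateStr", "")
        if not healthy or state not in ("PRIMARY", "SECONDARY"):
            problems.append({
                "name": member.get("name"),
                "stateStr": state,
                "lastHeartbeatMessage": member.get("lastHeartbeatMessage", ""),
            })
    return problems


# Hypothetical sample shaped like rs.status() output; not data from this ticket.
sample_status = {
    "set": "csrs-example",
    "members": [
        {"name": "cfg1.example.net:27019", "health": 1, "stateStr": "PRIMARY"},
        {"name": "cfg2.example.net:27019", "health": 1, "stateStr": "SECONDARY"},
        {"name": "cfg3.example.net:27019", "health": 0,
         "stateStr": "(not reachable/healthy)",
         "lastHeartbeatMessage": "Connection refused"},
    ],
}

for m in unhealthy_members(sample_status):
    print(m["name"], m["stateStr"], m["lastHeartbeatMessage"])
```

Running the same check against the output from each config server in turn would show whether every node sees the same member as unreachable, or only some of them do.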
|
Here is the output from rs.conf():
Unfortunately, the log rolled over so it's not available for the particular time frame - will keep monitoring and save it next time. Thanks, |
| Comment by Ramon Fernandez Marina [ 19/Oct/16 ] |
|
Unfortunately there's not enough information in the log snippet you sent to determine if you've found a bug or if this is a configuration issue. Can you please upload the following?
Thanks, |