[SERVER-43211] mongos claims it is accepting connections but does not Created: 06/Sep/19 Updated: 27/Oct/23 Resolved: 09/Sep/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Networking |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Oleg Pudeyev (Inactive) | Assignee: | Mira Carey |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Service Arch 2019-09-09, Service Arch 2019-09-23 |
| Participants: |
| Description |
|
I started a 4.3 sharded deployment with the args specified here (https://github.com/p-mongo/dev/blob/master/script/launch-4.4-sharded-multishard):
The server is:
mlaunch produced this output:
The log from mongos is here: https://gist.github.com/p-mongo/a206f10247c39eaa92c63e1c0c977f72
Note that it contains the following line:
However, connection to 14440 fails:
Also note that the server log indicates no errors. There is one warning, which is this:
I expect that if mongos claims to accept connections, it actually accepts them, and that it writes error-level messages to the log if a connection attempt fails. If mongos is not accepting connections, I expect it to indicate what it is doing so that I can track its progress toward a usable state. When my deployment gets into the state described in this ticket, it appears to be stuck there: killing all processes and restarting them does not seem to unstick it. I need to nuke the data directories for all mongos and mongod nodes and rebuild the entire deployment from scratch. |
| Comments |
| Comment by Oleg Pudeyev (Inactive) [ 09/Sep/19 ] |
|
My bad, it was a firewall rule I forgot about. Sorry for wasting so much time. |
| Comment by Oleg Pudeyev (Inactive) [ 06/Sep/19 ] |
|
mongos appears to be listening:
4.4 shell behaves the same:
Connecting to socket works:
--bind_ip_all makes no difference:
Nothing else is listening:
I am working on the logs. |
| Comment by Mira Carey [ 06/Sep/19 ] |
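The checks listed in this comment can be approximated with a small TCP probe. This is a minimal sketch, not the commands that were actually run: the helper name `probe` and the 3-second timeout are assumptions made for illustration. It only classifies how a connection attempt ends, which can help tell a missing listener apart from filtered traffic.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Classify the outcome of a plain TCP connection attempt.

    Hypothetical helper, not part of any MongoDB tooling.
    """
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Usually: nothing is listening, or a firewall rejects the packet.
        return "refused"
    except socket.timeout:
        # Usually: packets are silently dropped somewhere on the path.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. host or network unreachable.
        return "error: " + str(errno.errorcode.get(exc.errno, exc.errno))
```

For the deployment in this ticket, a call such as `probe("localhost", 14440)` would be the natural check (the host name is an assumption; the port comes from the description). An "open" result only shows that the TCP handshake completed from the machine running the probe; a client on another host or address can still be blocked by a firewall rule.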
|
I have one observation, followed by a couple of questions.
First, the observation:
The log line "Listening on ..." follows directly after a call to ::listen() on each bound interface socket. The line "waiting for connections on port ..." then comes after spawning the background thread that starts actively calling ::accept(). See transport_layer_asio.cpp.
Those two lines are about as close to ready as we can get. The only bit that's missing would be another line logged from the accepting socket background threads. Obviously we still have something wrong there, but it's unlikely to be the mongos process failing to have called ::bind() / ::listen() / ::accept().
Then the questions:
|
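The ordering described in the observation above can be illustrated with a short sketch. This is not the code in the MongoDB transport layer; the function name `start_listener` and the `log` parameter are assumptions, and the two log strings only mimic the ones quoted in the comment. It shows where each message would be emitted relative to the bind, listen, and accept calls.

```python
import socket
import threading


def start_listener(host="127.0.0.1", port=0, log=print):
    """Bind and listen, then hand the socket to a background accept thread.

    Hypothetical illustration of the log ordering, not MongoDB's code.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen()
    bound_host, bound_port = server.getsockname()
    # First message: emitted right after listen() returns.
    log(f"Listening on {bound_host}")

    def accept_loop():
        while True:
            try:
                conn, _peer = server.accept()
            except OSError:
                return  # the listening socket was closed
            conn.close()

    thread = threading.Thread(target=accept_loop, daemon=True)
    thread.start()
    # Second message: emitted after the accept thread has been spawned.
    log(f"waiting for connections on port {bound_port}")
    return server, bound_port
```

Once listen() has returned, the operating system can complete TCP handshakes on the socket even before the first accept() call, so a bare connection test may succeed as soon as the first message appears. Neither message says anything about whether packets from a given client can reach the port.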