[SERVER-40314] Cannot connect to replica set Created: 22/Mar/19  Updated: 01/Apr/19  Resolved: 01/Apr/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 4.0.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Andrey Kostin Assignee: Eric Sedor
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-1889 Support different networks / nics for... Closed
Operating System: ALL
Steps To Reproduce:

Not sure

Participants:

 Description   

I'm trying to add a new member to a replica set. The new member receives connections from all other nodes, but it can't connect to them.

This is what I see in the logs on all other nodes (xxx.xxx.xxx.xxx is the IP of the new member):

2019-03-22T22:24:10.804+0200 I NETWORK  [listener] connection accepted from xxx.xxx.xxx.xxx:56590 #339 (22 connections now open)
2019-03-22T22:24:10.804+0200 D EXECUTOR [listener] Starting new executor thread in passthrough mode
2019-03-22T22:24:10.805+0200 D COMMAND  [conn339] run command local.$cmd { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx", $db: "local" }
2019-03-22T22:24:10.805+0200 I COMMAND  [conn339] command local.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx", $db: "local" } numYields:0 reslen:320 locks:{} protocol:op_query 0ms
2019-03-22T22:24:10.808+0200 D COMMAND  [conn339] run command local.$cmd { saslContinue: 1, payload: BinData(0, 633D626977732C723D61542F4A666D356F614F50586A636C65466C6A6B2F363739787A3864352B686E4633504D676B3331334C765579526370316235625A6853714541417942...), conversationId: 1, $db: "local" }
2019-03-22T22:24:10.808+0200 I COMMAND  [conn339] command local.$cmd command: saslContinue { saslContinue: 1, payload: BinData(0, 633D626977732C723D61542F4A666D356F614F50586A636C65466C6A6B2F363739787A3864352B686E4633504D676B3331334C765579526370316235625A6853714541417942...), conversationId: 1, $db: "local" } numYields:0 reslen:249 locks:{} protocol:op_query 0ms
2019-03-22T22:24:10.811+0200 D COMMAND  [conn339] run command local.$cmd { saslContinue: 1, payload: BinData(0, ), conversationId: 1, $db: "local" }
2019-03-22T22:24:10.811+0200 I ACCESS   [conn339] Successfully authenticated as principal __system on local
2019-03-22T22:24:10.811+0200 I COMMAND  [conn339] command local.$cmd command: saslContinue { saslContinue: 1, payload: BinData(0, ), conversationId: 1, $db: "local" } numYields:0 reslen:219 locks:{} protocol:op_query 0ms
2019-03-22T22:24:10.814+0200 D COMMAND  [conn339] run command admin.$cmd { _isSelf: 1, $db: "admin" }
2019-03-22T22:24:10.814+0200 I COMMAND  [conn339] command admin.$cmd command: _isSelf { _isSelf: 1, $db: "admin" } numYields:0 reslen:194 locks:{} protocol:op_query 0ms
2019-03-22T22:24:10.816+0200 D NETWORK  [conn339] Session from xxx.xxx.xxx.xxx:56590 encountered a network error during SourceMessage
2019-03-22T22:24:10.816+0200 I NETWORK  [conn339] end connection xxx.xxx.xxx.xxx:56590 (21 connections now open)
2019-03-22T22:24:10.816+0200 D NETWORK  [conn339] Cancelling outstanding I/O operations on connection to xxx.xxx.xxx.xxx:56590



 Comments   
Comment by Andrey Kostin [ 01/Apr/19 ]

Ok, thank you, I'll ask my grandchildren to check that 8-year-old feature request.

Comment by Eric Sedor [ 01/Apr/19 ]

Thanks for your patience. It looks like this fits under SERVER-1889. Can you please watch that ticket for updates as we continue working internally to address such use-cases?

Comment by Andrey Kostin [ 26/Mar/19 ]

And this is an excerpt from the iptables config file I use for port forwarding. Maybe you'll find it useful.

 

*filter
:INPUT DROP [0:0]
:OUTPUT DROP [0:0]
:FORWARD DROP [0:0]
-F
-X
-A INPUT -i lo -j ACCEPT
-A INPUT -i lxdbr0 -j ACCEPT
-A INPUT -i enp3s0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p tcp -m state --state NEW --dport 27018 -s {NODE_1_IP} -j ACCEPT
-A INPUT -p tcp -m state --state NEW --dport 27018 -s {NODE_2_IP} -j ACCEPT
-A INPUT -p tcp -m state --state NEW --dport 27018 -s {ARB_IP} -j ACCEPT
 
-A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT
 
-A OUTPUT -j ACCEPT
 
-A FORWARD -p tcp -d 10.100.0.101 --dport 27017 -s {NODE_1_IP} -j ACCEPT
-A FORWARD -p tcp -d 10.100.0.101 --dport 27017 -s {NODE_2_IP} -j ACCEPT
-A FORWARD -p tcp -d 10.100.0.101 --dport 27017 -s {ARB_IP} -j ACCEPT
-A FORWARD -i enp3s0 -o lxdbr0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
-A FORWARD -i lxdbr0 -j ACCEPT
COMMIT
*raw
:PREROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
COMMIT
*nat
:PREROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
-A PREROUTING -i enp3s0 -p tcp --dport 27018 -j DNAT --to-destination 10.100.0.101:27017
-A POSTROUTING -o enp3s0 -j MASQUERADE
COMMIT
*mangle
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
COMMIT

 

 

Comment by Andrey Kostin [ 26/Mar/19 ]

{
    "_id" : "r24",
    "version" : 1312579,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : true,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongo-1.example.com:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 2,
            "tags" : {
                
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "mongo-2.example.com:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {
                
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 3,
            "host" : "mongo-arb.example.com:27018",
            "arbiterOnly" : true,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 0,
            "tags" : {
                
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 4,
            "host" : "mongo-q.example.com:27018",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 0,
            "tags" : {
                
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 0
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : 2000,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {
            
        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("58b5fbc074f1ef9bce40170f")
    }
}

The first 3 nodes are not behind NAT and they're really configured to use their ports, but the new 4th node is behind NAT and is listening on port 27017 inside a container, with port forwarding configured on its host.
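
A quick way to narrow this down is to check, from inside the container, which advertised addresses actually accept TCP connections, including the new member's own forwarded address. The following is a minimal diagnostic sketch, assuming Python 3 is available there; `can_connect` and `check_members` are hypothetical helper names, and the host list is copied from the config above.

```python
import socket
from typing import Dict, Iterable, Tuple


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_members(members: Iterable[Tuple[str, int]],
                  timeout: float = 3.0) -> Dict[str, bool]:
    """Map each "host:port" string to whether it accepted a TCP connection."""
    return {f"{host}:{port}": can_connect(host, port, timeout)
            for host, port in members}


if __name__ == "__main__":
    # Advertised addresses taken from the replica set config in this ticket.
    members = [
        ("mongo-1.example.com", 27017),
        ("mongo-2.example.com", 27017),
        ("mongo-arb.example.com", 27018),
        ("mongo-q.example.com", 27018),
    ]
    for address, reachable in check_members(members).items():
        print(address, "reachable" if reachable else "NOT reachable")
```

This only tests TCP reachability, not authentication or replication. If the new member's own entry shows as not reachable from inside the container, the forwarding rules may not cover traffic that originates behind the NAT.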

Comment by Eric Sedor [ 26/Mar/19 ]

Thanks for your report; we are investigating. In case it helps, can you please provide your replica set config? Omitting IPs is okay.

Comment by Andrey Kostin [ 23/Mar/19 ]

So, if I want to use mongod over NAT and make it accessible through port 28017, then I have to set port 28017 in mongod.conf. Simple port forwarding is not enough; these ports must be equal. Why is that? I want to bind mongod instances to port 27017 in all containers and make these instances accessible through different ports over the internet.
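
One way to reason about this question is through how a member recognizes its own entry in the replica set config. The sketch below is a simplified model, not the server's actual code. It assumes a node first looks for an entry that resolves to one of its local interfaces and names the port it listens on, and otherwise connects to the advertised address and asks whether the process behind it is itself (the `_isSelf` command visible in the logs). The function `find_self`, the `resolve` and `probe_is_self` callbacks, and the 203.0.113.x addresses are all hypothetical.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

HostPort = Tuple[str, int]


def find_self(
    members: Iterable[HostPort],
    local_addrs: Set[str],
    listen_port: int,
    resolve: Callable[[str], Set[str]],
    probe_is_self: Callable[[str, int], bool],
) -> Optional[HostPort]:
    """Return the config entry that refers to this node, or None if none does."""
    for host, port in members:
        # Fast path (assumed): the entry names the port we listen on and
        # resolves to an address bound to one of our own interfaces.
        if port == listen_port and resolve(host) & local_addrs:
            return (host, port)
        # Slow path (assumed): connect to the advertised address and ask
        # whether the process answering there is this one.
        if probe_is_self(host, port):
            return (host, port)
    return None


# Member list from the ticket's replica set config.
MEMBERS = [
    ("mongo-1.example.com", 27017),
    ("mongo-2.example.com", 27017),
    ("mongo-arb.example.com", 27018),
    ("mongo-q.example.com", 27018),
]

# Hypothetical documentation-range addresses; the ticket redacts the real ones.
DNS = {
    "mongo-1.example.com": {"203.0.113.1"},
    "mongo-2.example.com": {"203.0.113.2"},
    "mongo-arb.example.com": {"203.0.113.3"},
    "mongo-q.example.com": {"203.0.113.4"},
}


def resolve(host: str) -> Set[str]:
    """Look up a host name in the hypothetical DNS table."""
    return DNS.get(host, set())


# Container view: bound to 10.100.0.101:27017, own forwarded address unreachable.
unreachable = find_self(MEMBERS, {"10.100.0.101"}, 27017, resolve,
                        probe_is_self=lambda host, port: False)
```

In this model the container listens on 27017 at 10.100.0.101 while the config advertises mongo-q.example.com:27018, so the first check cannot match. The outcome then depends on whether the container can reach its own forwarded address, which would explain why making the two ports equal appears to help.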
