|
Hi Jeff,
sorry it took me a a while to summarize this.
The scenario is:
- replica set with PRIMARY, 1 SECONDARY, 1 ARBITER
- Mongo Java Driver 3.10.2 (we're running with Java 8)
These are my MongoClientOptions, using a socketTimeout, as suggested:
MongoClientOptions options = MongoClientOptions.builder()
|
.writeConcern(WriteConcern.MAJORITY)
|
.readPreference(ReadPreference.primary())
|
.readConcern(ReadConcern.MAJORITY)
|
.requiredReplicaSetName("appng")
|
.socketTimeout(5000)
|
.build();
|
When starting the cluster, everything is fine, rs.status() gives me 3 members:
rs.status();
|
{
|
"set" : "appng",
|
...
|
"members" : [
|
{
|
"_id" : 0,
|
"name" : "localhost:8001",
|
"health" : 1,
|
"state" : 1,
|
"stateStr" : "PRIMARY",
|
...
|
"electionTime" : Timestamp(1558424157, 1),
|
"electionDate" : ISODate("2019-05-21T07:35:57Z"),
|
...
|
},
|
{
|
"_id" : 1,
|
"name" : "localhost:8002",
|
"health" : 1,
|
"state" : 2,
|
"stateStr" : "SECONDARY",
|
...
|
"syncingTo" : "localhost:8001",
|
"syncSourceHost" : "localhost:8001",
|
...
|
},
|
{
|
"_id" : 2,
|
"name" : "localhost:8003",
|
"health" : 1,
|
"state" : 7,
|
"stateStr" : "ARBITER",
|
...
|
}
|
],
|
"ok" : 1
|
}
|
Now, when starting my application, everything is fine, the configuration is detected correctly.
2019-05-21 09:54:09,100 INFO ["localhost-startStop-1"] org.mongodb.driver.cluster: No server chosen by com.mongodb.client.internal.MongoClientDelegate$1@3e25dcd6 from cluster description ClusterDescription{type=REPLICA_SET, connectionMode=MULTIPLE, serverDescriptions=[ServerDescription{address=localhost:8002, type=UNKNOWN, state=CONNECTING}, ServerDescription{address=localhost:8001, type=UNKNOWN, state=CONNECTING}]}. Waiting for 30000 ms before timing out
|
2019-05-21 09:54:09,107 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8002"] org.mongodb.driver.connection: Opened connection [connectionId{localValue:1, serverValue:7}] to localhost:8002
|
2019-05-21 09:54:09,107 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8001"] org.mongodb.driver.connection: Opened connection [connectionId{localValue:2, serverValue:15}] to localhost:8001
|
2019-05-21 09:54:09,119 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8002"] org.mongodb.driver.cluster: Monitor thread successfully connected to server with description ServerDescription{address=localhost:8002, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 12]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=6747329, setName='appng', canonicalAddress=localhost:8002, hosts=[localhost:8002, localhost:8001], passives=[], arbiters=[localhost:8003], primary='localhost:8001', tagSet=TagSet{[]}, electionId=null, setVersion=3, lastWriteDate=Tue May 21 09:54:09 CEST 2019, lastUpdateTimeNanos=37785965920298}
|
2019-05-21 09:54:09,121 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8002"] org.mongodb.driver.cluster: Adding discovered server localhost:8003 to client view of cluster
|
2019-05-21 09:54:09,121 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8001"] org.mongodb.driver.cluster: Monitor thread successfully connected to server with description ServerDescription{address=localhost:8001, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 12]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=9617219, setName='appng', canonicalAddress=localhost:8001, hosts=[localhost:8002, localhost:8001], passives=[], arbiters=[localhost:8003], primary='localhost:8001', tagSet=TagSet{[]}, electionId=7fffffff000000000000001c, setVersion=3, lastWriteDate=Tue May 21 09:54:09 CEST 2019, lastUpdateTimeNanos=37785968337383}
|
2019-05-21 09:54:09,122 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8001"] org.mongodb.driver.cluster: Setting max election id to 7fffffff000000000000001c from replica set primary localhost:8001
|
2019-05-21 09:54:09,122 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8001"] org.mongodb.driver.cluster: Setting max set version to 3 from replica set primary localhost:8001
|
2019-05-21 09:54:09,122 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8001"] org.mongodb.driver.cluster: Discovered replica set primary localhost:8001
|
2019-05-21 09:54:09,125 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8003"] org.mongodb.driver.connection: Opened connection [connectionId{localValue:3, serverValue:15}] to localhost:8003
|
2019-05-21 09:54:09,126 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8003"] org.mongodb.driver.cluster: Monitor thread successfully connected to server with description ServerDescription{address=localhost:8003, type=REPLICA_SET_ARBITER, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 12]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=null, roundTripTimeNanos=1173638, setName='appng', canonicalAddress=localhost:8003, hosts=[localhost:8002, localhost:8001], passives=[], arbiters=[localhost:8003], primary='localhost:8001', tagSet=TagSet{[]}, electionId=null, setVersion=3, lastWriteDate=Tue May 21 09:53:59 CEST 2019, lastUpdateTimeNanos=37785973691794}
|
2019-05-21 09:54:09,134 INFO ["localhost-startStop-1"] org.mongodb.driver.connection: Opened connection [connectionId{localValue:4, serverValue:16}] to localhost:8001
|
Now, I do a regular shutdown of the master (localhost:8001) -> SECONDARY gets new PRIMARY 
Logs of SECONDARY:
2019-05-21T09:58:23.911+0200 I REPL [replexec-11] Starting an election, since we've seen no PRIMARY in the past 10000ms
|
2019-05-21T09:58:23.911+0200 I REPL [replexec-11] conducting a dry run election to see if we could be elected. current term: 28
|
2019-05-21T09:58:23.911+0200 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to localhost:8001
|
2019-05-21T09:58:23.912+0200 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to localhost:8001 - HostUnreachable: Connection refused
|
2019-05-21T09:58:23.912+0200 I REPL [replexec-8] VoteRequester(term 28 dry run) received a yes vote from localhost:8003; response message: { term: 28, voteGranted: true, reason: "", ok: 1.0 }
|
2019-05-21T09:58:23.912+0200 I ASIO [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to localhost:8001 due to failed operation on a connection
|
2019-05-21T09:58:23.913+0200 I REPL [replexec-8] dry election run succeeded, running for election in term 29
|
2019-05-21T09:58:23.962+0200 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to localhost:8001
|
2019-05-21T09:58:23.962+0200 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to localhost:8001 - HostUnreachable: Connection refused
|
2019-05-21T09:58:23.962+0200 I ASIO [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to localhost:8001 due to failed operation on a connection
|
2019-05-21T09:58:23.963+0200 I REPL [replexec-10] VoteRequester(term 29) failed to receive response from localhost:8001: HostUnreachable: Connection refused
|
2019-05-21T09:58:23.964+0200 I REPL [replexec-9] VoteRequester(term 29) received a yes vote from localhost:8003; response message: { term: 29, voteGranted: true, reason: "", ok: 1.0 }
|
2019-05-21T09:58:23.964+0200 I REPL [replexec-9] election succeeded, assuming primary role in term 29
|
2019-05-21T09:58:23.964+0200 I REPL [replexec-9] transition to PRIMARY from SECONDARY
|
2019-05-21T09:58:23.964+0200 I REPL [replexec-9] Resetting sync source to empty, which was :27017
|
2019-05-21T09:58:23.964+0200 I REPL [replexec-9] Entering primary catch-up mode.
|
rs.status() also confirms that everything is fine:
rs.status();
|
{
|
"set" : "appng",
|
...
|
"members" : [
|
{
|
"_id" : 0,
|
"name" : "localhost:8001",
|
"health" : 0,
|
"state" : 8,
|
"stateStr" : "(not reachable/healthy)",
|
...
|
},
|
{
|
"_id" : 1,
|
"name" : "localhost:8002",
|
"health" : 1,
|
"state" : 1,
|
"stateStr" : "PRIMARY",
|
...
|
"electionTime" : Timestamp(1558425503, 1),
|
"electionDate" : ISODate("2019-05-21T07:58:23Z"),
|
...
|
},
|
{
|
"_id" : 2,
|
"name" : "localhost:8003",
|
"health" : 1,
|
"state" : 7,
|
"stateStr" : "ARBITER",
|
...
|
}
|
],
|
"ok" : 1
|
}
|
When I'm now accessing my application, the following happens:
2019-05-21 09:58:19,182 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8003"] org.mongodb.driver.cluster: Monitor thread successfully connected to server with description ServerDescription{address=localhost:8003, type=REPLICA_SET_ARBITER, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 12]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=null, roundTripTimeNanos=1025366, setName='appng', canonicalAddress=localhost:8003, hosts=[localhost:8002, localhost:8001], passives=[], arbiters=[localhost:8003], primary='null', tagSet=TagSet{[]}, electionId=null, setVersion=3, lastWriteDate=Tue May 21 09:58:09 CEST 2019, lastUpdateTimeNanos=38036029203176}
|
2019-05-21 09:58:19,186 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8002"] org.mongodb.driver.cluster: Monitor thread successfully connected to server with description ServerDescription{address=localhost:8002, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 12]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=1106506, setName='appng', canonicalAddress=localhost:8002, hosts=[localhost:8002, localhost:8001], passives=[], arbiters=[localhost:8003], primary='null', tagSet=TagSet{[]}, electionId=null, setVersion=3, lastWriteDate=Tue May 21 09:58:09 CEST 2019, lastUpdateTimeNanos=38036033318648}
|
2019-05-21 09:58:19,189 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8001"] org.mongodb.driver.cluster: Exception in monitor thread while connecting to server localhost:8001
|
com.mongodb.MongoSocketOpenException: Exception opening socket
|
at com.mongodb.internal.connection.SocketStream.open(SocketStream.java:70)
|
at com.mongodb.internal.connection.InternalStreamConnection.open(InternalStreamConnection.java:126)
|
at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:131)
|
at java.lang.Thread.run(Thread.java:748)
|
Caused by: java.net.ConnectException: Connection refused (Connection refused)
|
at java.net.PlainSocketImpl.socketConnect(Native Method)
|
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
|
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
|
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
|
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
|
at java.net.Socket.connect(Socket.java:589)
|
at com.mongodb.internal.connection.SocketStreamHelper.initialize(SocketStreamHelper.java:64)
|
at com.mongodb.internal.connection.SocketStream.initializeSocket(SocketStream.java:79)
|
at com.mongodb.internal.connection.SocketStream.open(SocketStream.java:65)
|
... 3 more
|
2019-05-21 09:58:29,184 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8003"] org.mongodb.driver.cluster: Monitor thread successfully connected to server with description ServerDescription{address=localhost:8003, type=REPLICA_SET_ARBITER, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 12]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=null, roundTripTimeNanos=1021370, setName='appng', canonicalAddress=localhost:8003, hosts=[localhost:8002, localhost:8001], passives=[], arbiters=[localhost:8003], primary='localhost:8002', tagSet=TagSet{[]}, electionId=null, setVersion=3, lastWriteDate=Tue May 21 09:58:09 CEST 2019, lastUpdateTimeNanos=38046031501403}
|
2019-05-21 09:58:29,188 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8002"] org.mongodb.driver.cluster: Monitor thread successfully connected to server with description ServerDescription{address=localhost:8002, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 12]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=1068662, setName='appng', canonicalAddress=localhost:8002, hosts=[localhost:8002, localhost:8001], passives=[], arbiters=[localhost:8003], primary='localhost:8002', tagSet=TagSet{[]}, electionId=7fffffff000000000000001d, setVersion=3, lastWriteDate=Tue May 21 09:58:25 CEST 2019, lastUpdateTimeNanos=38046035124501}
|
2019-05-21 09:58:29,188 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8002"] org.mongodb.driver.cluster: Setting max election id to 7fffffff000000000000001d from replica set primary localhost:8002
|
2019-05-21 09:58:29,188 INFO ["cluster-ClusterId{value='5ce3aea150c1517f2d165529', description='null'}-localhost:8002"] org.mongodb.driver.cluster: Discovered replica set primary localhost:8002
|
2019-05-21 10:03:35,324 INFO ["http-nio-8080-exec-6"] org.mongodb.driver.connection: Closed connection [connectionId{localValue:15, serverValue:8}] to localhost:8002 because there was a socket exception raised by this connection.
|
So the driver correctly is able to detect that localhost:8002 became the new PRIMARY.
But now the weird thing is, although the new primary was eleceted, subsequent calls do throw an exception:
2019-05-21 10:09:56,355 INFO ["http-nio-8080-exec-5"] org.mongodb.driver.connection: Opened connection [connectionId{localValue:85, serverValue:14}] to localhost:8002
|
2019-05-21 10:10:01,363 INFO ["http-nio-8080-exec-5"] org.mongodb.driver.connection: Closed connection [connectionId{localValue:85, serverValue:14}] to localhost:8002 because there was a socket exception raised by this connection.
|
...
|
com.mongodb.MongoSocketReadTimeoutException: Timeout while receiving message
|
at com.mongodb.internal.connection.InternalStreamConnection.translateReadException(InternalStreamConnection.java:553)
|
at com.mongodb.internal.connection.InternalStreamConnection.receiveMessage(InternalStreamConnection.java:444)
|
at com.mongodb.internal.connection.InternalStreamConnection.receiveCommandMessageResponse(InternalStreamConnection.java:295)
|
at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:255)
|
at com.mongodb.internal.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:99)
|
at com.mongodb.internal.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:444)
|
at com.mongodb.internal.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:72)
|
at com.mongodb.internal.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:200)
|
at com.mongodb.internal.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:269)
|
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:131)
|
at com.mongodb.operation.MixedBulkWriteOperation.executeCommand(MixedBulkWriteOperation.java:433)
|
at com.mongodb.operation.MixedBulkWriteOperation.executeBulkWriteBatch(MixedBulkWriteOperation.java:259)
|
at com.mongodb.operation.MixedBulkWriteOperation.access$700(MixedBulkWriteOperation.java:70)
|
at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:203)
|
at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:194)
|
at com.mongodb.operation.OperationHelper.withReleasableConnection(OperationHelper.java:424)
|
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:194)
|
at com.mongodb.operation.BaseWriteOperation.execute(BaseWriteOperation.java:148)
|
at com.mongodb.operation.BaseWriteOperation.execute(BaseWriteOperation.java:52)
|
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:193)
|
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:166)
|
at com.mongodb.DBCollection.executeWriteOperation(DBCollection.java:347)
|
at com.mongodb.DBCollection.update(DBCollection.java:570)
|
at com.mongodb.DBCollection.update(DBCollection.java:493)
|
at com.mongodb.DBCollection.update(DBCollection.java:469)
|
at com.mongodb.DBCollection.update(DBCollection.java:447)
|
at com.mongodb.DBCollection.update(DBCollection.java:512)
|
at com.mongodb.DBCollection.update(DBCollection.java:527)
|
...(call from application)
|
Caused by: java.net.SocketTimeoutException: Read timed out
|
at java.net.SocketInputStream.socketRead0(Native Method)
|
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
|
at java.net.SocketInputStream.read(SocketInputStream.java:171)
|
at java.net.SocketInputStream.read(SocketInputStream.java:141)
|
at com.mongodb.internal.connection.SocketStream.read(SocketStream.java:109)
|
at com.mongodb.internal.connection.InternalStreamConnection.receiveResponseBuffers(InternalStreamConnection.java:570)
|
at com.mongodb.internal.connection.InternalStreamConnection.receiveMessage(InternalStreamConnection.java:441)
|
... 49 more
|
|
So I'm wondering why this exception is thrown, since the PRIMARY localhost:8002 is doing fine.
I hope this makes my problem comprehensible.
Looking forward for your reply and thank you in advance for your efforts.
Regards
Matthias
|