- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 2.4.9
- Component/s: Internal Client
- Environment: sharded cluster with 1 shard: hosta:30000, hostb:30001 and hostb:30002
-
Fully Compatible
-
ALL
- Issue reads with a secondary read preference through mongos.
- A secondary connection is pinned (hostb:30002).
- The secondary goes into a black hole (its packets are dropped).
- The next query tries to reuse the dead secondary, even though the replica set monitor has detected that the node is unreachable.
- The query hangs until the TCP timeout expires (15 minutes by default).
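The behavior described in the steps above can be illustrated with a minimal sketch of the check that appears to be missing: before reusing a pinned host, consult the monitor's health view and fall back to another secondary if the pinned one is marked down. This is not the DBClientRS code; `NodeState` and `select_secondary` are hypothetical names, and treating hosta:30000 as a healthy secondary is an assumption based on the `nodes[0].ok = true` entry in the log.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeState:
    """Hypothetical snapshot of what the replica set monitor knows about a node."""
    ok: bool          # monitor could reach the node on its last check
    secondary: bool   # node is eligible for secondary reads


def select_secondary(nodes: Dict[str, NodeState], pinned: Optional[str]) -> Optional[str]:
    """Reuse the pinned host only if the monitor still reports it as usable."""

    def usable(host: str) -> bool:
        state = nodes.get(host)
        return state is not None and state.ok and state.secondary

    # Validate the pinned host against the monitor before reusing it.
    if pinned is not None and usable(pinned):
        return pinned
    # Otherwise fall back to any other usable secondary.
    for host in nodes:
        if usable(host):
            return host
    return None


# Node health as reported in the sample log; roles are assumptions.
nodes = {
    "hosta:30000": NodeState(ok=True, secondary=True),    # assumed healthy secondary
    "hostb:30001": NodeState(ok=False, secondary=False),  # last known primary, unreachable
    "hostb:30002": NodeState(ok=False, secondary=True),   # pinned secondary, unreachable
}
print(select_secondary(nodes, pinned="hostb:30002"))
```

With this check in place, the pinned host hostb:30002 is skipped because the monitor reports it as down, and the query would go to hosta:30000 instead of waiting on a dead socket.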
Sample mongoS log:
Fri Jan 31 17:09:05.637 [conn5] dbclient_rs nodes[0].ok = true hosta:30000
Fri Jan 31 17:09:05.637 [conn5] dbclient_rs nodes[1].ok = false hostb:30001
Fri Jan 31 17:09:05.637 [conn5] dbclient_rs nodes[2].ok = false hostb:30002
Fri Jan 31 17:09:05.637 [conn5] trying reconnect to hostb:30001
Fri Jan 31 17:09:10.636 [conn5] reconnect hostb:30001 failed couldn't connect to server hostb:30001
Fri Jan 31 17:09:10.636 [conn5] ReplicaSetMonitor::_checkConnection: caught exception hostb:30001 socket exception [CONNECT_ERROR] for hostb:30001
Fri Jan 31 17:09:10.636 [conn5] trying reconnect to hostb:30002
Fri Jan 31 17:09:15.636 [conn5] reconnect hostb:30002 failed couldn't connect to server hostb:30002
Fri Jan 31 17:09:15.636 [conn5] ReplicaSetMonitor::_checkConnection: caught exception hostb:30002 socket exception [CONNECT_ERROR] for hostb:30002
Fri Jan 31 17:09:16.636 [conn5] warning: No primary detected for set shard01
Fri Jan 31 17:09:16.636 [conn5] User Assertion: 10009:ReplicaSetMonitor no master found for set: shard01
Fri Jan 31 17:09:16.636 [conn5] dbclient_rs say using secondary or tagged node selection in shard01, read pref is { pref: "secondary only", tags: [ {} ] } (primary : hostb:30001, lastTagged : hostb:30002)
Fri Jan 31 17:09:16.636 [conn5] dbclient_rs selecting compatible last used node hostb:30002
Fri Jan 31 17:09:16.637 [conn5] [pcursor] initialized query (lazily) on shard shard01:shard01/hosta:30000,hostb:30001,hostb:30002, current connection state is { state: { conn: "shard01/hosta:30000,hostb:30001,hostb:30002", vinfo: "shard01:shard01/hosta:30000,hostb:30001,hostb:30002", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
Fri Jan 31 17:09:16.637 [conn5] [pcursor] finishing over 1 shards
Fri Jan 31 17:09:16.637 [conn5] [pcursor] finishing on shard shard01:shard01/hosta:30000,hostb:30001,hostb:30002, current connection state is { state: { conn: "shard01/hosta:30000,hostb:30001,hostb:30002", vinfo: "shard01:shard01/hosta:30000,hostb:30001,hostb:30002", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
Fri Jan 31 17:24:47.816 [conn5] Socket recv() errno:110 Connection timed out 10.225.15.113:30002
- is related to: SERVER-13125 DBClientRS should check that pinned hosts still match read preference (Closed)