- Type: Improvement
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 2.6.4
- Component/s: Replication, Sharding
In sharded clusters with a read-only shard (imagine the main DC is offline), new read queries with read preference "secondary" first try to contact the primary in order to set the shard version. If the primary is down, a replica set refresh is triggered in getHostOrRefresh. The problem is that this refresh runs inline with the user query, so if the primary is in a TCP blackhole, user queries experience additional latency.
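For illustration, here is a minimal self-contained toy of that pattern; the function names mirror the real ones, but the bodies and the 1s timeout are purely illustrative, not the actual mongos source. On a cache miss, the refresh runs on the thread serving the query, so the query pays the full connect timeout for the unreachable primary:

#include <chrono>
#include <iostream>
#include <string>
#include <thread>

bool cacheHasMatch = false;  // pretend the cached view cannot satisfy the read pref

std::string refreshUntilMatches() {
    // Probing the blackholed primary blocks for the full connect timeout.
    std::this_thread::sleep_for(std::chrono::seconds(1));
    return "secondary:27019";    // the secondary itself answers quickly
}

std::string getHostOrRefresh() {
    if (cacheHasMatch)
        return "secondary:27019";  // fast path: answer from the cached view
    return refreshUntilMatches();  // slow path: refresh runs on THIS thread
}

int main() {
    auto t0 = std::chrono::steady_clock::now();
    std::string host = getHostOrRefresh();
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                  std::chrono::steady_clock::now() - t0).count();
    std::cout << "query routed to " << host << " after " << ms << " ms\n";
}

The 2s-then-1s pattern in the repro below is consistent with one or two such timeouts being paid inline on each read.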
Here is the mongoS log (logLevel 5), with additional logging added to getHostOrRefresh and a stack trace captured in getHostOrRefresh(primary_only):
2014-09-24T14:14:40.735-0400 [conn1] Request::process begin ns: test.c msg id: 95 op: 2004 attempt: 0
2014-09-24T14:14:40.736-0400 [conn1] query: test.c { query: {}, $readPreference: { mode: "secondary" } } ntoreturn: 0 options: 4
2014-09-24T14:14:40.736-0400 [conn1] pcursor creating pcursor over QSpec { ns: "test.c", n2skip: 0, n2return: 0, options: 4, query: { query: {}, $readPreference: { mode: "secondary" } }, fields: {} } and CInfo { v_ns: "", filter: {} }
2014-09-24T14:14:40.736-0400 [conn1] pcursor initializing over 1 shards required by [unsharded @ shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019]
2014-09-24T14:14:40.736-0400 [conn1] pcursor initializing on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: {}, retryNext: false, init: false, finish: false, errored: false }
2014-09-24T14:14:40.736-0400 [conn1] ReplicaSetMonitor::createIfNeeded shard01
2014-09-24T14:14:40.736-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.737-0400 [conn1] getHostOrRefresh: 4
2014-09-24T14:14:40.737-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.737-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.737-0400 [conn1] getHostOrRefresh: 0
2014-09-24T14:14:40.742-0400 [conn1] 0x100484bdb 0x100072c7c 0x100073000 0x100048e37 0x1003ca0d8 0x1003cce71 0x10039cb46 0x100026cd2 0x100028cca 0x1003a5c3e 0x1003a3522 0x1003a3738 0x1000552d8 0x10005a1db 0x10006471d 0x1003b2352 0x10039974a 0x100002c75 0x100448561 0x1004b9c45
 0 mongos 0x0000000100484bdb _ZN5mongo15printStackTraceERSo + 43
 1 mongos 0x0000000100072c7c _ZN5mongo17ReplicaSetMonitor16getHostOrRefreshERKNS_21ReadPreferenceSettingE + 244
 2 mongos 0x0000000100073000 _ZN5mongo17ReplicaSetMonitor18getMasterOrUassertEv + 170
 3 mongos 0x0000000100048e37 _ZN5mongo18DBClientReplicaSet11checkMasterEv + 93
 4 mongos 0x00000001003ca0d8 _ZN5mongo14getVersionableEPNS_12DBClientBaseE + 1480
 5 mongos 0x00000001003cce71 _ZN5mongo14VersionManager18initShardVersionCBEPNS_12DBClientBaseERNS_7BSONObjE + 49
 6 mongos 0x000000010039cb46 _ZN5mongo22ShardingConnectionHook8onCreateEPNS_12DBClientBaseE + 758
 7 mongos 0x0000000100026cd2 _ZN5mongo16DBConnectionPool13_finishCreateERKSsdPNS_12DBClientBaseE + 226
 8 mongos 0x0000000100028cca _ZN5mongo16DBConnectionPool3getERKSsd + 998
 9 mongos 0x00000001003a5c3e _ZN5mongo17ClientConnections3getERKSsS2_ + 100
10 mongos 0x00000001003a3522 _ZN5mongo15ShardConnection5_initEv + 126
11 mongos 0x00000001003a3738 _ZN5mongo15ShardConnectionC2ERKNS_5ShardERKSsN5boost10shared_ptrIKNS_12ChunkManagerEEE + 178
12 mongos 0x00000001000552d8 _ZN5mongo27ParallelSortClusteredCursor28setupVersionAndHandleSlaveOkEN5boost10shared_ptrINS_23ParallelConnectionStateEEERKNS_5ShardENS2_IS5_EERKNS_15NamespaceStringERKSsNS2_IKNS_12ChunkManagerEEE + 294
13 mongos 0x000000010005a1db _ZN5mongo27ParallelSortClusteredCursor9startInitEv + 6337
14 mongos 0x000000010006471d _ZN5mongo27ParallelSortClusteredCursor4initEv + 63
15 mongos 0x00000001003b2352 _ZN5mongo8Strategy7queryOpERNS_7RequestE + 2364
16 mongos 0x000000010039974a _ZN5mongo7Request7processEi + 1250
17 mongos 0x0000000100002c75 _ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE + 147
18 mongos 0x0000000100448561 _ZN5mongo17PortMessageServer17handleIncomingMsgEPv + 1681
19 mongos 0x00000001004b9c45 thread_proxy + 229
2014-09-24T14:14:40.742-0400 [conn1] Starting new refresh of replica set shard01
2014-09-24T14:14:40.742-0400 [conn1] creating new connection to:AD-MAC10G.local:27018
2014-09-24T14:14:40.743-0400 [ConnectBG] BackgroundJob starting: ConnectBG
2014-09-24T14:14:40.743-0400 [conn1] warning: Failed to connect to 10.4.100.44:27018, reason: errno:61 Connection refused
2014-09-24T14:14:40.743-0400 [conn1] User Assertion: 13328:connection pool: connect failed AD-MAC10G.local:27018 : couldn't connect to server AD-MAC10G.local:27018 (10.4.100.44), connection attempt failed
2014-09-24T14:14:40.743-0400 [conn1] polling for status of connection to 127.0.0.1:27019, no events
2014-09-24T14:14:40.744-0400 [conn1] warning: No primary detected for set shard01
2014-09-24T14:14:40.744-0400 [conn1] Updating host AD-MAC10G.local:27019 based on ismaster reply: { setName: "shard01", setVersion: 1, ismaster: false, secondary: true, hosts: [ "AD-MAC10G.local:27019", "AD-MAC10G.local:27018" ], me: "AD-MAC10G.local:27019", maxBsonObjectSize: 16777216, maxMessageSizeBytes: 48000000, maxWriteBatchSize: 1000, localTime: new Date(1411582480744), maxWireVersion: 2, minWireVersion: 0, ok: 1.0 }
2014-09-24T14:14:40.744-0400 [conn1] User Assertion: 10009:ReplicaSetMonitor no master found for set: shard01
2014-09-24T14:14:40.744-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.744-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.745-0400 [conn1] dbclient_rs say using secondary or tagged node selection in shard01, read pref is { pref: "secondary only", tags: [ {} ] } (primary : [not cached], lastTagged : [not cached])
2014-09-24T14:14:40.745-0400 [conn1] selectNodeUsingTags: before _getMonitor(). Last used node: :27017
2014-09-24T14:14:40.745-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.745-0400 [conn1] getHostOrRefresh: 2
2014-09-24T14:14:40.745-0400 [conn1] creating new connection to:AD-MAC10G.local:27019
2014-09-24T14:14:40.746-0400 [ConnectBG] BackgroundJob starting: ConnectBG
2014-09-24T14:14:40.746-0400 [conn1] connected to server AD-MAC10G.local:27019 (10.4.100.44)
2014-09-24T14:14:40.746-0400 [conn1] connected connection!
2014-09-24T14:14:40.746-0400 [conn1] dbclient_rs selecting node AD-MAC10G.local:27019
2014-09-24T14:14:40.746-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.746-0400 [conn1] pcursor initialized query (lazily) on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
2014-09-24T14:14:40.747-0400 [conn1] pcursor finishing over 1 shards
2014-09-24T14:14:40.747-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.747-0400 [conn1] pcursor finishing on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
2014-09-24T14:14:40.747-0400 [conn1] pcursor finished on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "(done)", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: { _id: ObjectId('5422075ce9c51acb5562df36') }, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
2014-09-24T14:14:40.747-0400 [conn1] Request::process end ns: test.c msg id: 95 op: 2004 attempt: 0 11ms
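What the stack trace shows: even though the user's read preference is "secondary", constructing the pooled ShardConnection fires ShardingConnectionHook::onCreate, which sets the shard version (initShardVersionCB -> getVersionable -> checkMaster) and therefore asks the ReplicaSetMonitor for the primary. A compact, self-contained toy of that chain (all names and bodies are simplified for illustration, not the real source):

#include <iostream>
#include <stdexcept>
#include <string>

struct NoPrimaryError : std::runtime_error {
    NoPrimaryError() : std::runtime_error("no master found for set shard01") {}
};

std::string getHostOrRefresh(bool primaryOnly) {
    if (primaryOnly)
        throw NoPrimaryError();      // primary is down: the inline refresh finds none
    return "AD-MAC10G.local:27019";  // the surviving secondary
}

void shardingConnectionHookOnCreate() {
    // initShardVersionCB -> getVersionable -> checkMaster -> getMasterOrUassert:
    // versioning the new connection requires the primary, regardless of read pref.
    getHostOrRefresh(/*primaryOnly=*/true);
}

int main() {
    try {
        shardingConnectionHookOnCreate();  // fires when the pooled ShardConnection is created
    } catch (const NoPrimaryError& e) {
        std::cout << "onCreate failed: " << e.what() << "\n";
        std::cout << "query then falls back to " << getHostOrRefresh(false) << "\n";
    }
}

The user's query only reaches the secondary after this primary-only lookup has failed, which is where the extra latency is spent.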
The easiest way to observe the issue is on Windows (I used Win2008R2), because there a local connection attempt to a port with no listener takes about 1s to fail (a minimal probe for this is sketched after the steps):
- Create a sharded cluster:
  - 1 mongos
  - 1 shard: a 2-node replica set
  - 1 config server
- Insert some test data through mongoS and check that it is readable:
  db.c.insert({})
  rs.slaveOk()
  db.c.find()
- Shut down one node in the shard to make it read-only (with only one of two members up, the survivor cannot elect itself primary).
- Try to fetch the data again through mongoS:
  set verbose true
  rs.slaveOk()
  db.c.find()
- Observe that the first read takes ~2s and every subsequent read takes ~1s.
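To check the platform's connect-failure latency independently of MongoDB, a probe along these lines can be used (plain POSIX sockets; on Windows the same calls go through Winsock after WSAStartup, and it is that failing connect which takes about a second):

#include <arpa/inet.h>
#include <chrono>
#include <iostream>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Times a TCP connect() to a local port nothing is listening on. On Linux /
// OS X this fails almost instantly with ECONNREFUSED; on Windows the failure
// takes ~1s, which is what makes the detour to the downed primary easy to see.
int main() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(27018);  // the shut-down shard member's port
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    auto t0 = std::chrono::steady_clock::now();
    int rc = connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                  std::chrono::steady_clock::now() - t0).count();
    std::cout << "connect rc=" << rc << " after " << ms << " ms\n";
    close(fd);
}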
Issue links:
- is duplicated by SERVER-17667: Slow performance against Secondary when Primary is down (with auth) (Closed)
- is related to SERVER-16733: mongos does not fail when different configdb string is used (Closed)
- related to SERVER-18671: SecondaryPreferred can end up using unversioned connections (Closed)