Details
- Type: Improvement
- Status: Closed
- Priority: Major - P3
- Resolution: Done
- 2.6.4
- None
Description
In sharded clusters with a read-only shard (imagine that the main DC is offline), new read queries with read preference "secondary" first try to reach the primary in order to set the shard version. If the primary is down, a replica set refresh is triggered in the function getHostOrRefresh. The problem is that the refresh runs inline with the user query, so if the primary is in a TCP blackhole, user queries experience additional latency.
Here is the mongoS log (logLevel 5) with additional logging in getHostOrRefresh and a stack trace for getHostOrRefresh(primary_only):
2014-09-24T14:14:40.735-0400 [conn1] Request::process begin ns: test.c msg id: 95 op: 2004 attempt: 0
2014-09-24T14:14:40.736-0400 [conn1] query: test.c { query: {}, $readPreference: { mode: "secondary" } } ntoreturn: 0 options: 4
2014-09-24T14:14:40.736-0400 [conn1] pcursor creating pcursor over QSpec { ns: "test.c", n2skip: 0, n2return: 0, options: 4, query: { query: {}, $readPreference: { mode: "secondary" } }, fields: {} } and CInfo { v_ns: "", filter: {} }
2014-09-24T14:14:40.736-0400 [conn1] pcursor initializing over 1 shards required by [unsharded @ shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019]
2014-09-24T14:14:40.736-0400 [conn1] pcursor initializing on shard shard01q:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: {}, retryNext: false, init: false, finish: false, errored: false }
2014-09-24T14:14:40.736-0400 [conn1] ReplicaSetMonitor::createIfNeeded shard01
2014-09-24T14:14:40.736-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.737-0400 [conn1] getHostOrRefresh: 4
2014-09-24T14:14:40.737-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.737-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.737-0400 [conn1] getHostOrRefresh: 0
2014-09-24T14:14:40.742-0400 [conn1] 0x100484bdb 0x100072c7c 0x100073000 0x100048e37 0x1003ca0d8 0x1003cce71 0x10039cb46 0x100026cd2 0x100028cca 0x1003a5c3e 0x1003a3522 0x1003a3738 0x1000552d8 0x10005a1db 0x10006471d 0x1003b2352 0x10039974a 0x100002c75 0x100448561 0x1004b9c45
0 mongos 0x0000000100484bdb _ZN5mongo15printStackTraceERSo + 43
1 mongos 0x0000000100072c7c _ZN5mongo17ReplicaSetMonitor16getHostOrRefreshERKNS_21ReadPreferenceSettingE + 244
2 mongos 0x0000000100073000 _ZN5mongo17ReplicaSetMonitor18getMasterOrUassertEv + 170
3 mongos 0x0000000100048e37 _ZN5mongo18DBClientReplicaSet11checkMasterEv + 93
4 mongos 0x00000001003ca0d8 _ZN5mongo14getVersionableEPNS_12DBClientBaseE + 1480
5 mongos 0x00000001003cce71 _ZN5mongo14VersionManager18initShardVersionCBEPNS_12DBClientBaseERNS_7BSONObjE + 49
6 mongos 0x000000010039cb46 _ZN5mongo22ShardingConnectionHook8onCreateEPNS_12DBClientBaseE + 758
7 mongos 0x0000000100026cd2 _ZN5mongo16DBConnectionPool13_finishCreateERKSsdPNS_12DBClientBaseE + 226
8 mongos 0x0000000100028cca _ZN5mongo16DBConnectionPool3getERKSsd + 998
9 mongos 0x00000001003a5c3e _ZN5mongo17ClientConnections3getERKSsS2_ + 100
10 mongos 0x00000001003a3522 _ZN5mongo15ShardConnection5_initEv + 126
11 mongos 0x00000001003a3738 _ZN5mongo15ShardConnectionC2ERKNS_5ShardERKSsN5boost10shared_ptrIKNS_12ChunkManagerEEE + 178
12 mongos 0x00000001000552d8 _ZN5mongo27ParallelSortClusteredCursor28setupVersionAndHandleSlaveOkEN5boost10shared_ptrINS_23ParallelConnectionStateEEERKNS_5ShardENS2_IS5_EERKNS_15NamespaceStringERKSsNS2_IKNS_12ChunkManagerEEE + 294
13 mongos 0x000000010005a1db _ZN5mongo27ParallelSortClusteredCursor9startInitEv + 6337
14 mongos 0x000000010006471d _ZN5mongo27ParallelSortClusteredCursor4initEv + 63
15 mongos 0x00000001003b2352 _ZN5mongo8Strategy7queryOpERNS_7RequestE + 2364
16 mongos 0x000000010039974a _ZN5mongo7Request7processEi + 1250
17 mongos 0x0000000100002c75 _ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE + 147
18 mongos 0x0000000100448561 _ZN5mongo17PortMessageServer17handleIncomingMsgEPv + 1681
19 mongos 0x00000001004b9c45 thread_proxy + 229
2014-09-24T14:14:40.742-0400 [conn1] Starting new refresh of replica set shard01
2014-09-24T14:14:40.742-0400 [conn1] creating new connection to:AD-MAC10G.local:27018
2014-09-24T14:14:40.743-0400 [ConnectBG] BackgroundJob starting: ConnectBG
2014-09-24T14:14:40.743-0400 [conn1] warning: Failed to connect to 10.4.100.44:27018, reason: errno:61 Connection refused
2014-09-24T14:14:40.743-0400 [conn1] User Assertion: 13328:connection pool: connect failed AD-MAC10G.local:27018 : couldn't connect to server AD-MAC10G.local:27018 (10.4.100.44), connection attempt failed
2014-09-24T14:14:40.743-0400 [conn1] polling for status of connection to 127.0.0.1:27019, no events
2014-09-24T14:14:40.744-0400 [conn1] warning: No primary detected for set shard01
2014-09-24T14:14:40.744-0400 [conn1] Updating host AD-MAC10G.local:27019 based on ismaster reply: { setName: "shard01", setVersion: 1, ismaster: false, secondary: true, hosts: [ "AD-MAC10G.local:27019", "AD-MAC10G.local:27018" ], me: "AD-MAC10G.local:27019", maxBsonObjectSize: 16777216, maxMessageSizeBytes: 48000000, maxWriteBatchSize: 1000, localTime: new Date(1411582480744), maxWireVersion: 2, minWireVersion: 0, ok: 1.0 }
2014-09-24T14:14:40.744-0400 [conn1] User Assertion: 10009:ReplicaSetMonitor no master found for set: shard01
2014-09-24T14:14:40.744-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.744-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.745-0400 [conn1] dbclient_rs say using secondary or tagged node selection in shard01, read pref is { pref: "secondary only", tags: [ {} ] } (primary : [not cached], lastTagged : [not cached])
2014-09-24T14:14:40.745-0400 [conn1] selectNodeUsingTags: before _getMonitor(). Last used node: :27017
2014-09-24T14:14:40.745-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.745-0400 [conn1] getHostOrRefresh: 2
2014-09-24T14:14:40.745-0400 [conn1] creating new connection to:AD-MAC10G.local:27019
2014-09-24T14:14:40.746-0400 [ConnectBG] BackgroundJob starting: ConnectBG
2014-09-24T14:14:40.746-0400 [conn1] connected to server AD-MAC10G.local:27019 (10.4.100.44)
2014-09-24T14:14:40.746-0400 [conn1] connected connection!
2014-09-24T14:14:40.746-0400 [conn1] dbclient_rs selecting node AD-MAC10G.local:27019
2014-09-24T14:14:40.746-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.746-0400 [conn1] pcursor initialized query (lazily) on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
2014-09-24T14:14:40.747-0400 [conn1] pcursor finishing over 1 shards
2014-09-24T14:14:40.747-0400 [conn1] ReplicaSetMonitor::get shard01
2014-09-24T14:14:40.747-0400 [conn1] pcursor finishing on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
2014-09-24T14:14:40.747-0400 [conn1] pcursor finished on shard shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019, current connection state is { state: { conn: "(done)", vinfo: "shard01:shard01/AD-MAC10G.local:27018,AD-MAC10G.local:27019", cursor: { _id: ObjectId('5422075ce9c51acb5562df36') }, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
2014-09-24T14:14:40.747-0400 [conn1] Request::process end ns: test.c msg id: 95 op: 2004 attempt: 0 11ms
|
The easiest way to observe the issue is on Windows (I used Win2008R2), because it has a 1s latency for local connections to a non-listened port:
- Create a sharded cluster:
  - 1 mongos
  - 1 shard (a 2-node replica set)
  - 1 config server
- Insert some test data via mongoS and check that it is accessible:
  db.c.insert({})
  rs.slaveOk()
  db.c.find()
- Shut down one node in the shard to make it read-only.
- Try to fetch the data again from mongoS:
  set verbose true
  rs.slaveOk()
  db.c.find()
Observe that it takes 2s for the first attempt to read the data and 1s for each subsequent attempt.
Attachments
Issue Links
- is duplicated by SERVER-17667: Slow performance against Secondary when Primary is down (with auth) (Closed)
- is related to SERVER-16733: mongos does not fail when different configdb string is used (Closed)
- related to SERVER-18671: SecondaryPreferred can end up using unversioned connections (Closed)