[SERVER-34065] MongoS read throughput regression between 3.4 and 3.6 Created: 22/Mar/18  Updated: 07/Jan/21  Resolved: 07/Jan/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.6.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Kaloian Manassiev
Resolution: Won't Do Votes: 1
Labels: sharding-causes-bfs-hard
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: read_only_perf_test.js
Operating System: ALL
Linked BF Score: 0

 Description   

Using a single shard and a single mongos shows a ~17% decrease in read throughput and a ~20% increase in read latency between 3.4.13 and 3.6.3. The attached read_only_perf_test.js script contains a minimal repro of this regression, but it is also visible in the YCSB 100%-reads benchmark.
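
For context, a minimal sketch of the kind of single-shard, single-mongos setup under test, using the shell's ShardingTest helper (hypothetical; the attached script may differ):

// Hypothetical setup; read_only_perf_test.js may differ in details.
var st = new ShardingTest({shards: 1, mongos: 1});
var testDB = st.s.getDB("test");  // st.s is the mongos connection

// Seed a small collection so that the read workload has data to return.
for (var i = 0; i < 1000; i++) {
    testDB.foo.insert({_id: i, payload: "x"});
}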

3.4.13:
 20 sec: 783760 operations; 39908.2 current ops/sec; [READ AverageLatency(us)=998.61]
 30 sec: 1177532 operations; 39373.26 current ops/sec; [READ AverageLatency(us)=1012.16]
 40 sec: 1577481 operations; 39994.9 current ops/sec; [READ AverageLatency(us)=996.42]
 50 sec: 1979463 operations; 40194.18 current ops/sec; [READ AverageLatency(us)=991.48]
 60 sec: 2372148 operations; 39268.5 current ops/sec; [READ AverageLatency(us)=1014.99]
 70 sec: 2768010 operations; 39582.24 current ops/sec; [READ AverageLatency(us)=1006.98]
 80 sec: 3141892 operations; 37388.2 current ops/sec; [READ AverageLatency(us)=1066.3]
 
3.6.3:
 20 sec: 646764 operations; 33782.6 current ops/sec; [READ AverageLatency(us)=1180.32]
 30 sec: 981056 operations; 33425.86 current ops/sec; [READ AverageLatency(us)=1192.9]
 40 sec: 1307640 operations; 32658.4 current ops/sec; [READ AverageLatency(us)=1221.14]
 50 sec: 1643276 operations; 33560.24 current ops/sec; [READ AverageLatency(us)=1188.12]
 60 sec: 1977689 operations; 33441.3 current ops/sec; [READ AverageLatency(us)=1192.55]
 70 sec: 2299412 operations; 32169.08 current ops/sec; [READ AverageLatency(us)=1239.63]
 80 sec: 2625286 operations; 32587.4 current ops/sec; [READ AverageLatency(us)=1223.75]

If only mongos is downgraded to 3.4.13, with the shard still at 3.6.3, performance returns to normal, so the problem must be in mongos.

The same problem is visible between 3.4.13 and master (at the time of 3.7.3):

3.4:
{
        "note" : "values per second",
        "errCount" : NumberLong(0),
        "trapped" : "error: not implemented",
        "findOneLatencyAverageMicros" : 768.6047960983534,
        "totalOps" : NumberLong(1541503),
        "totalOps/s" : 51348.335034725926,
        "findOne" : 51348.335034725926,
        "insert" : 0,
        "delete" : 0,
        "update" : 0,
        "query" : 0,
        "command" : 0
}
 
master:
{
        "note" : "values per second",
        "errCount" : NumberLong(0),
        "trapped" : "error: not implemented",
        "findOneLatencyAverageMicros" : 987.8736065215004,
        "totalOps" : NumberLong(1192609),
        "totalOps/s" : 39725.913914800716,
        "findOne" : 39725.913914800716,
        "insert" : 0,
        "delete" : 0,
        "update" : 0,
        "query" : 0,
        "command" : 0
}
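
The 3.4-vs-master numbers above appear to be in the output format of the shell's benchRun() helper; a minimal sketch of a read-only workload that produces output of this shape (the namespace, thread count, and duration are illustrative):

// Hypothetical benchRun invocation; the actual test parameters may differ.
var res = benchRun({
    host: db.getMongo().host,  // point this at the mongos
    ops: [{op: "findOne", ns: "test.foo", query: {_id: {"#RAND_INT": [0, 1000]}}}],
    parallel: 16,              // concurrent client threads
    seconds: 30
});
printjson(res);  // includes totalOps/s and findOneLatencyAverageMicros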



 Comments   
Comment by Kaloian Manassiev [ 07/Jan/21 ]

Given the time that has passed since 3.4/3.6, this ticket is now more a matter of improving the performance of MongoS than of fixing the specific bug which caused a regression. Since we are laying out a set of projects to improve MongoS performance, I am closing this ticket as Won't Do.

Comment by Sheeri Cabral (Inactive) [ 30/Jan/20 ]

ratika.gandhi, we are adding this to quick wins; we can prioritize it at the next sync.

Comment by Charlie Swanson [ 25/Jan/19 ]

My above theory about compression protocols did not pan out.

This is actually fascinating. I spoke with esha.maharishi a bit to generate some hypotheses, and we found that 3.5.2 performed much worse than the tip of 3.6.

After discovering that, I was able to do a manual git bisect to pin the regression down to this commit. Running locally, I see about 70,000 ops/second before that commit; after it, throughput drops to about 17,000 ops/second.

I don't really understand what's going on in that commit or how it could produce this regression, so I'm sending this over to Kal for a look.

Here's a spreadsheet I generated during my git bisect that might help.

As you can see there, we did eventually recover much of the performance after that commit, so I'm not sure if this was something that later got reverted or what.

Comment by Charlie Swanson [ 18/Jan/19 ]

This is actually interesting. I don't see any indication that the cursor establishment logic has changed between 3.4 and 3.6: on both versions the first command comes back with a non-empty "firstBatch" and a cursor id of 0. However, on 3.6 I do see this when I set the networking log level to 3:

2019-01-18T11:37:19.262-0500 D ASIO     [conn2] startCommand: RemoteCommand 374 -- target:localhost:27019 db:test cmd:{ find: "foo", filter: { _id: { $lte: 10.0 } }, limit: 1, shardVersion: [ Timestamp(3, 3), ObjectId('5c4200a845088e9e8f0f2f7e') ] }
2019-01-18T11:37:19.262-0500 D ASIO     [conn2] startCommand: RemoteCommand 376 -- target:localhost:27020 db:test cmd:{ find: "foo", filter: { _id: { $lte: 10.0 } }, limit: 1, shardVersion: [ Timestamp(3, 5), ObjectId('5c4200a845088e9e8f0f2f7e') ] }
2019-01-18T11:37:19.262-0500 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Connecting to localhost:27019
2019-01-18T11:37:19.263-0500 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Connecting to localhost:27020
2019-01-18T11:37:19.263-0500 D ASIO     [conn2] startCommand: RemoteCommand 378 -- target:localhost:27021 db:test cmd:{ find: "foo", filter: { _id: { $lte: 10.0 } }, limit: 1, shardVersion: [ Timestamp(3, 7), ObjectId('5c4200a845088e9e8f0f2f7e') ] }
2019-01-18T11:37:19.263-0500 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Connecting to localhost:27021
2019-01-18T11:37:19.263-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting client-side compression negotiation
2019-01-18T11:37:19.263-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Offering snappy compressor to server
2019-01-18T11:37:19.263-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting asynchronous command 375 on host localhost:27019
2019-01-18T11:37:19.263-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting client-side compression negotiation
2019-01-18T11:37:19.263-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Offering snappy compressor to server
2019-01-18T11:37:19.263-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting asynchronous command 377 on host localhost:27020
2019-01-18T11:37:19.263-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting client-side compression negotiation
2019-01-18T11:37:19.263-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Offering snappy compressor to server
2019-01-18T11:37:19.263-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting asynchronous command 379 on host localhost:27021
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Finishing client-side compression negotiation
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Received message compressors from server
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Adding compressor snappy
2019-01-18T11:37:19.264-0500 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Successfully connected to localhost:27020, took 1ms (1 connections now open to localhost:27020)
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Request 377 finished with response: {}
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Initiating asynchronous command: RemoteCommand 376 -- target:localhost:27020 db:test cmd:{ find: "foo", filter: { _id: { $lte: 10.0 } }, limit: 1, shardVersion: [ Timestamp(3, 5), ObjectId('5c4200a845088e9e8f0f2f7e') ] }
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Compressing message with snappy
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting asynchronous command 376 on host localhost:27020
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Finishing client-side compression negotiation
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Received message compressors from server
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Adding compressor snappy
2019-01-18T11:37:19.264-0500 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Successfully connected to localhost:27019, took 2ms (1 connections now open to localhost:27019)
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Request 375 finished with response: {}
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Initiating asynchronous command: RemoteCommand 374 -- target:localhost:27019 db:test cmd:{ find: "foo", filter: { _id: { $lte: 10.0 } }, limit: 1, shardVersion: [ Timestamp(3, 3), ObjectId('5c4200a845088e9e8f0f2f7e') ] }
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Compressing message with snappy
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting asynchronous command 374 on host localhost:27019
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Finishing client-side compression negotiation
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Received message compressors from server
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Adding compressor snappy
2019-01-18T11:37:19.264-0500 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Successfully connected to localhost:27021, took 1ms (1 connections now open to localhost:27021)
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Request 379 finished with response: {}
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Initiating asynchronous command: RemoteCommand 378 -- target:localhost:27021 db:test cmd:{ find: "foo", filter: { _id: { $lte: 10.0 } }, limit: 1, shardVersion: [ Timestamp(3, 7), ObjectId('5c4200a845088e9e8f0f2f7e') ] }
2019-01-18T11:37:19.264-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Compressing message with snappy
2019-01-18T11:37:19.264-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Starting asynchronous command 378 on host localhost:27021
2019-01-18T11:37:19.265-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Decompressing message with snappy
2019-01-18T11:37:19.265-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Request 376 finished with response: { cursor: { firstBatch: [ { _id: 4.0 } ], id: 0, ns: "test.foo" }, ok: 1.0, operationTime: Timestamp(1547829436, 2), $gleStats: { lastOpTime: Timestamp(0, 0), electionId: ObjectId('7fffffff0000000000000001') }, $configServerState: { opTime: { ts: Timestamp(1547829438, 1), t: 1 } }, $clusterTime: { clusterTime: Timestamp(1547829438, 1), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } } }
2019-01-18T11:37:19.265-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Decompressing message with snappy
2019-01-18T11:37:19.265-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Request 374 finished with response: { cursor: { firstBatch: [], id: 0, ns: "test.foo" }, ok: 1.0, operationTime: Timestamp(1547829426, 12), $gleStats: { lastOpTime: Timestamp(0, 0), electionId: ObjectId('7fffffff0000000000000001') }, $configServerState: { opTime: { ts: Timestamp(1547829438, 1), t: 1 } }, $clusterTime: { clusterTime: Timestamp(1547829438, 1), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } } }
2019-01-18T11:37:19.265-0500 D NETWORK  [NetworkInterfaceASIO-TaskExecutorPool-9-0] Decompressing message with snappy
2019-01-18T11:37:19.265-0500 D ASIO     [NetworkInterfaceASIO-TaskExecutorPool-9-0] Request 378 finished with response: { cursor: { firstBatch: [ { _id: 5.0 } ], id: 0, ns: "test.foo" }, ok: 1.0, operationTime: Timestamp(1547829420, 1), $gleStats: { lastOpTime: Timestamp(0, 0), electionId: ObjectId('7fffffff0000000000000001') }, $configServerState: { opTime: { ts: Timestamp(1547829438, 1), t: 1 } }, $clusterTime: { clusterTime: Timestamp(1547829438, 1), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } } }
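
(For reference, the networking verbosity used to capture the trace above can be set from the shell with the standard setLogLevel API:)

// On the mongos, raise the network component's verbosity to capture
// connection/compression negotiation messages like the ones above.
db.setLogLevel(3, "network");
// ... run the find ...
db.setLogLevel(0, "network");  // restore the default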

It looks like perhaps the negotiation of the compression protocol is causing an extra round trip? As an experiment, I'll see if I can disable that and show a performance improvement.

Comment by Kaloian Manassiev [ 24/Sep/18 ]

Given that this is in the find path, and given Esha's explanation, I think it is most appropriate for this to be handled by the query team.

david.storch, let me know if you disagree or if you need some other support from the sharding team.

Comment by Esha Maharishi (Inactive) [ 21/Sep/18 ]

It would need more investigation, but what I was thinking is that, as of 3.6, executing a find on a router has two phases: first establishing cursors on the targeted shards, and then fetching and merging results from those cursors.

So, returning results for a find might require two round trips to the shard. I don't remember:

1) whether the establish cursors phase uses a batchSize of 0 or not

2) if not, whether the results from the establish cursors phase are put into the ClusterClientCursor's stash so that they are returned immediately... or alternatively whether the ARM (AsyncResultsMerger) returns results immediately if some remotes have already returned results

If the regression is only on a sharded cluster, it's probably something to do with the changes to the find path in 3.6.
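
A shell-level illustration of the batchSize question in point 1) above: a find issued with batchSize: 0 establishes a cursor without returning any documents, so a getMore (a second round trip) is needed before the first result arrives. The collection name is illustrative.

// First round trip: establish a cursor; batchSize: 0 yields an empty first batch.
var res = db.runCommand({find: "foo", filter: {}, batchSize: 0});
assert.eq(0, res.cursor.firstBatch.length);

// Second round trip: a getMore is required to fetch the first actual results.
var more = db.runCommand({getMore: res.cursor.id, collection: "foo"});
printjson(more.cursor.nextBatch);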

Comment by Gregory McKeon (Inactive) [ 21/Sep/18 ]

esha.maharishi, kaloian.manassiev, can you clarify your comment here? Do we have a server ticket for the actual fix for the regression?

Comment by Kaloian Manassiev [ 13/Apr/18 ]

According to Esha, this might be caused by findOne doing an extra round trip.
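
For reference (an illustrative sketch, not taken from the ticket): the shell's findOne() maps to a single-batch find command, so any extra hop in cursor establishment would be paid on every such read.

// Approximately what db.foo.findOne({_id: 42}) sends at the command level:
db.runCommand({find: "foo", filter: {_id: 42}, limit: 1, singleBatch: true});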
