[SERVER-20596] Performance regression in new mongos query path (about 17% worse than 3.0) Created: 24/Sep/15  Updated: 06/Dec/22  Resolved: 24/Oct/19

Status: Closed
Project: Core Server
Component/s: Performance, Querying, Sharding
Affects Version/s: 3.1.9
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Rui Zhang (Inactive) Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Incomplete Votes: 2
Labels: sys-perf-reg
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File BD-1193-a.png    
Issue Links:
Depends
depends on SERVER-21436 Build a TaskExecutor that uses the Ne... Closed
Related
related to SERVER-20763 Performance regression in find comman... Closed
related to SERVER-20194 Enable new cursor manager path in mon... Closed
related to SERVER-21441 Optimize shard targeting for equality... Closed
is related to SERVER-20853 Eliminate copies of result batch in f... Closed
is related to SERVER-20944 Contention on ThreadPoolTaskExecutor:... Closed
Assigned Teams:
Sharding
Operating System: ALL
Sprint: QuInt B (11/02/15), QuInt C (11/23/15)
Participants:
Case:
Linked BF Score: 0

 Description   

During the 3.1 development cycle, the implementation of find and getMore on mongos was rewritten under SERVER-15176. This new mongos query path plugs into the TaskExecutor event loop framework. It issues find, getMore, and killCursors commands to the shards using the new asynchronous networking layer. Our automated industry benchmarks performance loop shows that the new query path is about 17% slower than its 3.0 predecessor. This ticket tracks the work to achieve parity with 3.0 on the industry benchmark workload, which could involve optimizations to the asynchronous networking code, the task executor layer, or the new query path itself.

Original Description

A read performance regression was found in the system-perf EVG test. It was introduced by the following change:
https://github.com/mongodb/mongo/commit/b1982bb7fb610def9b23ab08b0317e6f409c1784

This commit enables the new mongos query path for legacy reads, so the results below show the performance of the new path.

The test is a single findOne workload run with benchRun against a single sharded cluster.
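
The ticket does not include the actual benchRun invocation. As a rough sketch only, a findOne run at the two thread counts reported below might look like the following in the mongo shell. The namespace `test.bench`, the `_id` query, the 10-second duration, and the `localhost:27017` mongos address are all placeholders, not values from this ticket.

```javascript
// Hypothetical reconstruction: the ticket does not show the real benchRun call.
// Namespace, query, duration, and host are placeholders.
function buildFindOneBench(threads) {
    return {
        ops: [{ op: "findOne", ns: "test.bench", query: { _id: 1 } }],
        parallel: threads,       // number of concurrent client threads
        seconds: 10,             // assumed run length
        host: "localhost:27017"  // assumed mongos address
    };
}

// One configuration per thread count listed in the result tables.
var configs = [32, 64].map(buildFindOneBench);

// benchRun only exists in the mongo shell, so skip the run elsewhere.
if (typeof benchRun === "function") {
    configs.forEach(function (cfg) {
        printjson(benchRun(cfg));
    });
}
```

Running the same configurations against builds of both commits is what makes the throughput figures comparable.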

Results for b1982b:

mongos> db.serverBuildInfo().gitVersion
b1982bb7fb610def9b23ab08b0317e6f409c1784
....
 
All Results:
+--------------------------------+----------+--------------+----------+------------------------------+
| Test                           | Thread   | Throughput   | Pass?    | Comment                      |
|--------------------------------+----------+--------------+----------+------------------------------|
| "findOne"                      |       32 | 13236.845747 | true     | ""                           |
| "findOne"                      |       64 | 12156.284979 | true     | ""                           |
+--------------------------------+----------+--------------+----------+------------------------------+

Results for 32e55, the parent SHA:

All Results:
+--------------------------------+----------+--------------+----------+------------------------------+
| Test                           | Thread   | Throughput   | Pass?    | Comment                      |
|--------------------------------+----------+--------------+----------+------------------------------|
| "findOne"                      |       32 | 35064.731049 | true     | ""                           |
| "findOne"                      |       64 | 35064.437971 | true     | ""                           |
+--------------------------------+----------+--------------+----------+------------------------------+
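
To put the two tables side by side, the throughput drop can be computed directly from the reported numbers. The snippet below is only an illustrative calculation; the helper name `percentDrop` is made up here and is not part of benchRun or the test harness.

```javascript
// Throughput values copied from the two result tables above.
var results = [
    { threads: 32, parent: 35064.731049, regressed: 13236.845747 },
    { threads: 64, parent: 35064.437971, regressed: 12156.284979 }
];

// Percentage of the parent commit's throughput lost at b1982b.
function percentDrop(parent, regressed) {
    return (1 - regressed / parent) * 100;
}

var drops = results.map(function (r) {
    return { threads: r.threads, drop: percentDrop(r.parent, r.regressed) };
});

// print() exists in the mongo shell; fall back to console.log elsewhere.
var log = typeof print === "function" ? print : console.log;
drops.forEach(function (d) {
    log(d.threads + " threads: " + d.drop.toFixed(1) + "% lower throughput");
});
```

For this findOne test the drop comes out at roughly 62% with 32 threads and roughly 65% with 64 threads. The figure of about 17% in the ticket title refers to the industry benchmark workload after later improvements, not to this initial measurement.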



 Comments   
Comment by Sheeri Cabral (Inactive) [ 24/Oct/19 ]

Closing: tests showed it wasn't query path related, and the purported regression is from a years-old version.

Comment by David Storch [ 23/Nov/15 ]

We've made substantial performance improvements under the various linked tickets. However, we have not yet achieved parity with version 3.0 on the industry benchmarks workload. I have updated the title and description of this ticket to reflect our progress. Bumping this ticket into 3.3 Required, since the current perf is sufficient for the 3.2.0 release.

Comment by Githook User [ 09/Nov/15 ]

Author: Andy Schwerin (andy10gen) <schwerin@mongodb.com>

Message: SERVER-20596 Remove unhelpful timer that shows up in mongos profiles.
Branch: master
https://github.com/mongodb/mongo/commit/7fc05bad65a96cba3a08eb79d1a7060199985e11

Comment by Githook User [ 05/Nov/15 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-20596 don't access ClusterCursorManager for single-batch queries
Branch: master
https://github.com/mongodb/mongo/commit/5edafdbf6ca1effcb18d62c8e53b37544afecfcc

Comment by Githook User [ 22/Oct/15 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-20596 do less work inside the ClusterCursorManager lock
Branch: master
https://github.com/mongodb/mongo/commit/4a53cc3e6acb4d4486a6f126151ed5eb2d189a86

Generated at Thu Feb 08 03:54:41 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.