[SERVER-32115] Low mongos performance with adaptive Service Executor Created: 29/Nov/17  Updated: 06/Dec/22  Resolved: 08/Jul/19

Status: Closed
Project: Core Server
Component/s: Networking
Affects Version/s: 3.6.0-rc4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Josef Ahmad Assignee: Backlog - Service Architecture
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File SERVER-32115.tar.gz     Java Source File load.java     File make-sh.js    
Assigned Teams:
Service Arch
Operating System: ALL
Steps To Reproduce:
  • Set up a two-shard cluster.
  • Start mongos with --setParameter ShardingTaskExecutorPoolMaxSize=100 --setParameter taskExecutorPoolSize=4 (to avoid hitting the hard ulimit on open files).
  • Pre-split the sharded collection with the attached make-sh.js and wait for balancing to complete.
  • $ java -cp load:mongo-java-driver/mongo-java-driver/build/libs/mongo-java-driver-3.5.0.jar load host=172.31.43.41 op=ramp_query docs=1000000 threads=500 tps=10 qps=1000 dur=600 size=100 updates=10
Participants:

 Description   

Reproduced on version r3.6.0-rc4-41-ge608b8b349.

On a two-shard cluster, I generated a 100% query workload using the attached load.java (a modified version of load.java from SERVER-30613).

Configurations tested:
A. The whole sharded cluster running with the synchronous (default) service executor.
B. The whole sharded cluster running with the asynchronous (adaptive) service executor.
C. The sharded cluster running with the asynchronous (adaptive) service executor, except for mongos, which ran in synchronous mode.
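
As a rough sketch of how the three configurations map onto the mongos command line: the script below only prints the commands, it does not start anything. It assumes the executor is selected with the --serviceExecutor startup option (synchronous or adaptive) available in 3.6; the config server replica set name and address are hypothetical placeholders, not values from this ticket.

```shell
#!/bin/sh
# Hypothetical config server replica set and address; substitute your own.
CONFIGDB="cfgRS/localhost:27019"

# Pool parameters quoted from the reproduction steps.
POOL_PARAMS="--setParameter ShardingTaskExecutorPoolMaxSize=100 --setParameter taskExecutorPoolSize=4"

# Print (not run) the mongos command line for a given service executor mode.
mongos_cmd() {
    echo "mongos --configdb $CONFIGDB $POOL_PARAMS --serviceExecutor $1"
}

# Map each tested configuration to the executor used by mongos.
mongos_executor() {
    case "$1" in
        A|C) echo synchronous ;;
        B)   echo adaptive ;;
        *)   echo "unknown configuration: $1" >&2; return 1 ;;
    esac
}

for cfg in A B C; do
    echo "$cfg: $(mongos_cmd "$(mongos_executor "$cfg")")"
done
```

Under the same assumption, the shard and config server mongod processes would presumably take --serviceExecutor adaptive in configurations B and C and the default in A.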

Performance of A and C is comparable, whereas B achieves only 73% of the query throughput of A.

For the test I used four AWS EC2 m4.4xlarge instances: one for the traffic generator, one for mongos and a 1-member config server, and one for each shard (1-member replica set).

Logs and diagnostic data for the three configurations are attached.


Generated at Thu Feb 08 04:29:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.