[SERVER-32700] Performance regression with latest release (r3.6.2) compared to (r3.4.9) Created: 15/Jan/18  Updated: 06/Dec/22  Resolved: 24/Jun/21

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 3.6.2
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Praveen Arkeri
Assignee: Backlog - Triage Team
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams: Server Triage
Operating System: ALL
Steps To Reproduce:

Followed the YCSB benchmark user guide (Running-YCSB).

Benchmark Configuration:

  • Client and server run on same machine
  • Server configured to use tmpfs (in RAM) for all data storage, avoiding disk usage
  • Client thread count = 96
  • Record Count = 5000000
Participants:

 Description   

I use the YCSB benchmark to evaluate MongoDB performance on a local machine, and I am seeing a performance regression with the latest releases (r3.6.x) compared to r3.4.9.

For the YCSB core workloads I am seeing a ~5% drop in performance on Broadwell and Centriq platforms.

I would like to know whether this is a known performance issue and whether any investigations are pending to improve performance in the latest release (r3.6.2).



 Comments   
Comment by Praveen Arkeri [ 08/Feb/18 ]

@Kelsey,

Thanks for your reply. I will follow the tickets listed above for updates on the performance improvements.

-Praveen

Comment by Kelsey Schubert [ 08/Feb/18 ]

Hi Arkeri,

Sorry for the delay getting back to you. We're actively working on a number of performance improvements that we expect to arrive in 3.6 soon. Given these known issues, we'd like to focus on getting these improvements into 3.6 where we can test their impact on the workload you describe and determine whether this issue has been resolved.

Please feel free to review or watch these tickets which track some of the work that is being done to improve performance:

WT-3816
WT-3768
WT-3767
WT-3854
WT-3805
WT-3766

Some of these tickets have already been backported and will be available in the next release, MongoDB 3.6.3.

Kind regards,
Kelsey

Comment by Praveen Arkeri [ 05/Feb/18 ]

Is there any update on the investigation into the performance regression with the latest release (r3.6.2)?

Also, using the test below from mongo-perf, I see similar performance degradation on Broadwell and Centriq platforms across various thread counts.

python benchrun.py -f testcases/simple_insert.js -t 1 2 4 8 16 24 32 48 64 96 128 -s /path/to/mongodb-r3.6.2/bin/mongo

Comment by Praveen Arkeri [ 22/Jan/18 ]

Benchmark config used:
Starting Server:

mount -t tmpfs -o rw,size=20000m tmpfs /mnt
mkdir /mnt/data
$MONGO_HOME/bin/mongod --dbpath=/mnt/data/

YCSB run commands:
$YCSB_HOME/bin/ycsb load mongodb -P workloads/workloada -p recordcount=5000000 -threads 96 -s;
$YCSB_HOME/bin/ycsb run mongodb -P workloads/workloada -p operationcount=5000000 -threads 96 -s;
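
To compare two such runs, the overall throughput that YCSB prints at the end of a run can be extracted and turned into a relative change. The following is a minimal sketch, assuming the usual `[OVERALL], Throughput(ops/sec), N` summary line; the function names and the sample outputs are hypothetical and are not measurements from this ticket.

```python
import re


def overall_throughput(ycsb_output: str) -> float:
    """Return the ops/sec value from the '[OVERALL], Throughput(ops/sec), N' line."""
    match = re.search(
        r"^\[OVERALL\],\s*Throughput\(ops/sec\),\s*([0-9.eE+-]+)",
        ycsb_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no [OVERALL] throughput line found")
    return float(match.group(1))


def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent (negative = slower)."""
    return (candidate - baseline) / baseline * 100.0


# Hypothetical summaries, for illustration only (not measured results).
sample_r349 = "[OVERALL], RunTime(ms), 50000\n[OVERALL], Throughput(ops/sec), 100000.0\n"
sample_r362 = "[OVERALL], RunTime(ms), 52632\n[OVERALL], Throughput(ops/sec), 95000.0\n"

change = percent_change(overall_throughput(sample_r349), overall_throughput(sample_r362))
print(f"throughput change: {change:+.1f}%")
```

Saving the output of each `ycsb run` to a file and passing its contents to `overall_throughput` would give one number per release to compare.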

Comment by Praveen Arkeri [ 22/Jan/18 ]

I have uploaded diagnostic.data in an archive named 'centriq2400_broadwell2699_ycsb_mongodb_diagnostic_data.zip'. Let me know if you can access it.

Comment by Mark Agarunov [ 18/Jan/18 ]

Hello Arkeri,

Thank you for the response. The diagnostic.data directory does not contain any user data. It periodically collects the output of the following commands, which you are welcome to execute yourself to examine the output.

serverStatus: db.serverStatus({tcmalloc: true})
replSetGetStatus: rs.status()
collStats for local.oplog.rs: db.getSiblingDB('local').oplog.rs.stats()
getCmdLineOpts: db.adminCommand({getCmdLineOpts: true})
buildInfo: db.adminCommand({buildInfo: true})
hostInfo: db.adminCommand({hostInfo: true})
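
For anyone who prefers a driver over the shell, the same commands can be expressed as command documents and run in a loop. This is only a sketch: the `DIAGNOSTIC_COMMANDS` table and the `collect` helper are hypothetical names, and the `run_command` callable is left to the caller so the snippet does not assume any particular driver.

```python
from typing import Any, Callable, Dict

# (label, database, command document) for each command in the list above.
DIAGNOSTIC_COMMANDS = [
    ("serverStatus", "admin", {"serverStatus": 1, "tcmalloc": True}),
    ("replSetGetStatus", "admin", {"replSetGetStatus": 1}),
    ("collStats for local.oplog.rs", "local", {"collStats": "oplog.rs"}),
    ("getCmdLineOpts", "admin", {"getCmdLineOpts": 1}),
    ("buildInfo", "admin", {"buildInfo": 1}),
    ("hostInfo", "admin", {"hostInfo": 1}),
]


def collect(run_command: Callable[[str, Dict[str, Any]], Any]) -> Dict[str, Any]:
    """Run every diagnostic command and return the results keyed by label."""
    results: Dict[str, Any] = {}
    for label, database, command in DIAGNOSTIC_COMMANDS:
        try:
            results[label] = run_command(database, command)
        except Exception as exc:
            # For example, replSetGetStatus fails on a standalone mongod.
            results[label] = {"error": str(exc)}
    return results
```

With PyMongo, for instance, `run_command` could be `lambda db, cmd: client[db].command(cmd)` for an existing `MongoClient` named `client`; that wiring is an assumption and not something described in this ticket.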

However, since you have expressed some concern, I've gone ahead and created a secure upload portal for you to use. Files uploaded to this portal are only visible to MongoDB employees and are routinely deleted after some time.

We use internal tools to visualize and interpret this data.

Thanks,
Mark

Comment by Praveen Arkeri [ 18/Jan/18 ]

Hi MarkA

A few questions about diagnostic.data:

  • It looks to me like the data is in a binary format. What kind of data is stored in it?
  • Which tool do you use to extract the required information?

Thanks
Praveen

Comment by Mark Agarunov [ 16/Jan/18 ]

Hello Arkeri,

Thank you for the report. To get some more insight into this issue, could you please provide the following:

  • Please archive (tar or zip) the $dbPath/diagnostic.data directory
  • The exact configuration you are using for ycsb and the command it is being run with.

This should provide some information to help diagnose this.

Thanks,
Mark

Comment by Praveen Arkeri [ 15/Jan/18 ]

Also, I see a ~35% increase in binary size, from ~650MB (r3.4.9) to ~950MB (r3.6.2), for both x86_64 and aarch64 when compiled with GCC-6.3.
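
As a quick check of the arithmetic, the relative increase implied by the two approximate sizes quoted above can be computed directly. Both sizes are rounded figures from the comment, so the result is only indicative; taken at face value they work out to roughly 46% rather than 35%.

```python
# Approximate binary sizes quoted in the comment above, in MB.
old_mb = 650  # r3.4.9
new_mb = 950  # r3.6.2

# Relative increase with respect to the older binary, in percent.
increase_pct = (new_mb - old_mb) / old_mb * 100
print(f"binary size increase: {increase_pct:.1f}%")
```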
