[SERVER-17411] 20% drop in Update perf for MMAPv1 vs 2.6 MMAP Created: 27/Feb/15  Updated: 06/Dec/22  Resolved: 14/Sep/18

Status: Closed
Project: Core Server
Component/s: MMAPv1
Affects Version/s: 3.0.0-rc11
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Alvin Richards (Inactive) Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix Votes: 1
Labels: 28qa, cap-ss
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File SERVER-17411-2.6.7-mmapv0-c1.tgz     File SERVER-17411-3.0.0-rc11-mmapv1-c1.tgz    
Issue Links:
Duplicate
is duplicated by SERVER-17341 update on single doc is slower than 2... Closed
is duplicated by SERVER-17342 20% drop in throughput on a contended... Closed
Related
related to SERVER-17342 20% drop in throughput on a contended... Closed
Assigned Teams:
Storage Execution
Operating System: ALL
Steps To Reproduce:

/home/ec2-user/mongodb-linux-x86_64-3.0.0-rc11/bin/mongod --dbpath /data2/db --logpath /data3/logs/db/simple_update_mms/server.log --fork --syncdelay 14400 --storageEngine=mmapv1 --bind_ip 127.0.0.1

python benchrun.py -f testcases/simple_update_mms.js -t 12 14 16 -l SERVER-17342-2-3.0.0-rc11-mmapv1-c1 --rhost "54.191.70.12" --rport 27017 -s /home/ec2-user/mongo-perf-shell/mongo --writeCmd true --trialCount 7 --trialTime 20 --testFilter 'update' -c 1 --dyno
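
The contents of testcases/simple_update_mms.js are not included in this ticket. As a rough, hypothetical sketch of the shape of a mongo-perf update testcase (standard tests.push()/pre/ops format; the test name, field names, and document counts below are placeholders, not taken from the actual file):

if (typeof(tests) != "object") { tests = []; }

// Hypothetical sketch only -- not the actual simple_update_mms.js.
tests.push({
    name: "Update.IncFieldExample",
    tags: ["update"],
    // pre: seed the collection once before the timed trials start
    pre: function (collection) {
        collection.drop();
        for (var i = 0; i < 4800; i++) {
            collection.insert({_id: i, x: 0});
        }
        collection.getDB().getLastError();
    },
    // ops: each benchrun thread issues this update repeatedly for trialTime seconds
    ops: [
        {op: "update",
         query: {_id: {"#RAND_INT": [0, 4800]}},
         update: {$inc: {x: 1}}}
    ]
});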

Description

Problem

This may be a more general case of SERVER-17342, but there appears to be a drop across all mongo-perf Update tests on the order of 10-20%. It occurs on both contended and uncontended updates, and with both large and small documents (illustrated below).
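
In mongo-perf terms, the only difference between the two patterns is the update's query: a contended test points every thread at the same document, while an uncontended test spreads threads across many distinct documents. A minimal sketch of the two op shapes (names and counts are illustrative, not taken from the actual testcases):

// Contended: every thread updates the same document, so the threads
// serialize on the write path for that document/collection.
var contendedOp = {
    op: "update",
    query: {_id: 0},
    update: {$inc: {x: 1}}
};

// Uncontended: each operation picks a random document out of many,
// so concurrent threads rarely touch the same document.
var uncontendedOp = {
    op: "update",
    query: {_id: {"#RAND_INT": [0, 4800]}},
    update: {$inc: {x: 1}}
};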



Comments
Comment by François Doray [ 06/Mar/15 ]

I generated flame graphs [1] of the execUpdate method of v2.6.7 and v3 rc9 while running the MultiUpdate.v1.Contended.Doc.Seq.Indexed benchmark (-t 4 --trialCount 7 --trialTime 20). You can click on a stack frame to see its full function name, and you can click the "Filter" link (bottom right of the page) for a more detailed view of the callees of a function.

To generate the flame graphs, call stack samples were captured at 250 Hz using an experimental feature of LTTng [2] (detailed documentation available soon). This makes it possible to approximate the total duration spent in each call stack over the complete execution of the benchmark. Then, to allow a meaningful comparison between the two versions, all durations were divided by the number of executions of the WriteBatchExecutor::execUpdate method. In the flame graphs I shared with you, off-CPU latency was not included (otherwise the wait on Lock::DBWrite in v2.6.7 and Lock::CollectionLock in v3 would have dominated the data). Also, a RAM disk was used to avoid I/O latency, which can be significant with an HDD.

I hope this data will help you solve this issue. I'm currently writing documentation so that you can generate these comparison flame graphs by yourself.

[1] http://fdoray.github.io/tracecompare/demo/mongo-SERVER-17411/?data=mongo
[2] https://github.com/fdoray/lttng-profile and https://github.com/fdoray/lttng-profile-modules
