[SERVER-16385] significant regression in $push performance in 2.6.5 compared to 2.4.3 Created: 02/Dec/14  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Performance, Write Ops
Affects Version/s: 2.6.5
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: John Greenall Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 1
Labels: query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File 243noPow2.svg     File 243usePow2.svg     File 265usePow2.svg     File mongoSubdocumentPushJira.py     File noPow265.svg     File testTimings.svg    
Issue Links:
Related
Assigned Teams:
Query Execution
Participants:

 Description   

I have been doing some benchmarking ahead of the migration of our live system from 2.4.3 to 2.6.5. The first item of concern that my testing has highlighted is what appears to be a significant regression in Mongo's performance when coping with multiple pushes of subdocuments.
My test scenario is as follows:
I have two identically configured servers in the same availability zone in AWS (of instance type m1.large). One is running 2.4.3 and the other is 2.6.5.
I have another larger box (hi1.4xlarge) in the same availability zone from which I execute the test first on one database, then the other.
The test involves making 100k updates with a push of a single small subdocument to the same collection and measuring the time it takes when pushing 20, 100, 1000, 5000, 10000 subdocs per doc. The collection has several indices including a unique index on a UUID field.
The test proceeds as follows:
empty the collection
generate appropriate number of UUIDs across which to spread 100k subdocs i.e. 100k / 20 = 5000 for the first test
make 100k updates (with upsert=True) as fast as possible and log time taken / database stats
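
The steps above can be sketched roughly as follows. This is a minimal illustration, not the attached script: the field names `uuid` and `subdocs`, the round-robin ordering of pushes across documents, and the helper names `plan_updates` and `run_benchmark` are all assumptions. It also assumes a collection object exposing `delete_many` and `update_one`, as in current PyMongo (drivers from the 2.4/2.6 era used `collection.update(..., upsert=True)` instead).

```python
import time
import uuid


def plan_updates(total_pushes, subdocs_per_doc):
    """Yield (filter, update) pairs that spread total_pushes across documents.

    Hypothetical layout: each document is keyed by a 'uuid' field and
    accumulates small subdocuments in a 'subdocs' array.
    """
    n_docs = total_pushes // subdocs_per_doc  # e.g. 100k / 20 = 5000
    ids = [str(uuid.uuid4()) for _ in range(n_docs)]
    for i in range(total_pushes):
        # Round-robin over the UUIDs (an assumption; the ticket does not
        # say in which order the documents are hit).
        doc_filter = {"uuid": ids[i % n_docs]}
        update = {"$push": {"subdocs": {"seq": i, "value": "x"}}}
        yield doc_filter, update


def run_benchmark(collection, total_pushes=100_000, subdocs_per_doc=20):
    """Empty the collection, issue the upserts, return elapsed seconds."""
    collection.delete_many({})  # step 1: empty the collection
    start = time.perf_counter()
    for doc_filter, update in plan_updates(total_pushes, subdocs_per_doc):
        # step 3: upsert so the first push for a UUID creates the document
        collection.update_one(doc_filter, update, upsert=True)
    return time.perf_counter() - start
```

With PyMongo this would be called with a real collection, for example one obtained from `MongoClient(...)` after creating the unique index with `collection.create_index("uuid", unique=True)`, and repeated for 20, 100, 1000, 5000 and 10000 subdocs per doc.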

From my scatter plot of results, it seems that the test runs fastest with 100 subdocs per doc (presumably because, with 20 subdocs per doc, the cost of updating the indices on the upsert of the larger number of documents dominates the time taken for the pushes). It is also clear that performance in 2.6.5 has improved at this end of the scale.
However, as the number of pushes per document is increased, the performance of 2.6.5 degrades much faster than that of 2.4.3. I tried both Mongo versions with and without usePowerOf2Sizes but found the effect of this parameter to be negligible (which was surprising, having read the documentation on this feature). Are there any other parameters that affect the allocation of space that could be negatively affecting the performance of 2.6.5 in this test?
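
For reference, toggling this setting between runs might look something like the sketch below. It assumes a PyMongo-style database object whose `command` method accepts the command name, a value, and extra fields as keyword arguments, and a server from the 2.2 to 2.6 era where `collMod` still accepts `usePowerOf2Sizes`. The helper name `set_power_of_2_sizes` is hypothetical.

```python
def set_power_of_2_sizes(db, collection_name, enabled):
    """Toggle usePowerOf2Sizes on a collection via the collMod command.

    Assumes an MMAPv1-era server; later server versions no longer
    support this flag.
    """
    return db.command("collMod", collection_name, usePowerOf2Sizes=enabled)
```

The setting only affects records allocated after the change, so a comparison would normally empty the collection first and then toggle the flag before each run.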

I attach a scatter plot of timings, server stats from 2.4.3 and 2.6.5 when pushing 10k subdocs per doc, and the Python code used for testing (minus all the data logging stuff).



 Comments   
Comment by Andrew Morrow (Inactive) [ 09/Dec/14 ]

I'm happy to hear that this isn't a showstopper for you, and if you find other performance regressions during your testing we of course would be interested in hearing your findings.

I'm going to put this ticket in the '2.9 desired' category, which means that we intend to address it in the next release after 2.8.

Comment by John Greenall [ 09/Dec/14 ]

Andrew thanks for the update. This is most helpful.

I don't think this issue is on its own a show-stopper for us. We currently do quite a lot of this kind of pushing of subdocuments, but we have reworked the pipeline for an upcoming release to do some aggregation of subdocuments before they get inserted into MongoDB, as we had already flagged this as a performance bottleneck. The rework of the pipeline depends on the $min / $max operators, and hence we have been hanging fire with pushing this release out until we were happy that 2.6 was stable.

I'll continue with benchmarking now that I'm happy you've been able to verify my findings, and as long as performance hasn't regressed elsewhere we should still be able to proceed with our migration to 2.6.

Certainly though the preallocation of space for subdocuments is an area where we'd like to either see a bit more built-in intelligence or else be given a bit more control. As stated, the powerOf2Sizes didn't seem to have much impact for me.

Comment by Andrew Morrow (Inactive) [ 09/Dec/14 ]

Hi John -

We are able to reproduce a performance regression here between 2.4 and 2.6. I also tested against 2.8-rc2, which exhibits the same performance regression. We have not yet determined the root cause of the performance regression, but it seems likely that it is related to the refactoring of the update subsystem undertaken in 2.6.

We will need to undertake further investigation to determine a root cause and understand how to fix it, as well as evaluate the risk of applying such a backport to the 2.6 release, which is now quite stable. It would be helpful if you could provide us with some information about how critical a fix to this issue is for you.

Thanks,
Andrew

Comment by John Greenall [ 09/Dec/14 ]

Any preliminary thoughts on this issue, guys? I have more testing I'd like to do, but I don't want to push on with it until I have some feedback on this, in case you're unable to recreate my numbers and it turns out there is something flawed in my test setup.
Best,
John

Generated at Thu Feb 08 03:40:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.