[SERVER-55535] Performance tests to exercise change streams optimizations Created: 25/Mar/21  Updated: 29/Oct/23  Resolved: 25/Oct/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.2.0, 5.0.5, 5.1.1

Type: Improvement Priority: Major - P3
Reporter: Justin Seyster Assignee: Justin Seyster
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-52283 Enable feature flag for Allow $change... Closed
is depended on by SERVER-60138 Improve the allocated memory for getM... Closed
Related
is related to SERVER-48694 Push down user-defined stages in a ch... Closed
is related to SERVER-56872 Add optimization function to apply $m... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.1, v5.0
Sprint: QE 2021-10-18, QE 2021-11-01
Participants:

 Comments   
Comment by Bernard Gorman [ 01/Oct/22 ]

Hi ywu@stripe.com,

The phrase "change streams are optimized" refers to the work done to resolve SERVER-48694 and SERVER-56872. Prior to 5.1, any $match or $project stages that the user added to the change stream aggregation were only applied at the very end of the pipeline, meaning that the stream had to read, process and transform every potentially relevant event before filtering out only the ones the user was actually interested in. This inefficiency was particularly pronounced in a sharded cluster, since the $match and $project stages had to run on mongoS; the stream would therefore send all events from the shards over the network to mongoS, and would only then discard the ones that the user did not want. An explanation of why this was necessary is provided in a comment on SERVER-48694.

Starting in MongoDB 5.1, if the change stream pipeline contains $match and $project stages, we will push those stages down to execute on the shards, so that only the subset of results that the user requested is returned to mongoS. Additionally, we examine the user's $match stage and attempt to rewrite as much of the filter as possible to apply directly to the oplog scan at the very beginning of the change stream pipeline; this allows us to filter out events before any processing or transformation is applied to them. For selective filters that can be rewritten into the oplog scan, this can result in a significant performance improvement relative to 5.0 and earlier.
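
As an illustration only (a minimal sketch, not taken from this ticket or its tests), the kind of user pipeline described above might look like the following in Python. The helper name build_change_stream_pipeline, the field fullDocument.status, the connection URI and the shop.orders namespace are all hypothetical. The filter on operationType is just an example of a selective predicate; whether a particular predicate is actually rewritten into the oplog scan is decided by the server.

```python
def build_change_stream_pipeline(operation_types, fields):
    """Build a user-defined change stream pipeline (hypothetical helper).

    The $match stage keeps only the requested operation types, and the
    $project stage keeps only the listed fields. An inclusion projection
    retains _id by default, so the resume token is preserved.
    """
    return [
        {"$match": {"operationType": {"$in": list(operation_types)}}},
        {"$project": {field: 1 for field in fields}},
    ]


# Example: only insert events, and only two fields from each event.
pipeline = build_change_stream_pipeline(
    ["insert"], ["operationType", "fullDocument.status"]
)

# Assuming PyMongo is installed and a replica set or sharded cluster is
# reachable (both are assumptions), the pipeline could be passed to watch():
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")  # hypothetical URI
#   with client["shop"]["orders"].watch(pipeline) as stream:
#       for event in stream:
#           print(event)
```

The client code is the same on 5.0 and 5.1; the difference described above lies in where the server evaluates these stages.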

Hope this helps to clarify the changes in 5.1!

Best regards,
Bernard

Comment by Yang Wu [ 30/Sep/22 ]

Hi, I see that the documentation mentions:

Starting in MongoDB 5.1, change streams are optimized, providing more efficient resource utilization and faster execution of some aggregation pipeline stages.

Could you share any insights into what the issues were before 5.1 and what the optimizations were? I only found this one, related to memory usage: https://jira.mongodb.org/browse/SERVER-36346. Are there other performance issues?

Thanks!

Comment by Githook User [ 12/Nov/21 ]

Author: Justin Seyster <justin.seyster@mongodb.com> (jseyster)

Message: SERVER-55535 Performance tests to exercise change streams optimizations

(cherry picked from commit 04c9b53b185df98de8e5dfda57420411e59e9cad)
Branch: v5.1
https://github.com/mongodb/mongo/commit/ce992d046993dd88d3993aa4c7311c8a6550d562

Comment by Githook User [ 04/Nov/21 ]

Author: Justin Seyster <justin.seyster@mongodb.com> (jseyster)

Message: SERVER-55535 Performance tests to exercise change streams optimizations

(cherry picked from commit 04c9b53b185df98de8e5dfda57420411e59e9cad)
Branch: v5.0
https://github.com/mongodb/mongo/commit/43a930d200a4e131d91294236949b3de2e114b85

Comment by Githook User [ 22/Oct/21 ]

Author: Justin Seyster <justin.seyster@mongodb.com> (jseyster)

Message: SERVER-55535 Performance tests to exercise change streams optimizations
Branch: master
https://github.com/mongodb/mongo/commit/04c9b53b185df98de8e5dfda57420411e59e9cad

Generated at Thu Feb 08 05:36:44 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.