<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 06:55:54 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-84755] Investigate throughput reduction for large number of documents</title>
                <link>https://jira.mongodb.org/browse/SERVER-84755</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;The &lt;a href=&quot;https://github.com/mongodb/mongo-perf/blob/master/testcases/simple_bigcollection.js&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;BigCollection benchmark&lt;/a&gt; in mongo-perf runs multiple tests in which the total data size of the collection stays the same while the number of documents increases and the document size decreases by the same factor. Both Scan and Filter queries show a throughput cliff for the collection with the largest number of documents (1638400). This holds for both the classic and SBE engines.&lt;/p&gt;

&lt;p&gt;Single-threaded throughput, in operations per second, measured on a VM during the investigation in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-80583&quot; title=&quot;Microbenchmarks - BigCollection - investigate regression in Filter benchmarks&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-80583&quot;&gt;&lt;del&gt;SERVER-80583&lt;/del&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;Number of documents&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;Document size (bytes)&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;Batch size&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;Classic&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;SBE&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;25&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;16777216&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.286&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.152&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;400&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1048576&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.781&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.505&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6400&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;65536&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.788&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.583&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;102400&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4096&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.248&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.145&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1638400&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;256&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;font color=&quot;#ff0000&quot;&gt;0.702&lt;/font&gt;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;font color=&quot;#ff0000&quot;&gt;0.799&lt;/font&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;400&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1048576&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.937&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.823&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6400&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;65536&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;16&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.846&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.825&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;102400&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4096&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;256&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.358&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.432&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1638400&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;256&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4096&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;font color=&quot;#ff0000&quot;&gt;0.745&lt;/font&gt;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;font color=&quot;#ff0000&quot;&gt;0.901&lt;/font&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;This seems to be partially due to WiredTiger and partially due to predicate evaluation (higher computational cost for a larger number of documents). Excerpt from the attached flame graphs for the SBE engine:&lt;/p&gt;

&lt;p&gt;PlanExecutorSBE::getNext: 1.89% vs. 44.73%&lt;/p&gt;

&lt;p&gt;FilterStage::getNext: 1.67% vs. 40.77%&lt;/p&gt;

&lt;p&gt;WiredTigerRecordStoreCursorBase::next: 0.93% vs. 23.84%&lt;/p&gt;

&lt;p&gt;sbe::vm::ByteCode::runPredicate: 0.24% vs. 7.46%&lt;/p&gt;</description>
                <environment></environment>
        <key id="2543112">SERVER-84755</key>
            <summary>Investigate throughput reduction for large number of documents</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="10038" iconUrl="https://jira.mongodb.org/images/icons/subtask.gif" description="">Backlog</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="backlog-query-optimization">Backlog - Query Optimization</assignee>
                                    <reporter username="milena.ivanova@mongodb.com">Milena Ivanova</reporter>
                        <labels>
                    </labels>
                <created>Thu, 11 Jan 2024 11:19:49 +0000</created>
                <updated>Tue, 23 Jan 2024 14:41:48 +0000</updated>
                                                                                                <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                        <attachments>
                            <attachment id="503550" name="Filter.BC.1638400.4096.sbe.svg" size="380533" author="milena.ivanova@mongodb.com" created="Tue, 16 Jan 2024 14:40:42 +0000"/>
                            <attachment id="503552" name="Filter.BC.6400.16.sbe.svg" size="333101" author="milena.ivanova@mongodb.com" created="Tue, 16 Jan 2024 14:41:13 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25126"><![CDATA[Query Optimization]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        3 weeks, 6 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-3223</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>anton.korshunov@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            3 weeks, 6 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-query-optimization</customfieldvalue>
            <customfieldvalue>milena.ivanova@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i37r2n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i2pgxg:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i37d7z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>