<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:02:27 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-43167] simplify update and delete replication oplog entry application</title>
                <link>https://jira.mongodb.org/browse/SERVER-43167</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Today, applyOperation_inlock() builds and executes an entire query plan to apply each replicated &apos;u&apos; update and &apos;d&apos; delete operation. This approach has a lot of overhead that could be avoided. I believe we can bypass the query code entirely, which would both simplify the code and improve performance.&lt;br/&gt;
For example, instead of calling deleteObjects(), the delete code could instead:&lt;br/&gt;
1. Find the RecordId of the document to be deleted by doing a point lookup for the _id field in the _id index (use IndexAccessMethod::findSingle()).&lt;br/&gt;
2. Use that RecordId in a call to Collection::deleteDocument().&lt;/p&gt;

&lt;p&gt;For update operations, a similar (but a bit more involved) method could be used.&lt;/p&gt;
</description>
                <environment></environment>
        <key id="917218">SERVER-43167</key>
            <summary>simplify update and delete replication oplog entry application</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="10038" iconUrl="https://jira.mongodb.org/images/icons/subtask.gif" description="">Backlog</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="backlog-server-repl">Backlog - Replication Team</assignee>
                                    <reporter username="milkie@mongodb.com">Eric Milkie</reporter>
                        <labels>
                    </labels>
                <created>Wed, 4 Sep 2019 21:05:00 +0000</created>
                <updated>Wed, 10 May 2023 21:57:57 +0000</updated>
                                                                            <component>Replication</component>
                                        <votes>1</votes>
                                    <watches>11</watches>
                                                                                                                <comments>
                            <comment id="3027772" author="judah.schvimer" created="Mon, 6 Apr 2020 17:21:05 +0000"  >&lt;p&gt;Putting on the backlog until we prioritize this area for performance optimization. We would expect to have to do profiling to see what the bottlenecks are.&lt;/p&gt;</comment>
                            <comment id="3019325" author="tess.avitabile" created="Tue, 31 Mar 2020 15:45:57 +0000"  >&lt;p&gt;It sounds good to do a POC for deletes and check&#160;&lt;a href=&quot;https://evergreen.mongodb.com/task/sys_perf_linux_3_node_replSet_crud_workloads_majority_4bb2ad4c48c07d267c98f5443e0984a5e1ef7209_20_01_27_14_06_08&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this workload&lt;/a&gt;. Although we might need a long-running test to see the performance benefit, I don&apos;t want to invest in that now. If we see a large benefit, we can proceed with the work for deletes. If not, I wouldn&apos;t want to proceed with the work. I expect the change to make the code more complex, since the query layer is a useful abstraction. And if this isn&apos;t low-hanging fruit, then I agree with Judah that we should wait until we want to improve oplog application performance in general, and then profile the entire path.&lt;/p&gt;</comment>
                            <comment id="3013923" author="judah.schvimer" created="Mon, 30 Mar 2020 21:08:44 +0000"  >&lt;p&gt;Yes. I mean the POC. Per offline discussion, I am proposing that we do this POC when we want to spend time optimizing the oplog application path, and that we profile the entire path to see if the CRUD op&apos;s use of the query system is the best area to spend time optimizing. That said, having deletes skip the query system seems small and straightforward enough that we could just do that without extra profiling.&lt;/p&gt;</comment>
                            <comment id="3013834" author="milkie" created="Mon, 30 Mar 2020 20:32:22 +0000"  >&lt;p&gt;By &quot;this work&quot; do you mean &quot;do a POC and run a profiler to measure the performance gain for delete&quot; as Siyuan suggested for next steps?  I don&apos;t see how you could demonstrate that a workload shows this to be a bottleneck without doing the POC.&lt;/p&gt;</comment>
                            <comment id="3013793" author="judah.schvimer" created="Mon, 30 Mar 2020 20:20:31 +0000"  >&lt;p&gt;Do we think this will also simplify the code? If not, I think we should only schedule this work if we have a workload where this is shown to be the bottleneck. &lt;/p&gt;</comment>
                            <comment id="3013770" author="siyuan.zhou@10gen.com" created="Mon, 30 Mar 2020 20:11:34 +0000"  >&lt;p&gt;Talked with &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=milkie&quot; class=&quot;user-hover&quot; rel=&quot;milkie&quot;&gt;milkie&lt;/a&gt;. Eric did a POC in 3.2 and noticed the overhead of code complexity even though the id hack is used, but he doesn&apos;t have the patch anymore. We don&apos;t have existing perf results either. We should do a POC and probably run a profiler to measure the performance gain for delete.&lt;/p&gt;

&lt;p&gt;Some caveats:&lt;br/&gt;
1. Collections with autoIndexId: false cannot use the _id index and will have to do a table scan. We need to investigate whether this option is still supported and consider banning it in replica sets.&lt;br/&gt;
2. Extra memory allocation caused by overhead in the query layer could be another source of performance degradation in the long run. The recent improvement to key string memory allocation is an example. This can only be measured in a long-running test.&lt;/p&gt;</comment>
                            <comment id="2767915" author="tess.avitabile" created="Tue, 28 Jan 2020 14:24:51 +0000"  >&lt;p&gt;Sounds good, thank you.&lt;/p&gt;

&lt;p&gt;I think it would be a good idea to just start with deletes and look for a performance change. I&apos;d recommend running a perf patch build and checking&#160;&lt;a href=&quot;https://evergreen.mongodb.com/task/sys_perf_linux_3_node_replSet_crud_workloads_majority_4bb2ad4c48c07d267c98f5443e0984a5e1ef7209_20_01_27_14_06_08&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this workload&lt;/a&gt;, which runs CRUD operations with w:majority against a 3-node replica set. If we see a performance benefit, we can proceed with this work.&lt;/p&gt;

&lt;p&gt;I&apos;m not sure we should make this change for updates at all. For $-modifier updates, we need the &lt;tt&gt;UpdateStage&lt;/tt&gt; machinery. For replacement-style updates, we could go directly to the storage engine, but then we would miss the validation done by &lt;tt&gt;UpdateStage&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;I&apos;m also not totally sure we will see a performance benefit. Since this is an _id query, the query system skips forming a canonical query and constructs an id-hack plan. So it would be valuable to do a POC for delete and check for a performance change.&lt;/p&gt;</comment>
                            <comment id="2766892" author="siyuan.zhou@10gen.com" created="Mon, 27 Jan 2020 22:36:49 +0000"  >&lt;p&gt;Agreed with Judah. I don&apos;t have particular concerns as long as one of us is on the code review.&lt;/p&gt;</comment>
                            <comment id="2766567" author="judah.schvimer" created="Mon, 27 Jan 2020 20:37:40 +0000"  >&lt;p&gt;Historically, upsert logic, interactions with the &lt;tt&gt;applyOps&lt;/tt&gt; command, and vectored inserts have been difficult to get right. This may also intersect with &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-21700&quot; title=&quot;Do not relax constraints during steady state replication&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-21700&quot;&gt;&lt;del&gt;SERVER-21700&lt;/del&gt;&lt;/a&gt;, so we should be careful of that.&lt;/p&gt;</comment>
                            <comment id="2766527" author="tess.avitabile" created="Mon, 27 Jan 2020 20:22:52 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=judah.schvimer&quot; class=&quot;user-hover&quot; rel=&quot;judah.schvimer&quot;&gt;judah.schvimer&lt;/a&gt; and &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=siyuan.zhou&quot; class=&quot;user-hover&quot; rel=&quot;siyuan.zhou&quot;&gt;siyuan.zhou&lt;/a&gt;, you mentioned that there might be hidden challenges to this ticket. One challenge that I can think of is autoIndexId:false (though I&apos;m not sure whether we allow upgrade with an autoIndexId:false collection). Are there other challenges that come to mind?&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25128"><![CDATA[Replication]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 27 Jan 2020 20:22:52 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        3 years, 44 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>max.hirschhorn@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            3 years, 44 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-repl</customfieldvalue>
            <customfieldvalue>milkie@mongodb.com</customfieldvalue>
            <customfieldvalue>judah.schvimer@mongodb.com</customfieldvalue>
            <customfieldvalue>siyuan.zhou@mongodb.com</customfieldvalue>
            <customfieldvalue>tess.avitabile@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hvol7j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr8if3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="3766">Repl 2020-03-23</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hvo7gv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>