<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 06:41:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-79785] WaitForMajorityService can be a bottleneck for two phase commit transactions</title>
                <link>https://jira.mongodb.org/browse/SERVER-79785</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;While performance testing for &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-79056&quot; title=&quot;Measure performance for updateOne without shard key with commit optimizations&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-79056&quot;&gt;&lt;del&gt;SERVER-79056&lt;/del&gt;&lt;/a&gt;, I noticed that throughput for transactions that use two phase commit scales poorly with more concurrent transactions, despite CPU and IO utilization staying low and secondaries keeping up. The problem seems to be that the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/73e255308e5ea944dfdf967df982274af7b09870/src/mongo/db/repl/wait_for_majority_service.h#L180&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;WaitForMajorityService&lt;/a&gt;, used &lt;a href=&quot;https://github.com/mongodb/mongo/blob/73e255308e5ea944dfdf967df982274af7b09870/src/mongo/db/s/transaction_coordinator.cpp#L105-L107&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;by two phase commit coordinators&lt;/a&gt; to wait for the participant list and decision writes to majority replicate, can&apos;t keep up with many requests to wait for majority.&lt;/p&gt;

&lt;p&gt;When I switch transaction coordinators to either wait for majority write concern as part of the writes themselves (which synchronously blocks a task executor thread) or wait asynchronously using &lt;a href=&quot;https://github.com/10gen/mongo/blob/f1f16c7b89cdb4653a94ec5636aa6baed856f5ab/src/mongo/db/repl/replication_coordinator_impl.cpp#L2175&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ReplicationCoordinator::awaitReplicationAsyncNoWTimeout&lt;/a&gt;, throughput with the same workload goes up significantly (over 4x with my setup) and CPU becomes the bottleneck. I initially saw this in the &lt;a href=&quot;https://github.com/10gen/dsi/blob/5a6d9d3109ec45c3ec2febd922e708f34e8f7914/configurations/mongodb_setup/mongodb_setup.shard.yml#L4&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;shard&lt;/a&gt; DSI workload with a custom 0.3ms network delay, which uses 3 node replica sets, but I also reproduced it in a modified shard workload with single node replica sets.&lt;/p&gt;

&lt;p&gt;The problem with the WaitForMajorityService seems to be that it &lt;a href=&quot;https://github.com/mongodb/mongo/blob/1aa6cc2c2ef07dfd1daefb0c4aea8be382291788/src/mongo/db/repl/wait_for_majority_service.cpp#L273&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;waits for only the lowest opTime it&apos;s been given&lt;/a&gt; in each loop of _periodicallyWaitForMajority(), so if it receives new opTimes faster than it can wait for them, requests queue up and latency increases significantly. I modified the service to fetch the latest committed snapshot opTime after each majority wait (using &lt;a href=&quot;https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/replication_coordinator.h#L999&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ReplicationCoordinator::getCurrentCommittedSnapshotOpTime&lt;/a&gt;) and, if it is greater than the opTime actually waited on, to treat it as the most recently waited-for opTime. That seemed to resolve the bottleneck as well.&lt;/p&gt;</description>
                <environment></environment>
        <key id="2410646">SERVER-79785</key>
            <summary>WaitForMajorityService can be a bottleneck for two phase commit transactions</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="backlog-server-sharding-nyc">[DO NOT USE] Backlog - Sharding NYC</assignee>
                                    <reporter username="jack.mulrow@mongodb.com">Jack Mulrow</reporter>
                        <labels>
                            <label>sharding-nyc-subteam3</label>
                    </labels>
                <created>Mon, 7 Aug 2023 14:15:50 +0000</created>
                <updated>Tue, 15 Aug 2023 20:13:17 +0000</updated>
                            <resolved>Tue, 15 Aug 2023 14:27:20 +0000</resolved>
                                                                                        <votes>0</votes>
                                    <watches>6</watches>
                                                                                                                <comments>
                            <comment id="5635009" author="jack.mulrow" created="Tue, 15 Aug 2023 14:27:20 +0000"  >&lt;p&gt;Closing as a duplicate of &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-79881&quot; title=&quot;Integrate WaitForMajorityService with ReplicationCoordinator&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-79881&quot;&gt;SERVER-79881&lt;/a&gt;, since fixing that ticket should resolve the performance problem listed in this one. CC &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=lingzhi.deng%40mongodb.com&quot; class=&quot;user-hover&quot; rel=&quot;lingzhi.deng@mongodb.com&quot;&gt;lingzhi.deng@mongodb.com&lt;/a&gt;, &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=judah.schvimer%40mongodb.com&quot; class=&quot;user-hover&quot; rel=&quot;judah.schvimer@mongodb.com&quot;&gt;judah.schvimer@mongodb.com&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="2413244">SERVER-79881</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="2413244">SERVER-79881</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="467548" name="2PC_locust_results.txt" size="7499" author="jack.mulrow@mongodb.com" created="Mon, 7 Aug 2023 14:52:02 +0000"/>
                            <attachment id="467528" name="mongo_updateone.tar.gz" size="10847" author="jack.mulrow@mongodb.com" created="Mon, 7 Aug 2023 14:21:00 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25134"><![CDATA[Sharding NYC]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 11 Aug 2023 15:37:54 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        25 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>lingzhi.deng@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            25 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-sharding-nyc</customfieldvalue>
            <customfieldvalue>jack.mulrow@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i2l4zj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i237wo:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;ol&gt;
	&lt;li&gt;Start up the &lt;tt&gt;shard&lt;/tt&gt; setup in DSI using &lt;a href=&quot;https://spruce.mongodb.com/version/64ceaf0de3c3310000edbb61/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this binary&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;scp the mongo_updateone locust workload to the workload client and set it up
	&lt;ol&gt;
		&lt;li&gt;Or use any workload with concurrent two phase commit transactions&lt;/li&gt;
	&lt;/ol&gt;
	&lt;/li&gt;
	&lt;li&gt;Disable the custom &quot;canUseSingleWriteCommit&quot; server parameter on each mongos to force the non-targeted single writes from the workload to use two phase commit&lt;/li&gt;
	&lt;li&gt;Disable all new server parameters from the binary to test baseline 2PC performance, or enable any of them to see performance without the bottleneck (by default, txnMajorityWaitInReplCoordinator is enabled, which uses the async awaitReplication fix)&lt;/li&gt;
&lt;/ol&gt;
</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i2kr4v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>