<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:53:09 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-39800] Investigate the new oplog format impact in linkbench performance tests</title>
                <link>https://jira.mongodb.org/browse/SERVER-39800</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;For the performance testing, we need:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Get the statistics needed to measure the correct ops/sec, as Matthew Saltz did before.&lt;/li&gt;
	&lt;li&gt;Change LinkBench to use the new oplog format by setting a server parameter.&lt;/li&gt;
	&lt;li&gt;Get the numbers from existing 1-node replset and 3-node replset tests in a patch build.&lt;/li&gt;
	&lt;li&gt;Update one workload to include 100,000 small updates in a transaction and test its performance with the old and new format.&lt;/li&gt;
	&lt;li&gt;If the performance drops, run LinkBench locally with gperftools enabled to collect a profile and pinpoint the slowdown.&lt;/li&gt;
&lt;/ul&gt;
</description>
                <environment></environment>
        <key id="703862">SERVER-39800</key>
            <summary>Investigate the new oplog format impact in linkbench performance tests</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="william.schultz@mongodb.com">William Schultz</assignee>
                                    <reporter username="greg.mckeon@mongodb.com">Gregory McKeon</reporter>
                        <labels>
                            <label>bigtxns_perf</label>
                    </labels>
                <created>Mon, 25 Feb 2019 15:34:02 +0000</created>
                <updated>Tue, 23 Apr 2019 22:45:19 +0000</updated>
                            <resolved>Tue, 23 Apr 2019 20:26:35 +0000</resolved>
                                                                    <component>Replication</component>
                                        <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="2222713" author="siyuan.zhou@10gen.com" created="Tue, 23 Apr 2019 22:45:19 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=william.schultz&quot; class=&quot;user-hover&quot; rel=&quot;william.schultz&quot;&gt;william.schultz&lt;/a&gt;, I updated the ticket to reflect the real work we&apos;ve done here. We won&apos;t switch to the new format before large transactions are supported by default.&lt;/p&gt;</comment>
                            <comment id="2186717" author="william.schultz" created="Wed, 20 Mar 2019 21:20:17 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=siyuan.zhou&quot; class=&quot;user-hover&quot; rel=&quot;siyuan.zhou&quot;&gt;siyuan.zhou&lt;/a&gt; To answer your above questions:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;1. Do we expect any write conflict during the link load phase? They are upsert, which seems to imply update is possible, meaning conflicts.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;No, as I understand it, there should not be any write conflicts during this phase. The link loading phase does occur in parallel, across many threads, but each thread is given a subset of the node id space, and these subsets should be disjoint. Each thread will generate links for the node ids it received, and those links (edges) should go from the given node id to some other node. In other words, only a single thread will ever generate outgoing links for a particular node, and the (from, to) node ids of an edge are embedded in the document&apos;s &lt;tt&gt;_id&lt;/tt&gt; in the linktable. So I do not think there should be any potential for write conflicts. You can see the data schema for the link table &lt;a href=&quot;https://github.com/10gen/linkbench/tree/ac4b42cbdefd88a73dafc19cf9aeb29619c6f050#running-a-benchmark-with-mongodb&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;2. If conflict is possible. Does the latency include any retry? If the latency of one transaction increases a little, retry will exaggerate the total finish time.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;If there were retries on conflicts, yes, the retries should be included in the latency calculation, since the retries happen automatically, inside the link store implementation.&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;3. I&apos;m wondering why &quot;the overall link loading throughput numbers&quot; includes node loading and whether there&apos;s a simple way to split them, since latency and throughput are related, but perhaps not inversely proportionally.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Yeah, it&apos;s a bit odd, but it&apos;s just an artifact of the way linkbench is currently written. You can see &lt;a href=&quot;https://github.com/10gen/linkbench/blob/0c191bffab8c6ace561211df63468ab32e26cf60/src/main/java/com/facebook/LinkBench/LinkBenchDriver.java#L251&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt; that it starts up all loader threads (e.g. 20 link loaders + 1 node loader) at the same time and waits for them all to complete. It measures the length of that process and reports it as the total load phase duration. To accurately measure just the time taken for link loading, we would likely have to force node loading to happen separately, perhaps before the link loading phase even begins. That said, I&apos;m not sure whether the load phase is traditionally meant to be used as a real benchmark.&lt;/p&gt;</comment>
                            <comment id="2186661" author="william.schultz" created="Wed, 20 Mar 2019 20:51:44 +0000"  >&lt;p&gt;To double check my initial results, I ran 3 additional patch builds with the new oplog entry format against 1 node replica sets. The original patch build I referenced above had some extra log messages I added that were being printed out on certain transaction events. I ran new patch builds without these changes just to make sure the results weren&apos;t tainted by those extra log messages. The patch builds are here:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;&lt;a href=&quot;https://evergreen.mongodb.com/task/sys_perf_linux_1_node_replSet_linkbench_patch_ec1e10dc518b36d381faeb228008512f0dee68c4_5c916d29e3c331347ec10052_19_03_19_22_29_07##%257B%2522compare%2522%253A%255B%257B%2522hash%2522%253A%2522ec1e10dc518b36d381faeb228008512f0dee68c4%2522%257D%255D%257D&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Patch 1&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://evergreen.mongodb.com/task/sys_perf_linux_1_node_replSet_linkbench_patch_ec1e10dc518b36d381faeb228008512f0dee68c4_5c916d43e3c331347ec10081_19_03_19_22_29_43##%257B%2522compare%2522%253A%255B%257B%2522hash%2522%253A%2522ec1e10dc518b36d381faeb228008512f0dee68c4%2522%257D%255D%257D&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Patch 2&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://evergreen.mongodb.com/task/sys_perf_linux_1_node_replSet_linkbench_patch_ec1e10dc518b36d381faeb228008512f0dee68c4_5c916d4be3c331347ec10092_19_03_19_22_29_52##%257B%2522compare%2522%253A%255B%257B%2522hash%2522%253A%2522ec1e10dc518b36d381faeb228008512f0dee68c4%2522%257D%255D%257D&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Patch 3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The regressions in the load phase still appear. &lt;/p&gt;

&lt;p&gt;To expand on the above analysis, it does seem like throughput is affected even for the operation types that only do 1-2 operations per transaction. If we look at the &lt;a href=&quot;https://evergreen.mongodb.com/task/sys_perf_linux_1_node_replSet_linkbench_patch_ec1e10dc518b36d381faeb228008512f0dee68c4_5c916d29e3c331347ec10052_19_03_19_22_29_07##%257B%2522compare%2522%253A%255B%257B%2522hash%2522%253A%2522ec1e10dc518b36d381faeb228008512f0dee68c4%2522%257D%255D%257D&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;first patch build&lt;/a&gt;, for the ADD_LINK metric, we can see a small regression. Confusingly, these request phase statistics &lt;em&gt;are&lt;/em&gt; reported correctly as throughput (ops/sec). Thus, a decrease indicates things got slower. The ADD_LINK throughput number on that patch is reported as 253 ops/sec. One decent way to get a sense of the overall comparison between the patch build run and previous commits is to select many other prior commits and click &quot;Compare&quot; on the trend graph. This will show the performance change, as a percentage, of the currently selected commit (the one you are hovering over) against a bunch of other commits. If we compare the patch build change against numerous prior commits, the regression appears to be, on average, around 7%. I could post all the numbers here to be more precise, but I feel it is easier to take advantage of the existing performance visualization tools we have on the Evergreen page. In summary, though, it does appear that there might be a bit of a regression even for small transactions.&lt;/p&gt;</comment>
                            <comment id="2177987" author="siyuan.zhou@10gen.com" created="Tue, 12 Mar 2019 05:33:14 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=william.schultz&quot; class=&quot;user-hover&quot; rel=&quot;william.schultz&quot;&gt;william.schultz&lt;/a&gt;, the goal of this ticket is to compare the performance of both formats. I think both small transactions and large (but under 16MB) transactions are needed to support our decision. LinkBench represents small transactions well.&lt;/p&gt;

&lt;p&gt;Good point on secondary application perf test! I care more about the end-to-end performance though, since I suspect the overhead of the new format isn&apos;t critical in the code path of w: majority writes.&lt;/p&gt;</comment>
                            <comment id="2176807" author="william.schultz" created="Mon, 11 Mar 2019 12:06:29 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=siyuan.zhou&quot; class=&quot;user-hover&quot; rel=&quot;siyuan.zhou&quot;&gt;siyuan.zhou&lt;/a&gt; Is an explicit goal of this ticket to enable the new transaction oplog format in Linkbench, or is it simply to compare the performance of the new format against the old &quot;applyOps&quot; format? It seems that one hypothesis we would like to test is that secondary oplog application performance of the new transaction oplog format is worse than with the old format. It seems most reasonable, then, to test transactions that have ~16MB of data but a large number of operations, since we expect this is the case where the worst performance hits would occur (e.g. as you said, a 16MB transaction with 100,000 operations). I am wondering if a more targeted performance test would be better to answer that question, since Linkbench seems more about trying to model a real world workload scenario. Perhaps modifying this &lt;a href=&quot;https://github.com/10gen/workloads/blob/81d662d4c79c21a0c6c08fe2e4d1cee087c6790d/workloads/secondary_performance.js&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;secondary application performance test&lt;/a&gt; to run transactions would be useful for our purposes here. Measuring &lt;tt&gt;w:majority&lt;/tt&gt; write throughput should also be a decent proxy for secondary application performance, but it feels better to measure it directly if possible. What are your thoughts?&lt;/p&gt;
</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                        <issuelink>
            <issuekey id="703864">SERVER-39802</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 11 Mar 2019 12:06:29 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 42 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-1035</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>siyuan.zhou@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 42 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>greg.mckeon@mongodb.com</customfieldvalue>
            <customfieldvalue>siyuan.zhou@mongodb.com</customfieldvalue>
            <customfieldvalue>william.schultz@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|huoljz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|huecan:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="2823">Repl 2019-03-25</customfieldvalue>
    <customfieldvalue id="2896">Repl 2019-04-08</customfieldvalue>
    <customfieldvalue id="2918">Repl 2019-04-22</customfieldvalue>
    <customfieldvalue id="2919">Repl 2019-05-06</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|huo7tb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>