<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 06:02:57 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-65527] Use range bounded cursors in place of prefix search_near on unique indexes</title>
                <link>https://jira.mongodb.org/browse/SERVER-65527</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;The unique index duplicate key check searches for the key&apos;s existence before inserting into the index, using the &lt;tt&gt;search_near&lt;/tt&gt; cursor API. With durable history, searching through a table with a lot of deleted content became more expensive, which caused a regression in insertions into unique indexes. A prefix-search API was added that causes &lt;tt&gt;search_near&lt;/tt&gt; to exit early once the search moves past the prefix. The prefix search is now being replaced by a feature that configures bounds on a cursor.&lt;/p&gt;

&lt;p&gt;An existing prefix search can be replaced with a cursor bound on the prefix. This ticket will do so for the above-mentioned unique index usage in the server.&lt;/p&gt;
</description>
                <environment></environment>
        <key id="2023847">SERVER-65527</key>
            <summary>Use range bounded cursors in place of prefix search_near on unique indexes</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="gregory.wlodarek@mongodb.com">Gregory Wlodarek</assignee>
                                    <reporter username="deepti.hasija@mongodb.com">Deepti Hasija</reporter>
                        <labels>
                    </labels>
                <created>Wed, 13 Apr 2022 01:10:32 +0000</created>
                <updated>Tue, 16 Jan 2024 22:12:08 +0000</updated>
                            <resolved>Wed, 26 Oct 2022 23:20:14 +0000</resolved>
                                                    <fixVersion>6.2.0-rc0</fixVersion>
                                                        <votes>0</votes>
                                    <watches>8</watches>
                                                                                                                <comments>
                            <comment id="4930423" author="xgen-internal-githook" created="Wed, 26 Oct 2022 22:53:49 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;Gregory Wlodarek&apos;, &apos;email&apos;: &apos;gregory.wlodarek@mongodb.com&apos;, &apos;username&apos;: &apos;GWlodarek&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-65527&quot; title=&quot;Use range bounded cursors in place of prefix search_near on unique indexes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-65527&quot;&gt;&lt;del&gt;SERVER-65527&lt;/del&gt;&lt;/a&gt; Use range bounded cursors in place of prefix search_near on unique indexes&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/41efbd3eb4c79a9739b8360dd3acfa9f931fc5da&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/41efbd3eb4c79a9739b8360dd3acfa9f931fc5da&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="4775793" author="sulabh.mahajan" created="Thu, 25 Aug 2022 08:08:44 +0000"  >&lt;p&gt;We have decided to put this ticket on hold for now. We want to get feedback on bounded cursors from &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-65528&quot; title=&quot;Use range bounded cursors for restoring index cursors after yielding&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-65528&quot;&gt;&lt;del&gt;SERVER-65528&lt;/del&gt;&lt;/a&gt; first, as that seems to be the use case that will directly benefit from bounding the cursors.&lt;/p&gt;</comment>
                            <comment id="4769195" author="jie.chen" created="Tue, 23 Aug 2022 03:22:24 +0000"  >&lt;p&gt;From my understanding, I can have a go at explaining the differences in performance, especially why setting only a lower bound gives little to no performance improvement over a regular search near.&lt;/p&gt;


&lt;p&gt;&lt;del&gt;When the user sets only the lower bound and performs a next(), WiredTiger internally positions the cursor on the lower bound by performing a search near on it. If the data range has lots of invisible records between the lower bound and the first visible key, the bounded cursor would not exit early because no upper bound is set. It can therefore behave exactly like a regular search near, which would explain the similar performance.&lt;/del&gt;&lt;/p&gt;

&lt;p&gt;&lt;del&gt;If we set both the lower and upper bounds and WT performs a next(), it would internally perform a search near to position the cursor on the lower bound, but would be able to exit early at the upper bound. That could explain the big difference in performance here.&lt;/del&gt;&lt;/p&gt;

&lt;p&gt;&lt;del&gt;It would be interesting to check the performance of only setting the upper bound and performing a prev(). I would imagine it would behave the same since it also does a search near in this case. Something more interesting would be setting the upper bound and performing a search near as well.&lt;/del&gt;&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Edit:&lt;/b&gt; When a lower bound is set, WT internally calls cursor_row_search(), which attempts to position the cursor closest to the key (irrespective of visibility). The range cursor then checks the visibility of the key. If the current position is not visible, we perform a next() to find a visible key. My assumption is that the next() is taking a long time. The range cursor project will look into this.&lt;/p&gt;</comment>
                            <comment id="4767264" author="sulabh.mahajan" created="Mon, 22 Aug 2022 14:54:45 +0000"  >&lt;p&gt;I scaled the repro up by a factor of 10 to get more insight into the performance differences. Here is the time taken by the unique index updates:&lt;/p&gt;

&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;Version&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;Time taken (ms)&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;Master&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4609&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;Master without prefix search&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;332583&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;Lower-and-upper-bounds-search-near&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;5161&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;Lower-and-upper-bounds-next&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;5043&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;Lower-bound-next&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;304351&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;I suspect that the versions where we set both bounds would have been faster than master if we were not making a separate copy of the key for the upper bound (the prefix key with its last byte incremented).&lt;br/&gt;
&lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;https://jira.mongodb.org/secure/attachment/397190/397190_master-search-next.png&quot; width=&quot;100%&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt;&lt;br/&gt;
next = both upper and lower bounds then next() call to find the record&lt;br/&gt;
search-near = both upper and lower bounds then search_near() call to find the record&lt;br/&gt;
master = prefix search&lt;/p&gt;


&lt;p&gt;The unexpected part is that the version where we set only the lower bound and follow with a next (I think it is the same for search-near) has to skip through a lot of records for some reason. This version is almost as inefficient as not having any bounds or prefix search:&lt;br/&gt;
&lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;https://jira.mongodb.org/secure/attachment/397192/397192_lower-bound-inefficient.png&quot; width=&quot;100%&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;/p&gt;

&lt;p&gt;lower-bound-next = Lower bound only then next() call to find the record&lt;br/&gt;
master-prefix-off = search-near with no bounds and no prefix search.&lt;/p&gt;</comment>
                            <comment id="4746255" author="sulabh.mahajan" created="Fri, 12 Aug 2022 09:04:05 +0000"  >&lt;p&gt;Here is an update on the work so far. I have tried the following three approaches:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Bound the cursor to &lt;tt&gt;lower=prefix-key,inclusive&lt;/tt&gt; and to &lt;tt&gt;upper=prefix-key-last-byte-incremented,exclusive&lt;/tt&gt;. Do a &lt;tt&gt;search_near&lt;/tt&gt; as before.&lt;/li&gt;
	&lt;li&gt;Bound the cursor to &lt;tt&gt;lower=prefix-key,inclusive&lt;/tt&gt; and put no upper bound. Instead of a &lt;tt&gt;search_near&lt;/tt&gt;, do a next and let the cursor bound only return keys that match the prefix.&lt;/li&gt;
	&lt;li&gt;Bound the cursor to &lt;tt&gt;lower=prefix-key,inclusive&lt;/tt&gt; and to &lt;tt&gt;upper=prefix-key-last-byte-incremented,exclusive&lt;/tt&gt;. Instead of a &lt;tt&gt;search_near&lt;/tt&gt;, do a next and let the cursor bound only return keys that match the prefix.&lt;/li&gt;
&lt;/ul&gt;
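
&lt;p&gt;A minimal sketch of how the two bounds used in the first and third approaches could be derived from a prefix, assuming keys are plain byte strings compared lexicographically. The helper names are hypothetical and a sorted Python list stands in for the index; this is not the WiredTiger or server API.&lt;/p&gt;

```python
from bisect import bisect_left


def prefix_upper_bound(prefix):
    """Return an exclusive upper bound for all keys that start with prefix.

    Hypothetical helper: increments the last byte of the prefix. Trailing
    0xFF bytes are dropped first so the increment cannot overflow.
    Returns None when no finite bound exists (empty or all-0xFF prefix).
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return None
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


def keys_in_prefix(sorted_keys, prefix):
    """Return the keys in [prefix, upper): lower inclusive, upper exclusive."""
    lo = bisect_left(sorted_keys, prefix)
    upper = prefix_upper_bound(prefix)
    hi = len(sorted_keys) if upper is None else bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]


# Assumed sample data: a tiny sorted "index" of byte-string keys.
keys = sorted([b"a\x01", b"ab\x00", b"ab\x01", b"ab\xff\x02", b"ac\x00"])
print(prefix_upper_bound(b"ab"))
print(keys_in_prefix(keys, b"ab"))
```

&lt;p&gt;With both bounds in place, a scan can stop as soon as it reaches the upper bound instead of walking past the prefix range.&lt;/p&gt;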


&lt;p&gt;I have run MongoDB patch tests, and all approaches seem to work without producing any immediately obvious bugs. I am still using the repro script from &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-56509&quot; title=&quot;Wrap unique index insertion _keyExists call in a WT cursor reconfigure.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-56509&quot;&gt;&lt;del&gt;SERVER-56509&lt;/del&gt;&lt;/a&gt; to test the performance locally. All approaches improve the performance compared to not having any sort of prefix search. Having an upper bound seems to be necessary for now, even when doing &lt;tt&gt;next&lt;/tt&gt; instead of &lt;tt&gt;search_near&lt;/tt&gt;. I am not sure why; I expected that we would not need an upper bound if we used a &lt;tt&gt;next&lt;/tt&gt; call. The best performance comes from having both bounds and doing a &lt;tt&gt;next&lt;/tt&gt; instead of &lt;tt&gt;search_near&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;The next steps:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Understand the results: why doing a next() without an upper bound is not as performant as doing it with an upper bound set.&lt;/li&gt;
	&lt;li&gt;Run more conclusive performance testing.&lt;/li&gt;
	&lt;li&gt;Discuss the results with the execution and WiredTiger team to decide which approach to follow. A similar approach could be used in other places in the code, so a discussion would help to generalise the approach.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="4660220" author="JIRAUSER1265133" created="Wed, 6 Jul 2022 14:03:35 +0000"  >&lt;p&gt;Depends on &lt;a href=&quot;https://jira.mongodb.org/browse/WT-9324&quot; title=&quot;Add additional logic to cursor-&amp;gt;search_near to handle range bounds.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;WT-9324&quot;&gt;&lt;del&gt;WT-9324&lt;/del&gt;&lt;/a&gt; which makes search_near use the cursor bounds. &lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                            <outwardlinks description="depends on">
                                        <issuelink>
            <issuekey id="2046957">WT-9324</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="1915961">SERVER-61185</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="2101267">SERVER-68380</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="397192" name="lower-bound-inefficient.png" size="87470" author="sulabh.mahajan@mongodb.com" created="Mon, 22 Aug 2022 14:53:57 +0000"/>
                            <attachment id="397190" name="master-search-next.png" size="100100" author="sulabh.mahajan@mongodb.com" created="Mon, 22 Aug 2022 14:47:29 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 6 Jul 2022 14:03:35 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        1 year, 15 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[<s><a href='https://jira.mongodb.org/browse/WT-9324'>WT-9324</a></s>]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16941"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-2317</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>louis.williams@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            1 year, 15 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>deepti.hasija@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>gregory.wlodarek@mongodb.com</customfieldvalue>
            <customfieldvalue>jie.chen@mongodb.com</customfieldvalue>
            <customfieldvalue>sulabh.mahajan@mongodb.com</customfieldvalue>
            <customfieldvalue>yujin.kang@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0qzs7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i0cw0o:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_22250" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Special Downgrade Instructions Required</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="23343"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="6173">Execution Team 2022-08-08</customfieldvalue>
    <customfieldvalue id="6328">Execution Team 2022-08-22</customfieldvalue>
    <customfieldvalue id="6331">Execution Team 2022-10-03</customfieldvalue>
    <customfieldvalue id="6635">Storage Engines - 2022-10-31</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0qlxj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>