<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:05:27 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-44250] startIndexBuild oplog write and thread pool scheduling are not serialized between concurrent threads on primaries</title>
                <link>https://jira.mongodb.org/browse/SERVER-44250</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Secondaries serialize all oplog commands, which means that the code in &lt;tt&gt;startIndexBuild&lt;/tt&gt;&#160; to 1) &lt;a href=&quot;https://github.com/mongodb/mongo/blob/71add1d5fbb07c95df6cde80a79ab203ac451470/src/mongo/db/index_builds_coordinator_mongod.cpp#L98-L99&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;write the &quot;startIndexBuild&quot; oplog entry&lt;/a&gt; and 2) &lt;a href=&quot;https://github.com/mongodb/mongo/blob/71add1d5fbb07c95df6cde80a79ab203ac451470/src/mongo/db/index_builds_coordinator_mongod.cpp#L134-L143&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;schedule the task on the thread pool&lt;/a&gt;&#160;cannot race with other threads doing the same thing.&lt;/p&gt;

&lt;p&gt;On primaries, however, nothing serializes these two operations, so two concurrent threads can interleave them. The following sequence shows the resulting deadlock when the thread pool size is only 1:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Start and replicate a &quot;startIndexBuild&quot; oplog entry for &lt;b&gt;index A&lt;/b&gt;
	&lt;ul&gt;
		&lt;li&gt;The &lt;b&gt;secondary starts building index A&lt;/b&gt;&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Start and replicate a &quot;startIndexBuild&quot; oplog entry for &lt;b&gt;index B&lt;/b&gt;&lt;/li&gt;
	&lt;li&gt;Schedule index &lt;b&gt;build B&lt;/b&gt; on the thread pool on the primary
	&lt;ul&gt;
		&lt;li&gt;The &lt;b&gt;primary&lt;/b&gt; &lt;b&gt;starts building index B&lt;/b&gt;&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Queue up index &lt;b&gt;build A&lt;/b&gt;&#160;on the primary because all threads are in use, and block.&lt;/li&gt;
	&lt;li&gt;Commit and replicate &quot;commitIndexBuild&quot; for &lt;b&gt;index B&lt;/b&gt;&lt;/li&gt;
	&lt;li&gt;The secondary attempts to apply this oplog entry and blocks because index B has not started
	&lt;ul&gt;
		&lt;li&gt;&lt;b&gt;Index B cannot start until index A commits&lt;/b&gt;&lt;/li&gt;
		&lt;li&gt;&lt;b&gt;Index A cannot commit until it replicates the commitIndexBuild oplog entry&lt;/b&gt;, leading to a deadlock scenario.&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
&lt;/ul&gt;
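&lt;p&gt;The ordering problem can be replayed deterministically. The following is a minimal sketch, not MongoDB code: the names oplog, pool_queue and primary_interleaving are hypothetical, and the two threads are simulated by a fixed sequence of steps. It assumes a single-worker pool, so the build scheduled second has to wait.&lt;/p&gt;

```python
from collections import deque


def primary_interleaving():
    """Replay one bad interleaving of two threads on a primary.

    Hypothetical model: each thread first writes its startIndexBuild
    entry to the oplog and then schedules its build on the thread pool.
    """
    oplog = []            # order in which startIndexBuild entries replicate
    pool_queue = deque()  # order in which builds reach the thread pool

    oplog.append("A")       # thread 1 writes startIndexBuild for A
    oplog.append("B")       # thread 2 writes startIndexBuild for B
    pool_queue.append("B")  # thread 2 schedules first and takes the only worker
    pool_queue.append("A")  # thread 1 schedules second and waits in the queue
    return oplog, list(pool_queue)


oplog_order, primary_order = primary_interleaving()

# A secondary applies oplog entries in order, so its single worker runs A,
# while the single worker on the primary runs B.
secondary_running = oplog_order[0]
primary_running = primary_order[0]
diverged = secondary_running != primary_running

print(oplog_order, primary_order, diverged)
```

&lt;p&gt;In this model the two nodes end up running different builds, which is the precondition for the deadlock described in the list.&lt;/p&gt;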


&lt;p&gt;&#160;&lt;/p&gt;
&lt;h4&gt;&lt;a name=&quot;Thefollowingoriginaldescriptiondoesnotaccuratelydescribethefullproblem%3A&quot;&gt;&lt;/a&gt;The following original description does not accurately describe the full problem:&lt;/h4&gt;

&lt;p&gt;We &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b9c13fa15b7e58add2c8618b77ca86431cf24408/src/mongo/db/index_builds_coordinator_mongod.cpp#L62&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;limit the maximum number of index build worker threads to 10&lt;/a&gt;, but there is no high-level restriction on the number of active index build threads.&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;When a task is scheduled, it is first added to &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b9c13fa15b7e58add2c8618b77ca86431cf24408/src/mongo/util/concurrency/thread_pool.cpp#L210&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;the queue of _pendingTasks&lt;/a&gt;.&lt;/li&gt;
	&lt;li&gt;If an index build is scheduled and the maximum number of workers is already active, a new thread is not scheduled and &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b9c13fa15b7e58add2c8618b77ca86431cf24408/src/mongo/util/concurrency/thread_pool.cpp#L366-L369&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;the task is left in the queue&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
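&lt;p&gt;The queueing behavior can be illustrated with a small sketch. It uses ThreadPoolExecutor from the Python standard library as a stand-in for the server thread pool, which is an assumption: the two pools may differ in detail. The pool size of 2 (instead of 10), the build function and the event names are all hypothetical.&lt;/p&gt;

```python
import threading
from concurrent.futures import ThreadPoolExecutor

release = threading.Event()                        # lets blocked builds finish
started = [threading.Event() for _ in range(3)]    # one flag per build


def build(i):
    """Hypothetical index build: report that it started, then block."""
    started[i].set()
    release.wait()
    return i


pool = ThreadPoolExecutor(max_workers=2)
futures = [pool.submit(build, i) for i in range(3)]

# The first two builds occupy both workers.
started[0].wait()
started[1].wait()

# The third build sits in the pending queue; no new worker is created.
third_started_early = started[2].wait(timeout=0.2)

# Once the running builds are released, a worker frees up and runs it.
release.set()
results = [f.result() for f in futures]
pool.shutdown()

print(third_started_early, results)
```

&lt;p&gt;A caller that waits for the third build while the first two cannot finish would block indefinitely, which is the shape of the hang described in this section.&lt;/p&gt;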


&lt;p&gt;This is problematic for secondaries in the following scenario:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Start, but do not commit, 10 index builds on the primary, replicating 10 &quot;startIndexBuild&quot; oplog entries and starting 10 worker threads.&lt;/li&gt;
	&lt;li&gt;Start and commit an 11th index build on the primary, replicating a &quot;startIndexBuild&quot; and &quot;commitIndexBuild&quot; oplog entry.
	&lt;ul&gt;
		&lt;li&gt;Because there are already 10 index builds active on the secondary, this index build will queue up in &quot;_pendingTasks&quot;, but it will not start.&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Replication of the &quot;commitIndexBuild&quot; oplog entry will wait for the 11th index build&apos;s thread to join, blocking until it does.
	&lt;ul&gt;
		&lt;li&gt;This in turn blocks other &quot;commitIndexBuild&quot; oplog entries from joining other index build threads, causing this hang.&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;We should do one of the following:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Limit the maximum number of active index builds allowed on the primary
	&lt;ul&gt;
		&lt;li&gt;This should be the same as the maximum number of worker threads. We would enforce this either by returning an error to the user or by blocking until resources are available. This prevents the problem on secondaries only if the limits are identical on every node.&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Do not limit the maximum number of index build worker threads&lt;/li&gt;
&lt;/ul&gt;
</description>
                <environment></environment>
        <key id="979428">SERVER-44250</key>
            <summary>startIndexBuild oplog write and thread pool scheduling are not serialized between concurrent threads on primaries</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="louis.williams@mongodb.com">Louis Williams</assignee>
                                    <reporter username="louis.williams@mongodb.com">Louis Williams</reporter>
                        <labels>
                    </labels>
                <created>Fri, 25 Oct 2019 19:06:12 +0000</created>
                <updated>Sun, 29 Oct 2023 22:15:42 +0000</updated>
                            <resolved>Wed, 13 Nov 2019 19:22:25 +0000</resolved>
                                                    <fixVersion>4.3.2</fixVersion>
                                                        <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="2539417" author="xgen-internal-githook" created="Wed, 13 Nov 2019 19:06:12 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;username&apos;: &apos;louiswilliams&apos;, &apos;email&apos;: &apos;louis.williams@mongodb.com&apos;, &apos;name&apos;: &apos;Louis Williams&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-44250&quot; title=&quot;startIndexBuild oplog write and thread pool scheduling are not serialized between concurrent threads on primaries&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-44250&quot;&gt;&lt;del&gt;SERVER-44250&lt;/del&gt;&lt;/a&gt; serialize startIndexBuild oplog write and thread pool scheduling between concurrent threads on primaries&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/5074799696cdff95f46b81f054f04b2a55a1e2bc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/5074799696cdff95f46b81f054f04b2a55a1e2bc&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="2539368" author="louis.williams" created="Wed, 13 Nov 2019 18:46:26 +0000"  >&lt;p&gt;We&apos;re going to use a mutex for now to enable test coverage. It behaves correctly, but it depends on thread pool behavior that is subject to change in the future. The plan is to follow up with &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-44609&quot; title=&quot;Replicate startIndexBuild oplog entry in the same thread as the index build.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-44609&quot;&gt;&lt;del&gt;SERVER-44609&lt;/del&gt;&lt;/a&gt; to implement the solution that does not depend on the thread pool&apos;s internal queueing.&lt;/p&gt;</comment>
                            <comment id="2525761" author="louis.williams" created="Fri, 8 Nov 2019 19:08:31 +0000"  >&lt;p&gt;There are two ways I see of fixing this bug:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Use a mutex to protect index build initialization (i.e. replicating &quot;startIndexBuild&quot;) and thread pool scheduling&lt;/li&gt;
	&lt;li&gt;Move index initialization into the builder thread. It seems like these two were intentionally separated by&#160;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-39369&quot; title=&quot;Filter index builds requests in the coordinator, register the builds on the Coordinator and set them up in the persisted catalog before changing threads&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-39369&quot;&gt;&lt;del&gt;SERVER-39369&lt;/del&gt;&lt;/a&gt;, so we may want to explore if it&apos;s possible to put them back together.&lt;/li&gt;
&lt;/ol&gt;
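&lt;p&gt;The first option could look roughly like the sketch below. It is an illustration only and not the server implementation: the Coordinator class, its fields and start_index_build are hypothetical names, and two in-memory lists stand in for the oplog and the thread pool queue.&lt;/p&gt;

```python
import threading


class Coordinator:
    """Hypothetical stand-in for the index build coordinator."""

    def __init__(self):
        self._mutex = threading.Lock()
        self.oplog = []      # order of replicated startIndexBuild entries
        self.scheduled = []  # order in which builds were handed to the pool

    def start_index_build(self, name):
        # Hold one lock across both steps so that no other thread can slip
        # in between the oplog write and the scheduling call.
        with self._mutex:
            self.oplog.append(name)
            self.scheduled.append(name)


coordinator = Coordinator()
threads = [
    threading.Thread(target=coordinator.start_index_build, args=(f"index_{i}",))
    for i in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the lock held across both steps, the two orders always match.
print(coordinator.oplog == coordinator.scheduled)
```

&lt;p&gt;The sketch only guarantees that builds enter the queue in oplog order. Whether they also start in that order depends on how the pool drains its queue.&lt;/p&gt;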
</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                        <issuelink>
            <issuekey id="943441">SERVER-43692</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="1061884">SERVER-45262</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="2290026">SERVER-74953</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="1001281">SERVER-44609</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 13 Nov 2019 19:06:12 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 13 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16941"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-253</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 13 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_16465" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Linked BF Score</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>13.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>louis.williams@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hvyz93:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr65an:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="3345">Execution Team 2019-11-18</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hvylif:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>