<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 08:05:58 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[DOCS-12735] Docs for TOOLS-1956: Add Bulk Upsert and increase batch size limit</title>
                <link>https://jira.mongodb.org/browse/DOCS-12735</link>
                <project id="10380" key="DOCS">Documentation</project>
                    <description>&lt;h2&gt;&lt;a name=&quot;Description&quot;&gt;&lt;/a&gt;Description&lt;/h2&gt;

&lt;h3&gt;&lt;a name=&quot;Description%3A&quot;&gt;&lt;/a&gt;Description: &lt;/h3&gt;

&lt;p&gt;Options documentation would need to be updated&lt;/p&gt;

&lt;h3&gt;&lt;a name=&quot;EngineeringTicketDescription%3A&quot;&gt;&lt;/a&gt;Engineering Ticket Description:&lt;/h3&gt;

&lt;p&gt;&lt;b&gt;Revised&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;As we convert the tools to the new Go driver, the original PR will not apply.  Instead, we&apos;ll implement higher-performance bulk insert/update built on the new Go driver bulk API, including the higher batch size limit.&lt;/p&gt;

&lt;p&gt;The request to add a &quot;Remove&quot; mode has been pulled out to &lt;a href=&quot;https://jira.mongodb.org/browse/TOOLS-2268&quot; title=&quot;Add remove mode to mongoimport&quot; class=&quot;issue-link&quot; data-issue-key=&quot;TOOLS-2268&quot;&gt;&lt;del&gt;TOOLS-2268&lt;/del&gt;&lt;/a&gt; for separate triage&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Original&lt;/b&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The below changes were implemented after consulting with our Mongo rep Anant Srivastava to meet internal implementation needs. I will be opening a pull request shortly with our changes for review in case some/all of these changes want to be rolled into the product.&lt;/p&gt;
&lt;h3&gt;&lt;a name=&quot;Bulkupserts&quot;&gt;&lt;/a&gt;Bulk upserts&lt;/h3&gt;

&lt;p&gt;Enable bulk upsert operations. In the live version of mongoimport, running in upsert mode limits to 1 insertion worker process and an effective batch size of 1. This results in performance that unfortunately rendered mongoimport not viable for our volumes. With the addition of bulk, multi-worker upserts, we are seeing a 400-700X performance boost. With this performance tweak, mongoimport became a viable tool for our update process.&lt;/p&gt;

&lt;p&gt;--bulkUpdate command line option added. When toggled on, upserts can be executed in bulk and in multiple worker processes. This option was added to limit the impact to existing processes using mongoimport. There is some debate on whether this flag is necessary or if &apos;bulkUpdate&apos; mode should be &apos;on&apos; by default and toggled &apos;off&apos; via the --maintainInsertionOrder option&lt;/p&gt;

&lt;p&gt;The change for &apos;bulkUpdate&apos; upsert mode was implemented through disabling maintainInsertionOrder, removing the restriction for 1 insertion worker and adding new method to BufferedBulkInserter to support bulk Upsert operations.&lt;/p&gt;
&lt;h3&gt;&lt;a name=&quot;Removemode&quot;&gt;&lt;/a&gt;Remove mode&lt;/h3&gt;

&lt;p&gt;--mode remove option added. Will construct bson selectors using records from input file and --upsertFields to remove matching documents. Each selector will remove only a single matching document. Implemented through adding new method to BufferedBulkInserter to support bulk Remove operations.&lt;/p&gt;

&lt;p&gt;--upsertFields are required when specifying this option.&lt;/p&gt;
&lt;h3&gt;&lt;a name=&quot;batchSizelimitincreasedfrom1kto100k&quot;&gt;&lt;/a&gt;batchSize limit increased from 1k to 100k&lt;/h3&gt;

&lt;p&gt;With the MongoDB 3.6 batch size limit changes, the --batchSize option&apos;s maximum was raised to 100k documents. Mongoimport and mongo driver code (gopkg.in/mgo.v2) were patched to support this. Specifying a batch size larger than 1000 and targeting MongoDB &amp;lt;3.6 results in operations being batched driver side in chunks of 1000. The driver was also patched to split write operations &amp;gt;16MB into separate writeOpCommand calls for *insertOp, bulkUpdateOp, and bulkDeleteOp operation types.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://docs.mongodb.com/manual/reference/limits/#Write-Command-Batch-Limit-Size&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.com/manual/reference/limits/#Write-Command-Batch-Limit-Size&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;


&lt;h2&gt;&lt;a name=&quot;Scopeofchanges&quot;&gt;&lt;/a&gt;Scope of changes&lt;/h2&gt;

&lt;h2&gt;&lt;a name=&quot;ImpacttoOtherDocs&quot;&gt;&lt;/a&gt;Impact to Other Docs&lt;/h2&gt;

&lt;h2&gt;&lt;a name=&quot;MVP%28WorkandDate%29&quot;&gt;&lt;/a&gt;MVP (Work and Date)&lt;/h2&gt;

&lt;h2&gt;&lt;a name=&quot;Resources%28ScopeorDesignDocs%2CInvision%2Cetc.%29&quot;&gt;&lt;/a&gt;Resources (Scope or Design Docs, Invision, etc.)&lt;/h2&gt;
</description>
                <environment></environment>
        <key id="773884">DOCS-12735</key>
            <summary>Docs for TOOLS-1956: Add Bulk Upsert and increase batch size limit</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="kay.kim@mongodb.com">Kay Kim</assignee>
                                    <reporter username="kay.kim@mongodb.com">Kay Kim</reporter>
                        <labels>
                    </labels>
                <created>Tue, 21 May 2019 21:11:11 +0000</created>
                <updated>Mon, 13 Nov 2023 18:20:50 +0000</updated>
                            <resolved>Wed, 28 Aug 2019 14:48:46 +0000</resolved>
                                                    <fixVersion>4.1.12</fixVersion>
                    <fixVersion>Server_Docs_20231030</fixVersion>
                    <fixVersion>Server_Docs_20231106</fixVersion>
                    <fixVersion>Server_Docs_20231105</fixVersion>
                    <fixVersion>Server_Docs_20231113</fixVersion>
                                    <component>manual</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                                                                <comments>
                            <comment id="2398216" author="xgen-internal-githook" created="Wed, 28 Aug 2019 18:36:57 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;Kay Kim&apos;, &apos;username&apos;: &apos;kay-kim&apos;, &apos;email&apos;: &apos;kay.kim@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/DOCS-12735&quot; title=&quot;Docs for TOOLS-1956: Add Bulk Upsert and increase batch size limit&quot; class=&quot;issue-link&quot; data-issue-key=&quot;DOCS-12735&quot;&gt;&lt;del&gt;DOCS-12735&lt;/del&gt;&lt;/a&gt;: 4.2 mongimport bulk insert/upserts(pt2)&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/docs/commit/5bfb47c9295e916cdb68a157c743fbe74aed99aa&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/docs/commit/5bfb47c9295e916cdb68a157c743fbe74aed99aa&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="2397529" author="xgen-internal-githook" created="Wed, 28 Aug 2019 14:47:16 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;Kay Kim&apos;, &apos;username&apos;: &apos;kay-kim&apos;, &apos;email&apos;: &apos;kay.kim@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/DOCS-12735&quot; title=&quot;Docs for TOOLS-1956: Add Bulk Upsert and increase batch size limit&quot; class=&quot;issue-link&quot; data-issue-key=&quot;DOCS-12735&quot;&gt;&lt;del&gt;DOCS-12735&lt;/del&gt;&lt;/a&gt;: 4.2 mongimport bulk insert/upserts&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/docs/commit/49eebd9d0c36b6b0dfd48f43f3f70655e1cbe032&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/docs/commit/49eebd9d0c36b6b0dfd48f43f3f70655e1cbe032&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10320">
                    <name>Documented</name>
                                            <outwardlinks description="documents">
                                        <issuelink>
            <issuekey id="500910">TOOLS-1956</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 28 Aug 2019 14:47:16 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 24 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>DOCS-11762</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>emet.ozar@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 24 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>kay.kim@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hv0hvz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|huprxb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hv045b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                </customfields>
    </item>
</channel>
</rss>