<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:07:17 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-44888] cleanupOrphaned would take 20 days, reimporting the data would take only 1-2 days</title>
                <link>https://jira.mongodb.org/browse/SERVER-44888</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;I have a sharded cluster, and I inserted 1 billion documents into an unsharded collection.&lt;/p&gt;

&lt;p&gt;I then sharded that collection, and the balancer distributed the chunks to the other shards. Running a count() on the collection now yields a wrong result: the first shard shows ~1 billion documents and the other two show ~333 million each, for a total of ~1.666 billion documents. I can see the count going down by 200-300 documents per second. At that rate the delete would take &amp;gt;20 days to complete, whereas dropping the collection and reinserting the data would take only 2-3 days. Is there any way to make this process faster?&lt;/p&gt;

&lt;p&gt;I&apos;m using MongoDB 4.2.1.&lt;/p&gt;</description>
                <environment></environment>
        <key id="1030049">SERVER-44888</key>
            <summary>cleanupOrphaned would take 20 days, reimporting the data would take only 1-2 days</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.mongodb.org/images/icons/priorities/minor.svg">Minor - P4</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="4">Incomplete</resolution>
                                        <assignee username="eric.sedor@mongodb.com">Eric Sedor</assignee>
                                    <reporter username="thestick613">Tudor Aursulesei</reporter>
                        <labels>
                    </labels>
                <created>Sun, 1 Dec 2019 13:52:23 +0000</created>
                <updated>Fri, 6 Dec 2019 22:02:51 +0000</updated>
                            <resolved>Fri, 6 Dec 2019 22:02:26 +0000</resolved>
                                    <version>4.2.1</version>
                                                                        <votes>0</votes>
                                    <watches>9</watches>
                                                                                                                <comments>
                            <comment id="2590769" author="eric.sedor" created="Fri, 6 Dec 2019 22:02:14 +0000"  >&lt;p&gt;Understood &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=thestick613&quot; class=&quot;user-hover&quot; rel=&quot;thestick613&quot;&gt;thestick613&lt;/a&gt;; to really investigate this we would need the diagnostic data, the specific cleanupOrphaned arguments you passed, and the results of sh.status(). I am going to close this for now but I encourage you to comment here or open a new issue if this happens again and you can provide that full set of info.&lt;/p&gt;

&lt;p&gt;But to be clear we do think a rate of 200-300 documents/s is suspicious.&lt;/p&gt;

&lt;p&gt;Thank you very much!&lt;/p&gt;</comment>
                            <comment id="2575527" author="thestick613" created="Mon, 2 Dec 2019 21:07:45 +0000"  >&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;I let the cluster rebalance itself for a while (1-2 days) and, because progress was slow, I found the cleanupOrphaned command and ran it for another 1-2 days. I then created this ticket. I have since removed all the data and restarted the import from scratch, so there is no diagnostic.data to provide.&lt;/p&gt;</comment>
                            <comment id="2574836" author="eric.sedor" created="Mon, 2 Dec 2019 17:01:49 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=thestick613&quot; class=&quot;user-hover&quot; rel=&quot;thestick613&quot;&gt;thestick613&lt;/a&gt;, a couple things:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Can you confirm whether or not you have manually run a cleanupOrphaned command? I ask because it sounds like you are describing the activity of the RangeDeleter that removes documents after chunk migration, not the cleanupOrphaned command.&lt;/li&gt;
	&lt;li&gt;For the Primary shard that is deleting documents, would you please archive (tar or zip) the &lt;tt&gt;$dbpath/diagnostic.data&lt;/tt&gt; directory (the contents are described &lt;a href=&quot;https://docs.mongodb.com/manual/administration/analyzing-mongodb-performance/#full-time-diagnostic-data-capture&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt;) and attach it to this ticket?&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Gratefully,&lt;br/&gt;
Eric&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 2 Dec 2019 17:01:49 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 9 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>eric.sedor@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 9 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>eric.sedor@mongodb.com</customfieldvalue>
            <customfieldvalue>thestick613</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hw70k7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hvv8un:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[eric.sedor@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hw6mtj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>