<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:23:11 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-10443] Compact command with LOW priority</title>
                <link>https://jira.mongodb.org/browse/SERVER-10443</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;The &apos;compact&apos; command should be runnable at low priority on any system without affecting existing performance. &lt;/p&gt;

&lt;p&gt;Current &apos;compact&apos; command restrictions:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;only runs on replicas by default, unless forced;&lt;/li&gt;
	&lt;li&gt;locks the entire collection, preventing any activity until finished.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;This case covers the trivial and straightforward case of regular  maintenance.  That is, the compact command should be runnable at any time, or set up as a background process.  It should scan through all chunks, one at a time.  The least recently used chunk should be addressed first.  &lt;/p&gt;

&lt;p&gt;Processing:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The least recently used chunk should be mapped and paged in to a special holding area, where the following actions are performed on it:&lt;/li&gt;
	&lt;li&gt;all documents sorted in primary/shard-key order;&lt;/li&gt;
	&lt;li&gt;documents snugged up against one another or spaced out, per requirements specified in compact command&apos;s padding factors;&lt;/li&gt;
	&lt;li&gt;OPTIONAL:  As a further step, if any of this document&apos;s fields are indexed, those index entries should be verified and possibly corrected.&lt;/li&gt;
	&lt;li&gt;Processed chunk identifiers should be stored so the same chunks are not repeatedly processed in any n-hour period.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;This could be done without affecting performance by querying the lock percentage on the individual shards once per minute and pausing compaction I/O whenever it is higher than 80%.  Alternatively, checking iostat&apos;s I/O utilization percentage would indicate whether the chunk&apos;s location can handle more I/O.&lt;/p&gt;

&lt;p&gt;An additional parameter could be a rate limit on the number of chunks processed per minute/hour, or a max percentage of available IO to use.&lt;/p&gt;</description>
                <environment></environment>
        <key id="84879">SERVER-10443</key>
            <summary>Compact command with LOW priority</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="backlog-server-execution">Backlog - Storage Execution Team</assignee>
                                    <reporter username="justanyone">Kevin J. Rice</reporter>
                        <labels>
                            <label>compaction</label>
                            <label>indexing</label>
                            <label>performance</label>
                            <label>sharding</label>
                            <label>storage</label>
                    </labels>
                <created>Tue, 6 Aug 2013 16:28:03 +0000</created>
                <updated>Tue, 6 Dec 2022 05:18:58 +0000</updated>
                            <resolved>Mon, 4 Mar 2019 21:53:09 +0000</resolved>
                                    <version>2.4.3</version>
                                                    <component>Storage</component>
                    <component>Usability</component>
                                        <votes>2</votes>
                                    <watches>9</watches>
                                                                                                                <comments>
                            <comment id="401176" author="justanyone" created="Mon, 12 Aug 2013 17:52:07 +0000"  >&lt;p&gt;I would like to revise this case slightly. &lt;/p&gt;

&lt;p&gt;Base problem(s) to solve:  &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;We cannot compact a database that&apos;s in use without the somewhat-major action of stepping down the primaries.&lt;/li&gt;
	&lt;li&gt;We cannot run compaction when there is a large constant load on Mongo, either.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Optional work associated with this case: data verification.&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Verify that documents are sorted into shard-key order within each chunk, so as to speed disk IO and reduce IOPS for batch jobs that update each document in shard-key order.&lt;/li&gt;
	&lt;li&gt;Verify that indexes contain correct references to documents.  This activity might be too hard to coordinate to be worth it, which is why it&apos;s optional.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I&apos;d like to modify the above comments to say that it doesn&apos;t matter how this compaction is done, or in what order, as long as it&apos;s done at a low/background priority. &lt;/p&gt;</comment>
                            <comment id="401048" author="justanyone" created="Mon, 12 Aug 2013 15:23:14 +0000"  >&lt;p&gt;Note that the state of this process could be kept in a single variable containing the shard key we&apos;re concerned with.  This state variable should be re-read at the end of processing every chunk, so no restart of mongos/mongod/etc. is required in order to make it work differently.&lt;/p&gt;

&lt;p&gt;OPTIONAL FOLLOW-ON CASE:  Perform Deferred splitChunks.&lt;/p&gt;

&lt;p&gt;If the chunk is oversized, it should be split as it would have been normally.   This is only a consideration because it is possible during high-load situations (e.g., mongorestore) to have a chunk that is larger than the standard 64 MB.  Normally the split-chunk logic is called only when the chunk is written to, but if we already have the chunk in memory, we might as well perform any chunk splits required.&lt;/p&gt;


&lt;p&gt;OPTIONAL FOLLOW-ON CASE:  Like the balancer, this compact command could be turned on and off for a set of time periods (so we could run it during specific overnight hours only).&lt;/p&gt;

</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10320">
                    <name>Documented</name>
                                                                <inwardlinks description="is documented by">
                                        <issuelink>
            <issuekey id="336373">DOCS-9559</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="12188">SERVER-1256</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25136"><![CDATA[Storage Execution]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 19 Aug 2013 01:59:27 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        10 years, 27 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>alexander.golin@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            10 years, 27 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-execution</customfieldvalue>
            <customfieldvalue>justanyone</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrmkev:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr7i1j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7219</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hspddz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>