<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 02:57:19 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-1545] make single command for size and median</title>
                <link>https://jira.mongodb.org/browse/SERVER-1545</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Want to have 1 command &quot;shouldSplitAndMedian??&quot; that figures out if we should split, and if so, the split point.&lt;br/&gt;
While doing this, also want to change the way we determine if we should split.&lt;/p&gt;

&lt;p&gt;Instead of actually counting data, just want to walk the index and assume each object is the average object size.&lt;br/&gt;
This will make it much faster, and also not require all the data to fit in RAM.&lt;/p&gt;

&lt;p&gt;Also, should make it yield, just in case it has to page in the index.&lt;/p&gt;</description>
                <environment></environment>
        <key id="12818">SERVER-1545</key>
            <summary>make single command for size and median</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="alerner">Alberto Lerner</assignee>
                                    <reporter username="eliot">Eliot Horowitz</reporter>
                        <labels>
                    </labels>
                <created>Tue, 17 Aug 2010 21:20:44 +0000</created>
                <updated>Mon, 5 Jun 2017 22:41:36 +0000</updated>
                            <resolved>Tue, 14 Sep 2010 19:05:30 +0000</resolved>
                                                    <fixVersion>1.7.0</fixVersion>
                                    <component>Sharding</component>
                                        <votes>2</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="18165" author="auto" created="Tue, 14 Sep 2010 19:07:44 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;login&apos;: &apos;erh&apos;, &apos;name&apos;: &apos;Eliot Horowitz&apos;, &apos;email&apos;: &apos;eliot@10gen.com&apos;}
&lt;p&gt;Message: dataSize has an estimate option, chunk uses this &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-1545&quot; title=&quot;make single command for size and median&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-1545&quot;&gt;&lt;del&gt;SERVER-1545&lt;/del&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://github.com/mongodb/mongo/commit/d5793fa9a8281b349e0fc1ddd09acc9d1a084055&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://github.com/mongodb/mongo/commit/d5793fa9a8281b349e0fc1ddd09acc9d1a084055&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="17750" author="alerner" created="Fri, 3 Sep 2010 14:36:21 +0000"  >&lt;p&gt;Right now, we still need to rely on the dataSize and medianKey commands. The former makes the decision to split; the latter picks where to split. This ticket made dataSize much faster because it now uses an estimated chunk size rather than computing it by scanning the mapped files.&lt;/p&gt;

&lt;p&gt;The attempt to use that estimated size and create a single command &amp;#8211; this command is in fact splitVector &amp;#8211; failed. The datasize varies according to the extent size in a datafile, which grows in increasing strides. Computing split points by assuming each object is datasize/numRecs was very imprecise and led to irregular chunk sizes.&lt;/p&gt;

&lt;p&gt;We could make splits even faster by keeping a statistical summary of the keys per chunk, but our testing showed that the estimated datasize already gives excellent results.&lt;/p&gt;</comment>
                            <comment id="17749" author="auto" created="Fri, 3 Sep 2010 14:22:01 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;login&apos;: &apos;alerner&apos;, &apos;name&apos;: &apos;Alberto Lerner&apos;, &apos;email&apos;: &apos;alerner@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-1545&quot; title=&quot;make single command for size and median&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-1545&quot;&gt;&lt;del&gt;SERVER-1545&lt;/del&gt;&lt;/a&gt; don&apos;t switch shards if current is best (FCoJ)&lt;br/&gt;
&lt;a href=&quot;http://github.com/mongodb/mongo/commit/8b24b1e719fbfe4c7ace9e42f6535fa7fd948462&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://github.com/mongodb/mongo/commit/8b24b1e719fbfe4c7ace9e42f6535fa7fd948462&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When inserting into an empty collection, we could see the auto-splitting code switch shards right at the second chunk. That would leave the first chunk in, say, shard0 and the following ones in shard1.&lt;/p&gt;

&lt;p&gt;The reason this happened is that Shard::Pick() assumed the best shard was the first one it got from the config DB. If the current one was not first, and tied with it, Pick() would switch.&lt;/p&gt;</comment>
                            <comment id="17621" author="auto" created="Tue, 31 Aug 2010 16:06:22 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;login&apos;: &apos;alerner&apos;, &apos;name&apos;: &apos;Alberto Lerner&apos;, &apos;email&apos;: &apos;alerner@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-1545&quot; title=&quot;make single command for size and median&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-1545&quot;&gt;&lt;del&gt;SERVER-1545&lt;/del&gt;&lt;/a&gt; when moving a chunk, we dont want to risk getting the same shard&lt;br/&gt;
&lt;a href=&quot;http://github.com/mongodb/mongo/commit/ee65b0cc858ffc1ef68ab1a8850f658602b85926&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://github.com/mongodb/mongo/commit/ee65b0cc858ffc1ef68ab1a8850f658602b85926&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="17333" author="alvin" created="Wed, 25 Aug 2010 00:31:50 +0000"  >&lt;p&gt;From Matt Levy&lt;/p&gt;


&lt;p&gt;The master log file showed the following:&lt;/p&gt;

&lt;p&gt;Tue Aug 24 17:08:58 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn10&amp;#93;&lt;/span&gt; insert choc.events 233ms&lt;br/&gt;
Tue Aug 24 17:08:58 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn23&amp;#93;&lt;/span&gt; insert choc.events 167ms&lt;br/&gt;
Tue Aug 24 17:08:58 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn13&amp;#93;&lt;/span&gt; insert choc.events 263ms&lt;br/&gt;
Tue Aug 24 17:08:58 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn13&amp;#93;&lt;/span&gt; insert choc.events 270ms&lt;br/&gt;
Tue Aug 24 17:08:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn23&amp;#93;&lt;/span&gt; insert choc.events 321ms&lt;br/&gt;
Tue Aug 24 17:08:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn10&amp;#93;&lt;/span&gt; insert choc.events 299ms&lt;br/&gt;
Tue Aug 24 17:08:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn13&amp;#93;&lt;/span&gt; insert choc.events 387ms&lt;br/&gt;
Tue Aug 24 17:09:00 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn13&amp;#93;&lt;/span&gt; insert choc.events 284ms&lt;br/&gt;
Tue Aug 24 17:09:06 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn12&amp;#93;&lt;/span&gt; Finding median for index: { _id: 1.0 } between { : &quot;5a135bd6-b074-c44f-e52e-6c4e57ffd7e1&quot; } and { : &quot;5c17839c-2da2-67d9-4eda-7fdec6063f4c&quot; } took 6292 ms.&lt;br/&gt;
Tue Aug 24 17:09:06 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn12&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: { medianKey: &quot;choc.events&quot;, keyPattern: { _id: 1.0 }, min: { _id: &quot;5a135bd6-b074-c44f-e52e-6c4e57ffd7e1&quot; }, max: { _id: &quot;5c17839c-2da2-67d9-4eda-7fdec6063f4c&quot; } } reslen:112 6585ms&lt;/p&gt;</comment>
                            <comment id="17230" author="auto" created="Tue, 24 Aug 2010 05:26:07 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;login&apos;: &apos;erh&apos;, &apos;name&apos;: &apos;Eliot Horowitz&apos;, &apos;email&apos;: &apos;eliot@10gen.com&apos;}
&lt;p&gt;Message: dataSize has an estimate option, chunk uses this &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-1545&quot; title=&quot;make single command for size and median&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-1545&quot;&gt;&lt;del&gt;SERVER-1545&lt;/del&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://github.com/mongodb/mongo/commit/9a9eb885349ea4644a30adebcb82f995764a9e88&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://github.com/mongodb/mongo/commit/9a9eb885349ea4644a30adebcb82f995764a9e88&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="17228" author="auto" created="Tue, 24 Aug 2010 02:53:21 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;login&apos;: &apos;alerner&apos;, &apos;name&apos;: &apos;Alberto Lerner&apos;, &apos;email&apos;: &apos;alerner@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-1545&quot; title=&quot;make single command for size and median&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-1545&quot;&gt;&lt;del&gt;SERVER-1545&lt;/del&gt;&lt;/a&gt; splitVector now takes ranges.&lt;br/&gt;
&lt;a href=&quot;http://github.com/mongodb/mongo/commit/9e951e98b44c7da200d60cc200735e6213d8ebe8&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://github.com/mongodb/mongo/commit/9e951e98b44c7da200d60cc200735e6213d8ebe8&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="17225" author="auto" created="Tue, 24 Aug 2010 01:06:47 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;login&apos;: &apos;alerner&apos;, &apos;name&apos;: &apos;Alberto Lerner&apos;, &apos;email&apos;: &apos;alerner@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-1545&quot; title=&quot;make single command for size and median&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-1545&quot;&gt;&lt;del&gt;SERVER-1545&lt;/del&gt;&lt;/a&gt; Fix test.&lt;br/&gt;
&lt;a href=&quot;http://github.com/mongodb/mongo/commit/cd9d7218227cebb8220c09aa6a78ac4cb0d2fd94&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://github.com/mongodb/mongo/commit/cd9d7218227cebb8220c09aa6a78ac4cb0d2fd94&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="17224" author="auto" created="Tue, 24 Aug 2010 00:00:31 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;login&apos;: &apos;alerner&apos;, &apos;name&apos;: &apos;Alberto Lerner&apos;, &apos;email&apos;: &apos;alerner@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-1545&quot; title=&quot;make single command for size and median&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-1545&quot;&gt;&lt;del&gt;SERVER-1545&lt;/del&gt;&lt;/a&gt; Leave new chunks half-full instead of at 90%.&lt;br/&gt;
&lt;a href=&quot;http://github.com/mongodb/mongo/commit/2af7fa477c6a0764e0af9def5ed7a2fdd0cdfe47&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://github.com/mongodb/mongo/commit/2af7fa477c6a0764e0af9def5ed7a2fdd0cdfe47&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="17223" author="auto" created="Mon, 23 Aug 2010 23:42:22 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;login&apos;: &apos;alerner&apos;, &apos;name&apos;: &apos;Alberto Lerner&apos;, &apos;email&apos;: &apos;alerner@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-1545&quot; title=&quot;make single command for size and median&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-1545&quot;&gt;&lt;del&gt;SERVER-1545&lt;/del&gt;&lt;/a&gt; Add a fast path to splitVector command.&lt;br/&gt;
&lt;a href=&quot;http://github.com/mongodb/mongo/commit/4bcf64d3d132acaf61580743a5d121bf4d90a002&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://github.com/mongodb/mongo/commit/4bcf64d3d132acaf61580743a5d121bf4d90a002&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10320">
                    <name>Documented</name>
                                                                <inwardlinks description="is documented by">
                                        <issuelink>
            <issuekey id="389449">DOCS-10339</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 23 Aug 2010 23:42:22 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        13 years, 23 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>emily.hall</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            13 years, 23 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>alerner</customfieldvalue>
            <customfieldvalue>alvin</customfieldvalue>
            <customfieldvalue>auto</customfieldvalue>
            <customfieldvalue>eliot</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrpi6v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hriiov:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>21680</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht0i3r:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>