<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 06:14:01 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-69646] [Optimization] Consider making analyzeShardKey command calculate correlation coefficient in batches</title>
                <link>https://jira.mongodb.org/browse/SERVER-69646</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;The check for monotonicity in the analyzeShardKey command currently relies on &lt;a href=&quot;https://github.com/10gen/mongo/blob/9938a453d8de4c52d8de2080caaa9fe9405442ba/src/mongo/db/s/analyze_shard_key_cmd_util.cpp#L350-L386&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;calculating the correlation coefficient&lt;/a&gt; between the WiredTiger RecordIds (i.e. y) in the index store and&#160; 1, ..., N (i.e. x) where N is the number of recordIds. Given that each RecordId is 8-byte, for a collection has 10 million unique shard key values, the check would involve storing ~2 * 80MB of x and y in memory. While the check should be fast, the memory usage can still have a non-negligible impact on the server. To this end, we should consider calculating the correlation coefficient in batches of size N&apos; where N&apos; is more manageable. This paper describes a way average correlation coefficients (r), specifically by transforming the r values using a Fisher&apos;s z transformation and then taking the average of the z values and converting it back to an r value.&lt;/p&gt;</description>
                <environment></environment>
        <key id="2135377">SERVER-69646</key>
            <summary>[Optimization] Consider making analyzeShardKey command calculate correlation coefficient in batches</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="1" iconUrl="https://jira.mongodb.org/images/icons/statuses/open.png" description="">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="backlog-server-cluster-scalability">Backlog - Cluster Scalability</assignee>
                                    <reporter username="cheahuychou.mao@mongodb.com">Cheahuychou Mao</reporter>
                        <labels>
                    </labels>
                <created>Tue, 13 Sep 2022 16:04:45 +0000</created>
                <updated>Tue, 12 Dec 2023 16:00:29 +0000</updated>
                                                                                                <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="5478623" author="JIRAUSER1261874" created="Tue, 6 Jun 2023 16:29:38 +0000"  >&lt;p&gt;I have lost the original reference I found, but I could find [META-ANALYSIS OF CORRELATION&lt;br/&gt;
COEFFICIENTS: A MONTE CARLO COMPARISON&lt;br/&gt;
OF FIXED- AND RANDOM-EFFECTS METHODS |&lt;a href=&quot;https://core.ac.uk/download/pdf/2708611.pdf&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://core.ac.uk/download/pdf/2708611.pdf&lt;/a&gt;.]&lt;/p&gt;

&lt;p&gt;and [A note on combining correlations|&lt;a href=&quot;https://link.springer.com/content/pdf/10.3758/BF03334158.pdf&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://link.springer.com/content/pdf/10.3758/BF03334158.pdf&lt;/a&gt;|https://link.springer.com/content/pdf/10.3758/BF03334158.pdf].=] which describe a few ways this can be accomplished.&lt;/p&gt;

&lt;p&gt;Note that the Z-transform method mentioned in the description probably refers to the Silver &amp;amp; Dunlap&#160; (1987) paper.&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="2111911">SERVER-68753</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="26583"><![CDATA[Cluster Scalability]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 6 Jun 2023 16:29:38 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        35 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>dbeng-pm-bot</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            35 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>adi.zaimi@mongodb.com</customfieldvalue>
            <customfieldvalue>backlog-server-cluster-scalability</customfieldvalue>
            <customfieldvalue>cheahuychou.mao@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i1a0fb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i0qjvg:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i19mkn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>