<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:32:57 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-54220] Implement $sample fallback in $_internalUnpackBucket.</title>
                <link>https://jira.mongodb.org/browse/SERVER-54220</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;If the heuristics are below a determined threshold, the $sample pushdown algorithm should default to the top-k sorting algorithm used in $sample currently. This would unpack all of the buckets and proceed with the top-k sorting algo.&lt;/p&gt;


&lt;p&gt;We should also explore the heuristic space, determine a minimal set of significant variables and implement a threshold switch to toggle the fall-back or optimized $sample algorithm in $_internalUnpackBucket.&#160;&lt;/p&gt;</description>
                <environment></environment>
        <key id="1609517">SERVER-54220</key>
            <summary>Implement $sample fallback in $_internalUnpackBucket.</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="david.storch@mongodb.com">David Storch</assignee>
                                    <reporter username="eric.cox@mongodb.com">Eric Cox</reporter>
                        <labels>
                            <label>qexec-team</label>
                    </labels>
                <created>Tue, 2 Feb 2021 21:17:02 +0000</created>
                <updated>Sun, 29 Oct 2023 21:58:05 +0000</updated>
                            <resolved>Mon, 12 Apr 2021 21:42:28 +0000</resolved>
                                    <version>5.0 Required</version>
                                    <fixVersion>5.0.0-rc0</fixVersion>
                                    <component>Querying</component>
                                        <votes>0</votes>
                                    <watches>2</watches>
                                                                                                                <comments>
                            <comment id="3714113" author="xgen-internal-githook" created="Mon, 12 Apr 2021 21:17:16 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;Eric Cox&apos;, &apos;email&apos;: &apos;eric.cox@mongodb.com&apos;, &apos;username&apos;: &apos;ericox&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-54220&quot; title=&quot;Implement $sample fallback in $_internalUnpackBucket.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-54220&quot;&gt;&lt;del&gt;SERVER-54220&lt;/del&gt;&lt;/a&gt; Change heuristic for bailing out of SAMPLE_FROM_TIMESERIES_BUCKET plan&lt;/p&gt;

&lt;p&gt;If the requested sample size for a $sample against a timeseries&lt;br/&gt;
collection exceeds 1% of the maximum possible number of measurements&lt;br/&gt;
(numBuckets * maxMeasurementsPerBucket), then we never consider using&lt;br/&gt;
the ARHASH sampling algorithm.&lt;/p&gt;

&lt;p&gt;Co-authored-by: David Storch &amp;lt;david.storch@mongodb.com&amp;gt;&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/4741ef2d5ca6a0809c2569cc55142bb1f7ed7547&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/4741ef2d5ca6a0809c2569cc55142bb1f7ed7547&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10320">
                    <name>Documented</name>
                                                                <inwardlinks description="is documented by">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 9 Apr 2021 16:02:29 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        2 years, 43 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_17052" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Downstream Changes Summary</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>This is not exactly a change, but rather an aspect of the new time-series collection feature that is relevant to downstream teams, in particular enterprise tools. This ticket, along with others, implemented an optimized algorithm for sampling from a time-series collection with $sample. The optimized algorithm relies on the storage engine&amp;#39;s random cursor capability.&lt;br/&gt;
&lt;br/&gt;
This ticket introduced the rule that if the requested $sample exceeds 0.01 * numBuckets * timeseriesBucketMaxCount, then the system will not consider the optimized plan. See related tickets &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-54221&quot; title=&quot;Implement $sample pushdown into $_internalUnpackBucket &quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-54221&quot;&gt;&lt;strike&gt;SERVER-54221&lt;/strike&gt;&lt;/a&gt; and &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-55215&quot; title=&quot;Handle small measurement counts in buckets for ARHASH&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-55215&quot;&gt;&lt;strike&gt;SERVER-55215&lt;/strike&gt;&lt;/a&gt;. </customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16942"><![CDATA[Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-1945</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            2 years, 43 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>david.storch@mongodb.com</customfieldvalue>
            <customfieldvalue>eric.cox@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hysti7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hyeepj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="4466">Query Execution 2021-03-22</customfieldvalue>
    <customfieldvalue id="4468">Query Execution 2021-04-05</customfieldvalue>
    <customfieldvalue id="4470">Query Execution 2021-04-19</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_17051" key="com.atlassian.jira.plugin.system.customfieldtypes:multicheckboxes">
                        <customfieldname>Teams Impacted</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16944"><![CDATA[Docs]]></customfieldvalue>
    <customfieldvalue key="20960"><![CDATA[DBX: DevTools (Compass, Shell, VS Code Ext)]]></customfieldvalue>
    <customfieldvalue key="20959"><![CDATA[Charts]]></customfieldvalue>
    <customfieldvalue key="20961"><![CDATA[SQL Engines (Atlas SQL + BIC)]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hysfrb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>