<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:45:34 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-37293] Refactor Sorter so that DocumentSourceGroup can use it instead of SortedFileWriter implementation</title>
                <link>https://jira.mongodb.org/browse/SERVER-37293</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;document_source_group.h/cpp skips the Sorter interface and re-implements some of the logic in order to use the SortedFileWriter class directly. This appears to have been done to avoid the sorting phase of Sorter because DocumentSourceGroup has already sorted data.&lt;/p&gt;

&lt;p&gt;Therefore, we should expose an option on the Sorter to handle pre-sorted data, and remove the duplicate code from document_source_group.h/cpp. Additionally, document_source_group.h/cpp forces sorter.h to make internal structs public that would otherwise be private to sorter.cpp because it needs to use them directly.  FileDeleter and SorterFileInfo (added by &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-17010&quot; title=&quot;Reduce file handle usage in File based Sorter&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-17010&quot;&gt;&lt;del&gt;SERVER-17010&lt;/del&gt;&lt;/a&gt;) should be moved into the cpp file along with the interface change.&lt;/p&gt;</description>
                <environment></environment>
        <key id="609131">SERVER-37293</key>
            <summary>Refactor Sorter so that DocumentSourceGroup can use it instead of SortedFileWriter implementation</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="12300">Won&apos;t Do</resolution>
                                        <assignee username="dan.solnik@mongodb.com">Daniel Solnik</assignee>
                                    <reporter username="dianna.hohensee@mongodb.com">Dianna Hohensee</reporter>
                        <labels>
                            <label>neweng</label>
                    </labels>
                <created>Mon, 24 Sep 2018 17:55:48 +0000</created>
                <updated>Fri, 21 Jun 2019 13:42:23 +0000</updated>
                            <resolved>Fri, 21 Jun 2019 13:41:37 +0000</resolved>
                                                                    <component>Storage</component>
                                        <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="2292790" author="dianna.hohensee" created="Fri, 21 Jun 2019 13:41:37 +0000"  >&lt;p&gt;This change would require significant query code changes, which also could potentially affect performance, so we are not going to do it. DocumentSourceGroup maintains a private &lt;a href=&quot;https://github.com/mongodb/mongo/blob/874f1e2c64c1500e17c79b86fb61d7777541cf39/src/mongo/db/pipeline/document_source_group.h#L92&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;GroupMap&lt;/a&gt; (&lt;a href=&quot;https://github.com/mongodb/mongo/blob/874f1e2c64c1500e17c79b86fb61d7777541cf39/src/mongo/db/pipeline/document_source_group.h#L256&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;_groups&lt;/a&gt;) that &lt;a href=&quot;https://github.com/mongodb/mongo/blob/874f1e2c64c1500e17c79b86fb61d7777541cf39/src/mongo/db/pipeline/document_source_group.cpp#L470&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;DocumentSourceGroup::initialize()&lt;/a&gt; fills from a &lt;a href=&quot;https://github.com/mongodb/mongo/blob/874f1e2c64c1500e17c79b86fb61d7777541cf39/src/mongo/db/pipeline/document_source_group.cpp#L474&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;pSource&lt;/a&gt; and periodically &lt;a href=&quot;https://github.com/mongodb/mongo/blob/874f1e2c64c1500e17c79b86fb61d7777541cf39/src/mongo/db/pipeline/document_source_group.cpp#L481&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;spills to disk when it reaches a certain size&lt;/a&gt;. DocumentSourceGroup::spill() performs its own stable_sort on the contents of _groups (which appears to be sorted by id) and then translates the _groups data structure entries into a format for disk. 
Then DocumentSourceGroup::spill() uses a SortedFileWriter and SortedFileWriter::addAlreadySorted() to spill to disk. It would be difficult to move this functionality into the Sorter interface, and may hurt performance, as Dan described in more detail above.&lt;/p&gt;</comment>
                            <comment id="2289857" author="dan.solnik" created="Wed, 19 Jun 2019 14:52:30 +0000"  >&lt;p&gt;It seems that DocumentSourceGroup has two separate code paths depending on whether or not the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/46e086d2093798cdec949eba2919ccac88719166/src/mongo/db/pipeline/document_source_group.h#L253&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;groups&lt;/a&gt;&#160;were spilled to disk.&lt;br/&gt;
 When the data is spilled to disk it is serialized (the accumulators put into their value form) and so it needs to be deserialized later.&lt;br/&gt;
 However, if the data is never spilled to disk it is never serialized and so there is no need to deserialize it.&lt;br/&gt;
 This can be seen in &lt;a href=&quot;https://github.com/mongodb/mongo/blob/46e086d2093798cdec949eba2919ccac88719166/src/mongo/db/pipeline/document_source_group.cpp#L586&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;DocumentSourceGroup::spill()&lt;/a&gt;&#160;(where the data is serialized into value form) and in &lt;a href=&quot;https://github.com/mongodb/mongo/blob/46e086d2093798cdec949eba2919ccac88719166/src/mongo/db/pipeline/document_source_group.cpp#L135&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;DocumentSourceGroup::getNext()&lt;/a&gt;&#160;(which uses a different iteration method depending on whether or not it needs to deserialize, either &lt;a href=&quot;https://github.com/mongodb/mongo/blob/46e086d2093798cdec949eba2919ccac88719166/src/mongo/db/pipeline/document_source_group.cpp#L157&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;getNextSpilled&lt;/a&gt;&#160;or &lt;a href=&quot;https://github.com/mongodb/mongo/blob/46e086d2093798cdec949eba2919ccac88719166/src/mongo/db/pipeline/document_source_group.cpp#L191&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;getNextStandard&lt;/a&gt;).&lt;br/&gt;
 The class therefore executes different code depending on whether sorting spills to disk or can be done in memory, a distinction the current Sorter interface doesn&#8217;t support.&lt;/p&gt;

&lt;p&gt;One way to use the Sorter interface in DocumentSourceGroup would be to always serialize the accumulators, add each (id, accumulators) pair to the sorter, sort by id, and then deserialize the results from the sorter. However, this change could cause a performance hit when a spill to disk is not necessary (when the groups can be kept in memory), because currently this serialization is not done in that case.&#160;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10020">
                    <name>Gantt Dependency</name>
                                                                <inwardlinks description="has to be done after">
                                        <issuelink>
            <issuekey id="180264">SERVER-17010</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 19 Jun 2019 14:52:30 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 33 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>dianna.hohensee@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 33 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>dan.solnik@mongodb.com</customfieldvalue>
            <customfieldvalue>dianna.hohensee@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hu8q6n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr787z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="2984">Execution Team 2019-06-17</customfieldvalue>
    <customfieldvalue id="3034">Execution Team 2019-07-01</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hu8cfz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>