<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 07:57:37 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[DOCS-9120] Explain that using the aggregation pipeline instead of collection.distinct() is a better equivalent to SQL &quot;SELECT DISTINCT x FROM ..&quot;</title>
                <link>https://jira.mongodb.org/browse/DOCS-9120</link>
                <project id="10380" key="DOCS">Documentation</project>
                    <description>&lt;p&gt;A customer who didn&apos;t know about aggregation pipelines, or knew about them but maybe saw them as being too powerful, was trying to do the equivalent of &quot;SELECT DISTINCT field1 FROM ...&quot;.&lt;/p&gt;

&lt;p&gt;The documentation he used was &lt;a href=&quot;https://docs.mongodb.com/manual/reference/method/db.collection.distinct/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.com/manual/reference/method/db.collection.distinct/&lt;/a&gt;. I suspect he may have also read &lt;a href=&quot;https://docs.mongodb.com/manual/reference/sql-comparison/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.com/manual/reference/sql-comparison/&lt;/a&gt;, which gives the example of &quot;SELECT DISTINCT(status) FROM users&quot; == &quot;db.users.distinct( &quot;status&quot; )&quot;&lt;/p&gt;

&lt;p&gt;He ran into the problem that he had hundreds of thousands of distinct values in the collection he was examining, and this led to &lt;em&gt;&quot;BSONObj size: 25448908 (0x18451CC) is invalid. Size must be between 0 and 16793600(16MB)&quot;&lt;/em&gt; when he tried to use the distinct command. That is, the distinct values weren&apos;t a small set, so the result exceeded 16MB and packing the result array threw an exception when it reached the maximum BSON object size.&lt;/p&gt;

&lt;p&gt;Neither the db.collection.distinct page nor &lt;a href=&quot;https://docs.mongodb.com/manual/reference/command/distinct/#dbcmd.distinct&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.com/manual/reference/command/distinct/#dbcmd.distinct&lt;/a&gt; for the underlying db command highlights that the single array result means a 16MB result size limit. Nor do they explain that, to avoid that limit and get a cursor instead (like a find command, which is the equivalent of the SQL SELECT command), you should use an aggregation pipeline per the example at &lt;a href=&quot;https://docs.mongodb.com/manual/reference/operator/aggregation/group/#retrieve-distinct-values&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.com/manual/reference/operator/aggregation/group/#retrieve-distinct-values&lt;/a&gt;.&lt;/p&gt;
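
&lt;p&gt;As a rough illustration of the cursor-based approach, here is a minimal sketch written for Node.js. It assumes a collection object that exposes the MongoDB Node.js driver&apos;s aggregate() method; the helper names distinctPipeline and forEachDistinct are hypothetical and not part of any MongoDB API. Note that $group treats an array-valued field as a single value, whereas distinct considers each array element separately, so the two are not identical for array fields.&lt;/p&gt;

```javascript
// Hypothetical helper: builds a pipeline with one $group stage that
// emits one document per distinct value of `field`.
function distinctPipeline(field) {
  return [{ $group: { _id: "$" + field } }];
}

// Hypothetical helper: streams the distinct values through a cursor, so no
// single 16MB result document is ever built. `collection` is assumed to be a
// Node.js driver Collection (or any object whose aggregate() returns an
// async iterable of result documents).
async function forEachDistinct(collection, field, onValue) {
  const cursor = collection.aggregate(distinctPipeline(field));
  let count = 0;
  for await (const doc of cursor) {
    onValue(doc._id); // each result document carries one distinct value in _id
    count += 1;
  }
  return count; // number of distinct values seen
}
```

&lt;p&gt;With a connected driver, usage might look like forEachDistinct(db.collection(&quot;users&quot;), &quot;status&quot;, console.log), where &quot;users&quot; and &quot;status&quot; are only example names.&lt;/p&gt;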

&lt;p&gt;I think it&apos;s time we start describing the distinct command / wrapper method as a &lt;b&gt;convenience method with limited result size&lt;/b&gt; and point to the aggregation pipeline method ( &lt;em&gt;&quot;db.myCollection.aggregate( [ { $group : { _id : &quot;$myField&quot; } } ] )&quot;&lt;/em&gt; ) as the fundamental way of fetching distinct values.&lt;/p&gt;</description>
                <environment></environment>
        <key id="322324">DOCS-9120</key>
            <summary>Explain that using the aggregation pipeline instead of collection.distinct() is a better equivalent to SQL &quot;SELECT DISTINCT x FROM ..&quot;</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="allison.moore@mongodb.com">Allison Reinheimer Moore</assignee>
                                    <reporter username="akira.kurogane">Akira Kurogane</reporter>
                        <labels>
                    </labels>
                <created>Sun, 9 Oct 2016 22:00:59 +0000</created>
                <updated>Mon, 30 Oct 2023 21:23:42 +0000</updated>
                            <resolved>Wed, 31 Jan 2018 14:24:25 +0000</resolved>
                                                    <fixVersion>Server_Docs_20231030</fixVersion>
                                    <component>manual</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="1791463" author="xgen-internal-githook" created="Thu, 1 Feb 2018 05:09:41 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;email&apos;: &apos;allison.moore@10gen.com&apos;, &apos;name&apos;: &apos;Allison Moore&apos;, &apos;username&apos;: &apos;schmalliso&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/DOCS-9120&quot; title=&quot;Explain that using the aggregation pipeline instead of collection.distinct() is a better equivalent to SQL &amp;quot;SELECT DISTINCT x FROM ..&amp;quot;&quot; class=&quot;issue-link&quot; data-issue-key=&quot;DOCS-9120&quot;&gt;&lt;del&gt;DOCS-9120&lt;/del&gt;&lt;/a&gt;: emphasize BSON limit on distinct, propose agg alternative&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/docs/commit/00a3e5b99ccc47afced56c65685a087e7349b6eb&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/docs/commit/00a3e5b99ccc47afced56c65685a087e7349b6eb&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="1789859" author="akira.kurogane" created="Tue, 30 Jan 2018 22:14:12 +0000"  >&lt;p&gt;Thanks Allison. Please excuse the nitpicking. Code review LGTM&apos;ed just now.&lt;/p&gt;</comment>
                            <comment id="1786614" author="akira.kurogane" created="Fri, 26 Jan 2018 23:57:24 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=allison.moore&quot; class=&quot;user-hover&quot; rel=&quot;allison.moore&quot;&gt;allison.moore&lt;/a&gt;. Code review update for &lt;a href=&quot;https://mongodbcr.appspot.com/177940001/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://mongodbcr.appspot.com/177940001/&lt;/a&gt; made.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_13552" key="com.go2group.jira.plugin.crm:crm_generic_field">
                        <customfieldname>Case</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[[500A000000VDmtGIAT, 500A000000Wn1uoIAB]]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 25 Jan 2018 16:45:19 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 2 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_14876" key="com.atlassian.jira.plugin.system.customfieldtypes:userpicker">
                        <customfieldname>Docs Reviewer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>rob.justice@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>emet.ozar@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 2 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>akira.kurogane</customfieldvalue>
            <customfieldvalue>allison.moore@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrmezb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hsqnfz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="1324">KANBAN BUCKET</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10555" key="com.atlassian.jira.plugin.system.customfieldtypes:float">
                        <customfieldname>Story Points</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0.25</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrz3x3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                </customfields>
    </item>
</channel>
</rss>