<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:46:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-18210] Add query for document structure</title>
                <link>https://jira.mongodb.org/browse/SERVER-18210</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Please add a query that will return the &quot;structure&quot; of the document, instead of the data. The structure will allow an application to understand the data that is embedded in the document, and then construct a query that will be efficient.&lt;/p&gt;

&lt;p&gt;In concept, this is similar to the JDBC meta data query.&lt;/p&gt;

&lt;p&gt;Given the schema-less nature of Mongo, I&apos;m not sure that there is a perfect solution. The basic idea is to summarize the data. In theory, constructing a JSON schema from the data will work, but this is an unrealistic approach.&lt;/p&gt;

&lt;p&gt;Some ideas:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Replace STRING values with &quot;STRING(N)&quot; (N is size)&lt;/li&gt;
	&lt;li&gt;Replace ARRAY with ARRAY of TYPE. If all entries have same type (e.g., number, string) it will be ARRAY[N] of type. Otherwise ARRAY[N] of object.&lt;/li&gt;
	&lt;li&gt;Replace BLOB with BLOB&amp;#40;n&amp;#41;.&lt;/li&gt;
	&lt;li&gt;Replace array of OBJECTS, with ARRAYS of ([ list of available attributes] )&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Motivation:&lt;br/&gt;
When processing documents from existing repositories whose full structure is unknown, applications are forced to load complete documents just to find out what data is available.&lt;/p&gt;

&lt;p&gt;Having the ability to get (some) metadata will reduce the amount of data that is loaded by a factor of 100X for our application.&lt;/p&gt;</description>
                <environment></environment>
        <key id="200291">SERVER-18210</key>
            <summary>Add query for document structure</summary>
                <type id="2" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14711&amp;avatarType=issuetype">New Feature</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="backlog-server-query">Backlog - Query Team</assignee>
                                    <reporter username="yair.lenga@gmail.com">Yair Lenga</reporter>
                        <labels>
                            <label>expression</label>
                            <label>stage</label>
                    </labels>
                <created>Sat, 25 Apr 2015 11:13:22 +0000</created>
                <updated>Tue, 6 Dec 2022 04:52:32 +0000</updated>
                            <resolved>Mon, 18 Dec 2017 16:12:37 +0000</resolved>
                                                                    <component>Aggregation Framework</component>
                    <component>Querying</component>
                                        <votes>0</votes>
                                    <watches>15</watches>
                                                                                                                <comments>
                            <comment id="1626597" author="asya" created="Wed, 19 Jul 2017 17:53:44 +0000"  >&lt;p&gt;As we now have $objectToArray and $arrayToObject expressions along with $type, $size, $strLenCP, etc. I think this ticket can be closed.&lt;/p&gt;</comment>
                            <comment id="1196337" author="charlie.swanson" created="Tue, 8 Mar 2016 15:07:29 +0000"  >&lt;p&gt;I think a better way to achieve this desired outcome would be to provide a way to get the keys out of an object, and possibly to reconstruct an object. If you can manipulate the field names and the corresponding values of an object, then the rest of the summarization could be done using &lt;tt&gt;$size&lt;/tt&gt;, &lt;tt&gt;$strLen&lt;/tt&gt; (code points or bytes, see &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-14670&quot; title=&quot;Add expressions to determine the length of a string&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-14670&quot;&gt;&lt;del&gt;SERVER-14670&lt;/del&gt;&lt;/a&gt;), or &lt;tt&gt;$type&lt;/tt&gt; (&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-13447&quot; title=&quot;provide $projection operator to get type of field&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-13447&quot;&gt;&lt;del&gt;SERVER-13447&lt;/del&gt;&lt;/a&gt;). For example, I think this could be accomplished by something like these expressions:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;tt&gt;$unwindObject&lt;/tt&gt; (work for a similar expression is being tracked under &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-11392&quot; title=&quot;$unwind on subdocuments&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-11392&quot;&gt;&lt;del&gt;SERVER-11392&lt;/del&gt;&lt;/a&gt;) - Takes an object and returns a vector of tuples (key, value) for the object.&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;$constructObject&lt;/tt&gt; - Takes a vector of tuples (key, value) and constructs an object.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;With those expressions, one could unwind an object, then do a &lt;tt&gt;$map&lt;/tt&gt; over the (key, value) pairs, replacing the value with some summary of the value, then reconstruct the object with the new values.&lt;/p&gt;

&lt;p&gt;Obviously I haven&apos;t fully thought through what those would look like, but I think those would be more generally useful than an expression to summarize the data.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=asya&quot; class=&quot;user-hover&quot; rel=&quot;asya&quot;&gt;asya&lt;/a&gt; does this sound reasonable to you?&lt;br/&gt;
&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=yair.lenga%40gmail.com&quot; class=&quot;user-hover&quot; rel=&quot;yair.lenga@gmail.com&quot;&gt;yair.lenga@gmail.com&lt;/a&gt; can you confirm this would satisfy your use case?&lt;/p&gt;</comment>
                            <comment id="1193833" author="ramon.fernandez" created="Sat, 5 Mar 2016 00:10:14 +0000"  >&lt;p&gt;For those watching the ticket without knowledge of JIRA and our use of it, this is to let you know that this feature request has been sent to the Query team for consideration in their next round of planning. Any updates to this state will be posted on this ticket.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="898068" author="nevi_me" created="Sat, 25 Apr 2015 13:50:24 +0000"  >&lt;p&gt;I agree with you on the &lt;span class=&quot;error&quot;&gt;&amp;#91;Implement X in server so we don&amp;#39;t implement it ourselves in client&amp;#93;&lt;/span&gt;, and I think there are a number of other features which we as users would love to see. One approach to this is for the server-side scripting to be improved/overhauled with something that could allow us to create procedures/functions in server to achieve what we want. Otherwise 3 years down the line we&apos;ll have lots of arbitrary functions that shouldn&apos;t be in core, or people doing what you&apos;re trying to achieve not getting requested features and moving elsewhere.&lt;/p&gt;

&lt;p&gt;I think the &lt;span class=&quot;error&quot;&gt;&amp;#91;server-side scripting&amp;#93;&lt;/span&gt; bit would reduce the load on the Mongo team in the long-run because they wouldn&apos;t need to maintain a lot of extra functions.&lt;/p&gt;

&lt;p&gt;I would love to move a number of my scripts from the app into the server so I remove the network time cost that I incur every time I run certain queries&lt;/p&gt;</comment>
                            <comment id="898058" author="yair.lenga@gmail.com" created="Sat, 25 Apr 2015 13:20:51 +0000"  >&lt;p&gt;I believe that the major benefit of this feature is reduction in the amount of data that will be TRANSFERRED from the server to the client. By moving the functionality into the MongoDB server, the same amount of work will reduce the size of the data.&lt;/p&gt;

&lt;p&gt;I agree that detecting the structure of the whole collection is big. I hope to have the ability to summarize the structure of a small subset, one document at a time. For my cases (large time series embedded in ~4MB documents), the metadata to describe the 4MB set was less than 4K (with int&lt;span class=&quot;error&quot;&gt;&amp;#91;5000&amp;#93;&lt;/span&gt; in the metadata, representing ~32K of int data). I will be happy with this saving, leaving the much harder problem (metadata for the collection as a whole) for the future.&lt;/p&gt;

&lt;p&gt;For a server-based application, where the bulk of the processing is done on the Web server (Java, in my case), performing the processing in MongoDB will reduce the time for encoding, network transfer, and decoding, as well as the memory requirement of the document. In my case, we noticed that this transfer/parse time is where most of the time is spent. Reading the data in Mongo seems to be a small fraction of the time.&lt;/p&gt;

&lt;p&gt;For a browser-based application, where the data is transported through the internet into a JavaScript application, I believe the saving is going to be significantly higher, as network transfer rates are an order of magnitude slower than a server-to-MongoDB connection, in addition to the encryption/decryption overhead.&lt;/p&gt;

&lt;p&gt;As far as caching, I can only comment about my planned usage - creating large time-based data sets: the calls are performed in response to interactive user requests against ranges that the user needs. The critical thing is the time to deliver that data. It is unlikely that different users will ask for the structure of the same documents - I&apos;m not sure if caching will help my specific application.&lt;/p&gt;</comment>
                            <comment id="898050" author="nevi_me" created="Sat, 25 Apr 2015 12:33:56 +0000"  >&lt;p&gt;Won&apos;t this force Mongo to also load complete documents (entire disk read?) in order to return the structure? What happens when there are say 1 million documents with some differences to the extent that 10 or more different schemas exist?&lt;/p&gt;

&lt;p&gt;If it is possible to efficiently create the feature, the structure could perhaps be cached so that an entire collection scan is not performed each time the query is run.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="200288">SERVER-18207</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="200289">SERVER-18208</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="200290">SERVER-18209</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="128832">SERVER-13447</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25143"><![CDATA[Query]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Sat, 25 Apr 2015 12:33:56 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 30 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>alexander.golin@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 30 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>asya.kamsky@mongodb.com</customfieldvalue>
            <customfieldvalue>backlog-server-query</customfieldvalue>
            <customfieldvalue>charlie.swanson@mongodb.com</customfieldvalue>
            <customfieldvalue>nevi_me</customfieldvalue>
            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
            <customfieldvalue>yair.lenga@gmail.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrl7ef:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr9dmv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrir9r:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>