[SERVER-828] Support for selecting array elements in return specifier (projection) Created: 25/Mar/10  Updated: 06/Apr/23  Resolved: 15/Jun/12

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 2.1.2

Type: New Feature Priority: Major - P3
Reporter: Michael Dirolf Assignee: Ben Becker
Resolution: Done Votes: 172
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-831 Positional Operator Matching Nested A... Closed
depends on SERVER-1243 New operator to update all matching i... Closed
is depended on by DOCS-422 Add doc for $elemMatch operator Closed
Duplicate
is duplicated by SERVER-11347 Dot notation in Projections not working Closed
is duplicated by SERVER-3094 Applying Match Criteria to Selected F... Closed
is duplicated by SERVER-1608 Retrieving only selected document(s) ... Closed
Related
related to SERVER-1013 positional $ operator field mismatch Closed
related to SERVER-7785 Cannot project embedded object field ... Closed
related to SERVER-142 Read-only views over collection data. Closed
related to SERVER-2238 New projection operator $elemMatch Closed
is related to SERVER-447 new aggregation framework Closed
is related to SERVER-3089 Ability to make use of a subdocument'... Closed
Participants:

 Description   

Useful when querying on an embedded array and wanting only the matching element returned.

leaving to eliot to decide version



 Comments   
Comment by akilesh heerah [ 26/Oct/20 ]

@Joseff Betancourt

Hello,

Have you been able to find a solution that returns all the matching elements?

Comment by Dissatisfied Former User [ 28/Mar/13 ]

The implemented solution does not resolve this ticket. Users expected, in effect, to treat the child array as a collection and get back multiple filtered results from that "child collection". The implemented solution works for the translation use case, but not for returning a document whose child array contains only active-flagged sub-documents (my use case).

Comment by Adam Crabtree [ 28/Jan/13 ]

Sorry if this is not an appropriate place for comment, but the description for the issue and the implemented solution seem to be a mismatch. I think what most people are hoping for is something like the $elemMatch projection, but for all matching array elements, not just the first. The reason it seems confusing and inconsistent probably has something to do with the $elemMatch selector returning multiple documents that match, whereas the projection returns only 1 element that matches.

Confusion aside, is there any plan, or an existing issue that could be linked, to allow projections for array subsets?
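
To make the distinction raised in this comment concrete, here is a minimal in-memory sketch in plain JavaScript (it does not talk to MongoDB). `projectFirstMatch` mimics the first-match behavior of the `$elemMatch` projection as described in this thread, and `projectAllMatches` is the behavior commenters are asking for. Both helper names and the sample document are hypothetical.

```javascript
// Hypothetical sample document; field names are illustrative only.
const doc = {
  _id: 1,
  task: [
    { status: 1, n: "a" },
    { status: 2, n: "b" },
    { status: 1, n: "c" },
  ],
};

// First-match behavior: keep at most one matching element,
// and omit the field entirely when nothing matches.
function projectFirstMatch(document, field, predicate) {
  const { [field]: elements = [], ...rest } = document;
  const first = elements.find(predicate);
  return first === undefined ? rest : { ...rest, [field]: [first] };
}

// Requested behavior: keep every matching element.
function projectAllMatches(document, field, predicate) {
  const elements = document[field] || [];
  return { ...document, [field]: elements.filter(predicate) };
}

const isActive = (t) => t.status === 1;
console.log(projectFirstMatch(doc, "task", isActive).task.length); // 1
console.log(projectAllMatches(doc, "task", isActive).task.length); // 2
```

For reference, later server releases added a `$filter` aggregation expression (3.2 and later, to my knowledge) that returns all matching elements of an array, which is closer to `projectAllMatches` than to the projection implemented for this ticket.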

Comment by Joseff Betancourt [ 19/Oct/12 ]

$elemMatch only returns the first match... what if I wanted all matches on the area field in the following example? Should there be an option to pull all results?

{
_id: ObjectId(),
someField: "Value",
0: [ { city: "Manhattan", area: "New York City" } ],
1: [ { city: "Bronx", area: "New York City" } ],
2: [ { city: "Boston", area: "New England" } ],
3: [ { city: "Providence", area: "New England" } ],
4: [ { city: "Newton", area: "New England" } ]
}

Comment by Scott Hernandez (Inactive) [ 31/Aug/12 ]

Docs for $elemMatch (projection) are here:
http://docs.mongodb.org/manual/reference/projection/elemMatch/

Comment by auto [ 30/Jul/12 ]

Author:

{u'date': u'2012-07-27T20:11:08-07:00', u'email': u'jeff.yemin@10gen.com', u'name': u'Jeff Yemin'}

Message: More tests for SERVER-828 and SERVER-2238
Branch: master
https://github.com/mongodb/mongo/commit/83ec59844bdd629b2b32a9791a4e7a0e93516c02

Comment by auto [ 15/Jun/12 ]

Author:

{u'date': u'2012-06-15T15:04:27-07:00', u'email': u'ben.becker@10gen.com', u'name': u'Ben Becker'}

Message: SERVER-828: additional test and cleanup
Branch: master
https://github.com/mongodb/mongo/commit/372ef0946ff5dae940ebc518c85bdbd6338535cf

Comment by auto [ 15/Jun/12 ]

Author:

{u'date': u'2012-06-15T10:47:32-07:00', u'email': u'ben.becker@10gen.com', u'name': u'Ben Becker'}

Message: SERVER-828: updated tests for sharding
Branch: master
https://github.com/mongodb/mongo/commit/51222ca1bb0a7464bfe2039545e9e91b8f6e6c81

Comment by Ben Becker [ 15/Jun/12 ]

Note that SERVER-831 and SERVER-1243 are required for multiple arrays in the query specifier and multiple positional ($) operators in the projection specifier.

Comment by Ben Becker [ 15/Jun/12 ]

Pushed while JIRA/aws was down:
https://github.com/mongodb/mongo/commit/fb66c84bc7bc1ece63a65766bfea2f797f3b7121

Comment by Colin Mollenhour [ 22/Feb/12 ]

@free, I'm fully aware that MongoDB is schema-free; I'm not sure what you're trying to say.

The JIRA is not a support forum so please continue discussion on the google group if needed, but here is the answer to your question:

db.foo.aggregate(
  {$match:   {"task.status": 1}},                    // filter parent documents
  {$unwind:  "$task"},                               // unwind the embedded docs for filtering
  {$match:   {"task.status": 1}},                    // filter subdocs
  {$group:   {_id: "$_id", task: {$push: "$task"}}}  // group subdocs back into array in parent doc
);

Note that while the first $match is not necessary, it will improve performance if task.status is indexed or if the set of matching documents is typically smaller than the full set. See docs for explanation.

I could be wrong, but the second match will not use an index although this could possibly be optimized later. Chris?
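
The four stages of the pipeline above can be traced with a small in-memory emulation in plain JavaScript. This is only a sketch of what each stage does to the data, not how the server implements it; the helper names and the sample documents (modeled on the ones in the question, but with distinct `_id` values) are assumptions.

```javascript
// Hypothetical sample documents with distinct _id values.
const docs = [
  { _id: 1, a: 1, task: [{ status: 1 }, { status: 2 }] },
  { _id: 2, a1: 1, task: [{ status: 2 }] },
  { _id: 3, a2: 1, task: [{ status: 1 }, { status: 1 }] },
];

// First $match on "task.status": keep parents where any element matches.
const matchParents = (ds) =>
  ds.filter((d) => d.task.some((t) => t.status === 1));

// $unwind: emit one document per array element.
const unwind = (ds) =>
  ds.flatMap((d) => d.task.map((t) => ({ ...d, task: t })));

// Second $match: after unwinding, task is a single subdocument.
const matchSubdocs = (ds) => ds.filter((d) => d.task.status === 1);

// $group with $push: collect subdocuments per _id.
// Only _id and task survive; other parent fields are dropped.
function groupById(ds) {
  const groups = new Map();
  for (const d of ds) {
    if (!groups.has(d._id)) groups.set(d._id, { _id: d._id, task: [] });
    groups.get(d._id).task.push(d.task);
  }
  return [...groups.values()];
}

const result = groupById(matchSubdocs(unwind(matchParents(docs))));
console.log(JSON.stringify(result));
```

The emulation also shows why the grouped output carries only `_id` and `task`: the grouping step keeps nothing else unless more fields are added to it.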

Comment by free [ 22/Feb/12 ]

@Colin Mollenhour
Given docs like these:
{_id:1,a:1,b:2,c:3,task:[{status:1},{status:2}]}
{_id:1,a1:1,b1:2,xy:3,task:[{status:1},{status:2}]}
{_id:1,a2:1,b2:2,mn:3,task:[{status:1},{status:2}]}
If I want to find all docs where task.status=1 and return each doc together with its qualifying subdocs,
how do I write the aggregate command?

MongoDB is schema-free, unlike MySQL.

Comment by Colin Mollenhour [ 21/Feb/12 ]

@free
The parent doc field is _id. Not sure what you're wanting, but that's the beauty of using the aggregate command, you can shape the results however you want.

Regarding performance I haven't done any large tests, but my basic test took 0ms (used shell to run command then profiler to get time in ms) so at least the overhead isn't high. I'm sure the aggregation framework will only continue to get faster as well.

The primary concern when using the aggregation framework is the size limit of the returned document (technically the results must fit in one document, which is currently limited to 16 MB). However, as long as your queries are sane I don't think this will be a problem.

Here is a working test script (requires 2.1.0): https://gist.github.com/1879457

Comment by free [ 17/Feb/12 ]

@Colin Mollenhour
Your aggregate's result has no parent doc fields; it only has _id and the embedded docs.
And the performance may not be good enough.

Comment by Colin Mollenhour [ 14/Feb/12 ]

There are some caveats with this feature request. Some examples:

If you want to filter the documents by one criterion (say "status") and the embedded documents by another (say "comments.timestamp"), you run into the situation where a document may not have any embedded documents that match, but you still want the parent document in the results.

Another is if you want to exclude certain fields like "changelog" but use the positional operator to include only matched embedded docs. Currently mixing includes and excludes is prohibited so you'd have to convert your excludes to includes on the application side.

Thankfully, we have the aggregation framework now (2.1) which would let you filter parent documents independently of embedded documents and also use include/exclude on the parent document fields while still returning only matched embedded documents. The output can be made to match exactly what this feature request is asking for with far greater flexibility.

Example:

db.foo.aggregate(
  {$match:   {status: "published"}},                        // filter parent documents
  {$project: {changelog: 0}},                               // exclude fields from parent
  {$unwind:  "$comments"},                                  // unwind the embedded docs for filtering
  {$match:   {$or:[{comments:{$exists:false}},              // match documents with no comments
    {"comments.timestamp": {$gte: lastWeek}}]}},            // filter comments
  {$group:   {_id: "$_id", comments: {$push: "$comments"}}} // group comments back into parent doc
);

So is this feature request now obsolete?
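
As a way of pinning down the target result described in this comment, here is a sketch in plain JavaScript of the desired output rather than of the pipeline itself: parents filtered by status, `changelog` excluded, comments filtered by timestamp, and parents kept even when no comment matches. The function name, the cutoff date, and the sample data are all hypothetical.

```javascript
// Hypothetical cutoff standing in for the lastWeek variable above.
const lastWeek = new Date("2012-02-07T00:00:00Z");

// Hypothetical sample documents.
const posts = [
  {
    _id: 1,
    status: "published",
    changelog: ["edited"],
    comments: [
      { text: "old", timestamp: new Date("2012-01-01T00:00:00Z") },
      { text: "new", timestamp: new Date("2012-02-10T00:00:00Z") },
    ],
  },
  { _id: 2, status: "published", changelog: [], comments: [] },
  { _id: 3, status: "draft", changelog: [], comments: [] },
];

// Filter parents and embedded comments independently.
function publishedWithRecentComments(ps, cutoff) {
  return ps
    .filter((p) => p.status === "published") // filter parent documents
    .map(({ changelog, ...rest }) => ({      // exclude changelog
      ...rest,
      // keep only recent comments; an empty array keeps the parent visible
      comments: (rest.comments || []).filter((c) => c.timestamp >= cutoff),
    }));
}

const shaped = publishedWithRecentComments(posts, lastWeek);
console.log(shaped.map((p) => [p._id, p.comments.length]));
```

Here the parent with no matching comments stays in the result with an empty `comments` array, which is the case the first caveat in this comment is concerned with.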

Comment by Rodrigo Coelho [ 05/Jan/12 ]

Would be nice for my use case: multilanguage documents.
Support for $in would be even better, for fallback support to the default language.

Comment by Flavio Percoco [ 26/Sep/11 ]

@alex

That's something you'll have to be aware of. If you have a collection that has documents with such arrays you'll have to be careful when retrieving those fields. AFAIT

Comment by Alex Fomin [ 03/Jul/11 ]

And how about performance?
Will the performance loss for find queries be noticeable for large amounts of data (nearly 50 elements in each array and about 100000 arrays)?

Comment by M@ Keller [ 16/Jun/11 ]

This morning, actually, I lazily blogged about this before looking in Jira (which is not linked off the mongodb site that I could find, I had to use Google).

http://www.matthewgkeller.com/blog/2011/06/16/dear-mongodb/

Comment by Chris Westin [ 15/Jun/11 ]

The use of $unwind in the new aggregation framework will make it possible to be selective about which of several descendant objects are returned.

Comment by Keith Branton [ 30/Apr/11 ]

Also... will this work if I need one document from one array and one from another array - in the same query - or would I have to do two queries?

Comment by Keith Branton [ 30/Apr/11 ]

Will this be able to return several embedded documents if $in is used in the query? Being able to bring back only one is a huge win over current behavior, but being able to bring back all the ones I want (but only the ones I want) in a single query - now that would be cool!

Comment by Karoly Negyesi [ 28/Dec/10 ]

Yes, this would be very useful because sometimes you have quite big arrays (and with the planned 32MB you will have even bigger) and it's pointless to get all that and convert to a PHP structure.

Comment by Wes [ 28/Apr/10 ]

It would be nice if a flag existed for indicating whether the whole matching branch or just the matching elements would be returned. Example of behavior shown at http://groups.google.com/group/mongodb-user/browse_thread/thread/a51dcffdea389ef4

Comment by Michael Dirolf [ 28/Mar/10 ]

Should think about whether this should just be done w/ virtual collections SERVER-142 - suppose it's possible that you might want to get a field from the top level document and the embedded document, but there is definitely some overlap in use case between these tickets.

Generated at Thu Feb 08 02:55:17 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.