[SERVER-142] Read-only views over collection data. Created: 10/Jul/09  Updated: 06/Apr/23  Resolved: 31/Aug/16

Status: Closed
Project: Core Server
Component/s: Usability
Affects Version/s: None
Fix Version/s: 3.3.12

Type: New Feature Priority: Major - P3
Reporter: Eliot Horowitz (Inactive) Assignee: Backlog - Storage Execution Team
Resolution: Done Votes: 318
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-283 Push on objects in virtual collection Closed
Documented
is documented by DOCS-10547 Docs for SERVER-142: Read-only views ... Closed
Duplicate
is duplicated by SERVER-10787 Read-only views Closed
is duplicated by SERVER-22733 Need SQL View like capability .. Closed
is duplicated by SERVER-20968 Non-materialized views Closed
Related
related to SERVER-19153 Conditionally push $match before $pro... Closed
related to SERVER-22760 Sharded aggregation pipelines which i... Closed
is related to SERVER-10788 Writable views Backlog
is related to SERVER-828 Support for selecting array elements ... Closed
Assigned Teams:
Storage Execution
Backwards Compatibility: Fully Compatible
Participants:

 Description   
Issue Status as of Aug 31, 2016

ISSUE SUMMARY
MongoDB 3.3.12 adds support for creating read-only views over existing collections or other views. To specify or define a view, MongoDB introduces the viewOn and pipeline options to the existing create command:

db.runCommand( { create: <view>, viewOn: <source>, pipeline: <pipeline> } )

In addition, there's also a mongo shell helper db.createView():

db.createView(<view>, <source>, <pipeline>)

Views are readable via the following commands:

For more information on views, as well as examples, please see the Read-only Views documentation.

Original description

Support for read-only views will consist of providing a mechanism for binding a namespace name to a (namespace name, query) pair, where the query might be a MongoDB query or an aggregation expression. For example, if you had a collection "housing.apartments", you might create a view

housing.cheapApartments = (housing.apartments, { rent: { $lte: 1000 } })

Finds on cheapApartments would only consider those elements of housing.apartments where the "rent" field was at most 1000.
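
As a rough sketch, the binding above could be expressed with the create command options described in the issue summary. The helper name buildCreateViewCommand is hypothetical and only builds the command document; it is written in plain JavaScript so it runs without a server.

```javascript
// Hypothetical helper: builds the command document for the `create` command
// using the viewOn and pipeline options from the issue summary.
function buildCreateViewCommand(view, source, pipeline) {
  return { create: view, viewOn: source, pipeline: pipeline };
}

// The cheapApartments view: apartments whose rent is at most 1000.
const cmd = buildCreateViewCommand("cheapApartments", "apartments", [
  { $match: { rent: { $lte: 1000 } } },
]);

// In the mongo shell, connected to the "housing" database, this document
// would be passed to db.runCommand(cmd), or equivalently:
//   db.createView("cheapApartments", "apartments", [{ $match: { rent: { $lte: 1000 } } }])
console.log(JSON.stringify(cmd));
```

The query from the original notation becomes a single $match stage, since a view definition takes an aggregation pipeline rather than a bare query document.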

By using an aggregation expression with an unwind stage, you could produce a view over a database that had one document for every member of an array in an input document, providing another means to examine and query embedded documents.
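
To make the unwind idea concrete, here is a plain-JavaScript model of what such a stage produces. It is an illustration only, not the server's implementation; the function name unwind and the sample data are assumptions, and documents whose field is not an array are simply skipped here.

```javascript
// Simplified model of an unwind stage: emit one output document per array
// element, copying the other fields of the input document.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    // Simplification: skip documents where the field is missing or not an array.
    if (!Array.isArray(value)) continue;
    for (const element of value) {
      out.push({ ...doc, [field]: element });
    }
  }
  return out;
}

// Hypothetical sample data.
const posts = [
  { _id: 1, title: "first", comments: ["a", "b"] },
  { _id: 2, title: "second", comments: ["c"] },
  { _id: 3, title: "third", comments: [] },
];

const rows = unwind(posts, "comments");
console.log(rows);
```

Each element of an input array becomes its own document, so a view defined this way could be queried element by element.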



 Comments   
Comment by rakesh patil [ 08/May/13 ]

This is my structure

{
  _id: "akdjfka",
  name: "kjahkfaj",
  test: [
    {
      subject: "maths",
      score: "80",
      comments: [
        {name: "kerry", comment: "good score"},
        {name: "paul", comment: "keep it up"}
      ]
    },
    {
      subject: "science",
      score: "96",
      comments: [
        {name: "jerry", comment: "awesome "},
        {name: "pinto", comment: "how do u manage to do that ? "}
      ]
    }
  ]
}

I have a different table for each of these in MySQL, but in MongoDB I was able to create this sort of structure. However, I was not able to filter out only the comment by "kerry".
For example, with a query like

db.scores.find({_id: "akdjfka", "test.subject": "maths", "test.comments.name": "paul"})
I should be able to get

{name: "paul", comment: "keep it up"}

With this feature I would certainly move my database to Mongo.

If this is already possible, please let me know.
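
One possible approach, sketched here with the aggregation framework discussed elsewhere in this thread, is to unwind both array levels and match on the comment name. The collection name scores is taken from the query above; the output field names are an assumption, and the pipeline is shown as a plain JavaScript value so it can be inspected without a server.

```javascript
// Sketch only: a pipeline intended to return just the matching comment.
const pipeline = [
  { $match: { _id: "akdjfka" } },
  { $unwind: "$test" },                       // one document per test entry
  { $match: { "test.subject": "maths" } },
  { $unwind: "$test.comments" },              // one document per comment
  { $match: { "test.comments.name": "paul" } },
  // Keep only the fields of the comment itself.
  { $project: { _id: 0, name: "$test.comments.name", comment: "$test.comments.comment" } },
];

// In the mongo shell this would be run as: db.scores.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```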

Comment by Harry Mexxian [ 31/Jul/12 ]

+1

Would be nice not to have to worry about embedded docs getting too large, and having to move them back out to a traditional relational schema.

Would it be possible to also have virtual indices for the virtual collections? So as not to fall back to O(n) when seeking a specific sub-doc.

Maybe in 2.3? It does have over 200 votes!

Comment by Eliot Horowitz (Inactive) [ 14/Dec/11 ]

I don't think the aggregation framework means this feature won't get done as well.
Different use cases in my mind.

Comment by Colin Mollenhour [ 14/Dec/11 ]

10gen folks, is it safe to say this feature is basically being superseded by the new aggregation framework (SERVER-447)? It seems using $project and $unwind will give the same results being asked for here only differently and with far more flexibility. If that is the case perhaps this ticket should just be closed.

Before anyone gets upset, I am strongly in favor of the idea of virtual collections and would use them heavily in my own projects. I am just seeing that the aggregation framework appears to be well underway (2.1.0, yay!) and provides functionality that overlaps the "virtual collection" concept and so much more. That is, pursuing this ticket would really be a waste of time. Given the number of votes, though, it would be nice for everyone to know your intentions.

Using the example in the description of the ticket, it seems a wrapper around the aggregation framework could be written that looks like this:

// equivalent to:
// db.foo.$bar.find(query);
virtualFind("foo","bar",query);
 
function virtualFind(coll, field, query){
  var project = {_id:1};
  project[field] = 1;
  return db.runCommand({ aggregate : coll, pipeline : [
    { $match : query },
    { $project : project },
    { $unwind : "$"+field },
    { $match : query }
  ]});
}
// returns 
[
  { _id: <parent ObjectId>,
    bar: <subdocument>
  },{ _id: <parent ObjectId>,
    bar: <subdocument>
  }
]

This example makes it pretty obvious that the aggregation framework is very flexible and can be bent to fit everyone's needs (parent document fields, filtering, sorting, etc).

Comment by Gregg Carrier [ 02/Dec/11 ]

I could really use this feature in almost every project I build with Mongo. Without it, "schema" designs start to get pretty relational and a lot of the benefits of a document database tend to be lost. Would love to see this as soon as possible.

Comment by Jamie Carl [ 11/Nov/11 ]

I'm new to MongoDB and even though I'm really getting into this embedded document thing as it is, I have already come across a use-case for this. I really wish it was implemented already.

I'm planning on using it to group user accounts. Rather than the relational way of having an account with a reference to a group record, I'd like to have a group document which contains all its member users (users can only be members of one group). This works as is, but when I try to query a user, I get the whole group document and all the embedded user info, which I don't want.

Comment by Eugen [ 08/Nov/11 ]

ok - i will be more careful in the future )

Comment by Michael Parrish [ 08/Nov/11 ]

@Eugen:
It's worth noting that email notifications are generated every time a comment is added or edited.
Maybe try the preview button below the comment box?

Comment by Eugen [ 08/Nov/11 ]

example collection

 
db.devices:[
{
  _id:834f93874, 
  somefield1:'value1',
  somefield2:'value2',
  data:[
    {
      itemfield1:'value3',
      itemfield2:8 // some number 
    },{
      itemfield1:'value3',
      itemfield2:87 // some number 
    },{
      itemfield1:'value3',
      itemfield2:75 // some number 
    },{
      itemfield1:'value3',
      itemfield2:54 // some number 
    }
  ]
},{
  _id:834f93875, 
  somefield1:'value1',
  somefield2:'value2',
  data:[
    {
      itemfield1:'value3',
      itemfield2:4 // some number 
    },{
      itemfield1:'value3',
      itemfield2:5 // some number 
    },{
      itemfield1:'value3',
      itemfield2:6 // some number 
    },
  ]
}
]
 
// for example i need query which result should be like that:
 
[
{
  _id:834f93874, 
  somefield1:'value1', // selected by value1
  somefield2:'value2',
  data:[
    {
      itemfield1:'value3',
      itemfield2:87 // maximal number from all data items
    }
  ]
},{
  _id:834f93875, 
  somefield1:'value1', // selected by value1
  somefield2:'value2',
  data:[
    {
      itemfield1:'value3',
      itemfield2:6 // maximal number from all data items
    },
  ]
}
]

i think it would be very useful if a query could use embedded submethods like this:

query with subquery

db.devices.find( 
  {'somefield1':'value1'}, 
  {
    somefield1:1, 
    somefield2:1, 
    data:find(
     {},
     {itemfield1:1,itemfield2:1}
    ).sort({itemfield2:-1}).limit(1)
  } 
)

because an array of objects can be interpreted as an embedded collection,
so why can't we apply any collection methods to the subcollection (and its subcollections, and...)?
to treat 'data' as an embedded collection, it may be necessary to specify 'data' as a special index type, named 'collection', such as

'collection' index

db.devices.ensureIndex({data:'collection'}) 

for indexing all 'data' members by an automatically generated _id
i think this is the right way: all collections, basic or embedded, would be queried by the same rules

Comment by Bwon [ 06/Nov/11 ]

It'd be nice, if nothing else, to specify where clauses on embedded documents with large arrays. (Example: a range of comments under a blog post)

Comment by Puneet Chhabra [ 04/Aug/11 ]

Martin, $slice works as long as you know the array index (or range) of elements that you are looking for, not when you want to query elements based on their values.

Comment by Thomas Tucker [ 04/Aug/11 ]

I suggest that when you query a virtual collection, it should automatically return an _id field consisting of one of the following:
1) The base document _id.
2) The base document _id and any subdocument _id fields (for subdocument _id fields that are not unique, or unique only within the base document).
3) The subdocument _id (if there is a unique index on the subdocument _id field).

This way you can update a subdocument in a virtual collection and keep the document collection free of unnecessary _id fields.

Comment by Martin Lazarov [ 04/Aug/11 ]

$slice works this way: it returns only part of the embedded array

Comment by Andre [ 04/Mar/11 ]

Ethan, I agree, but unfortunately my lame example is only semi-close and nowhere near the size of the dataset I am generating.

I can be looking at 300-2000+ small embedded documents inside each "data generation run". Hence pulling back that large of a document (including sub docs) just sort of sucks.

It would be much easier if the server could provide the embedded document back as a result, with bonus #WINNING points for the ability to provide documents before/after (for time-series data sets). I attempted to describe this use case in the above contrived example.

Comment by Ethan Gunderson [ 26/Jan/11 ]

I agree that virtual collections would be useful, but if you're not finding embedded documents useful without it, I'd have to question what you are embedding...

Comment by Andrei [ 26/Jan/11 ]

I too think that's essential and should be looked upon as a priority goal.
The main benefit that MongoDB has over relational databases (in my opinion), which is "embedding" rather than referencing data, isn't really useful without virtual collections.

Comment by Andre [ 26/Jan/11 ]

This would be extremely valuable if the query could return either the selected value, or a range before/after or before+after the selected value.

This would make things a bit smoother for several use cases I have run into. Almost all have to do with bundling up elements of linear data, and wanting to store the metadata and the produced data in one spot.

A quick extremely contrived example...

db.example.findOne()
{
  "_id" : ObjectId("<id goes here>"),
  "meta1" : "some meta data",
  "meta2" : "more meta data!",
  "data" : [
    { "x" : 1123121, "ts" : 1238901903, "oid" : 12312, "iid" : [22, 33, 41], "loc" : "alleyway", "text" : "I hate you spot", "action" : "yell spot " },
    { "x" : 1123122, "ts" : 1238903333, "oid" : 12312, "iids" : [22, 33, 41], "loc" : "street", "text" : "BREAK YO SELF!!!", "action" : "take spot.girl" },
    .... (many other such documents would go into this one document)
  ]
}

It would be nice to simply return the matching element. It would be even better if i could specify a positive or negative range around the matches to grab the "before/after" or just the "before" or just the "after" on each match.

something like

db.actions.find({meta1: 'some meta data', "data.action" : $filter{ -2,+2 { "loc" : "street" } }})

Where $filter can be whatever the function ends up being named, and the above query would give me the two elements before and the two elements after each match.
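
The before/after window described above can be modeled client-side in plain JavaScript. This is only a sketch of the requested behavior: windowAround and the sample data are hypothetical, and it does not imply that any such operator exists in the server.

```javascript
// Return every element that matches the predicate, plus up to `before`
// elements preceding it and `after` elements following it, in original order.
function windowAround(items, predicate, before, after) {
  const keep = new Set();
  items.forEach((item, i) => {
    if (!predicate(item)) return;
    const start = Math.max(0, i - before);
    const end = Math.min(items.length - 1, i + after);
    for (let j = start; j <= end; j++) keep.add(j);
  });
  // Overlapping windows are merged because indices are kept in a set.
  return items.filter((_, i) => keep.has(i));
}

// Hypothetical embedded "data" array.
const data = [
  { x: 1, loc: "alleyway" },
  { x: 2, loc: "alleyway" },
  { x: 3, loc: "park" },
  { x: 4, loc: "street" },
  { x: 5, loc: "park" },
  { x: 6, loc: "alleyway" },
  { x: 7, loc: "alleyway" },
];

const result = windowAround(data, (d) => d.loc === "street", 2, 2);
console.log(result.map((d) => d.x));
```

Doing this on the client still requires fetching the whole parent document, which is the cost the comment above is asking the server to avoid.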

Comment by alex [ 03/Jan/11 ]

Please implement this feature. It's very essential for databases.

Comment by Harald Lapp [ 29/Dec/10 ]

i would really love to have this feature and from the votes it seems, that it's the second most wanted feature for mongodb. as there seems to be no progress on this i wonder, if it is the technical complexity which keeps it from being implemented ... ?

Comment by Eliot Horowitz (Inactive) [ 24/Aug/10 ]

No fear - $push/$pull definitely will not be going anywhere.

Comment by Daniel Friesen [ 24/Aug/10 ]

Seconded, don't remove $push/$pull. There are cases where you may want to $push/$pull and also do something like $set something in another part of the doc. Replacing them with virtual collections would mean you would be forced to make two separate non-atomic changes. (It would also make for a horrible experience for non-JS drivers.)

Comment by Wouter [ 24/Jul/09 ]

I don't see how virtual collections would remove the need for $push/$pull.

Think about tagging for example:

db.articles.update( {_id:1}, { $push : { tags : 'database' }} );

Please don't remove $push/$pull

Generated at Thu Feb 08 02:53:09 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.