[CSHARP-497] chaining .Take() operators doesn Created: 15/Jun/12  Updated: 20/Jul/12  Resolved: 19/Jul/12

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: 1.4.2
Fix Version/s: 1.6

Type: Improvement Priority: Major - P3
Reporter: Igor Udovichenko Assignee: Craig Wilson
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Comments   
Comment by auto [ 16/Jul/12 ]

Author: Craig Wilson <craiggwilson@gmail.com>, 2012-07-16T08:39:13-07:00

Message: CSHARP-497: added support to not execute a query that has specified a limit value of 0, indicating no results should come back.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/53eca4f72c73ccac83792791d94ee43d83d5047b
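
A minimal sketch of why a zero limit needs a guard. This is not the code from the commit above. In the MongoDB query language a limit of 0 means "no limit", so sending it to the server would return every document instead of none. `ZeroLimitGuard` and the `executeQuery` callback are hypothetical names used only for illustration; the callback stands in for the server round trip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZeroLimitGuard
{
    // skip/limit are the values that would be sent to the server.
    // A null limit means "no limit was specified".
    public static IEnumerable<T> Execute<T>(
        int skip,
        int? limit,
        Func<int, int?, IEnumerable<T>> executeQuery)
    {
        // Take(0) can never yield a result, so skip the round trip entirely.
        if (limit == 0)
        {
            return Enumerable.Empty<T>();
        }

        return executeQuery(skip, limit);
    }
}
```

With this shape, a chain that ends up with a limit of 0 returns an empty sequence without the callback ever being invoked.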

Comment by auto [ 16/Jul/12 ]

Author: Craig Wilson <craiggwilson@gmail.com>, 2012-07-16T08:27:17-07:00

Message: CSHARP-497: amended expectations to not throw when skip or take is specified above the current limit.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/1c25c65c64c86316e456a39e1025b513f47d2e08

Comment by auto [ 16/Jul/12 ]

Author: Craig Wilson <craiggwilson@gmail.com>, 2012-07-16T07:16:08-07:00

Message: CSHARP-497: added support for converting multiple chained skip and take operators.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/441e639ea35880604205d5e4ceea38c2ea6aded5

Comment by Anton Samarskyy [ 17/Jun/12 ]

@Craig,
Thanks for the extensive explanation of the issue. We will try to make some contributions to the project.

Comment by Craig Wilson [ 16/Jun/12 ]

@Anton and @Igor:
I don't believe that any implementation of LINQ, be it NHibernate, Entity Framework, Linq2Sql, etc., actually supports persistence ignorance (PI). The problem with IQueryable is that it is inherently a leaky abstraction. Any time one of these frameworks fails to support an operator, it has failed to provide full persistence ignorance. The only real question is how much is required to support a majority of the needs. You indicate that the need here is in relation to OData. However, that is probably not most people's need, so we must weigh your need and the complexity associated with it against the tickets that are just as important to others. We do accept contributions, as we are an open source project, so if you find an issue you would like to fix, we would love your help and expertise.

In relation to this issue, as Robert is indicating, we can certainly do more than we are now as we apparently have a bug. It is tentatively scheduled to get fixed in 1.6, which is likely some months away as we are preparing now to release 1.5. Even in that amount of time, there is a lot to do and we are limited by both our time and the abilities of the mongo query language.

Finally, supporting certain things from Linq is either impossible or would be highly improper. For instance, we don't support joins because MongoDB does not support joins. This, by all definitions, means we have failed PI. We could support joins client-side, but the performance of the driver would suffer immeasurably. Therefore, we will not be supporting joins and any OData queries that generate a join will fail. I hope this is understandable and acceptable.

Comment by Anton Samarskyy [ 16/Jun/12 ]

What is the purpose of saying that the Mongo driver supports LINQ without persistence ignorance? We would still need to create an extra layer to decorate such behavior. Is there any way to tell the Mongo driver that collection.AsQueryable().Take(10).Skip(20).Take(50) should return the same as LINQ to Objects returns? The main concern is closely related to WebApi+OData, which is becoming more and more popular these days:
http://www.asp.net/web-api/overview/web-api-routing-and-actions/paging-and-querying

I understand it is hard to support really complicated chains of duplicated expressions, but all I need is for the Mongo driver to answer a WebApi+OData service request with the same result as LINQ to Objects would give, and to do so optimally.
Does that make sense?

Comment by Robert Stam [ 15/Jun/12 ]

All I'm saying is that the LINQ semantics for chained sequences of Take/Skip can't always be mapped to an equivalent MongoDB query. I'm not saying we should do the wrong thing!

For example, maybe we can convert .Take(20).Skip(5) into .Skip(5).Take(15)? But how many combinations can we manipulate this way and will we get it wrong sometimes?

That's why I said: "If it gets too complicated I would vote for 1) instead, and throw an exception."

At the very least we should throw an exception if we can't support the Skip and Take combinations encountered.

We know that Skip followed by Take is directly supported by the MongoDB query language, so the question is how to handle all other combinations.
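
One possible way to fold an arbitrary chain into the single Skip-then-Take form the server supports, shown as a rough sketch and not as the driver's actual implementation. `PagingKind` and `SkipTakeNormalizer` are hypothetical names. The idea is that a Skip that follows a Take also eats into the remaining limit, and a Take that follows a Take can only shrink it. Negative counts are clamped to 0, which is how LINQ to Objects treats them.

```csharp
using System;
using System.Collections.Generic;

public enum PagingKind { Skip, Take }

public static class SkipTakeNormalizer
{
    // Folds a chain of Skip/Take calls (in the order written) into one
    // (skip, limit) pair. A null limit means no Take was specified.
    public static (int Skip, int? Limit) Normalize(
        IEnumerable<(PagingKind Kind, int Count)> ops)
    {
        int skip = 0;
        int? limit = null;

        foreach (var (kind, rawCount) in ops)
        {
            // Skip(-1) skips nothing and Take(-1) yields nothing in LINQ to Objects.
            int count = Math.Max(0, rawCount);

            if (kind == PagingKind.Skip)
            {
                skip += count;
                if (limit.HasValue)
                {
                    // Skipping after a Take discards items the Take already admitted.
                    limit = Math.Max(0, limit.Value - count);
                }
            }
            else
            {
                // The smallest Take seen so far wins.
                limit = limit.HasValue ? Math.Min(limit.Value, count) : count;
            }
        }

        return (skip, limit);
    }
}
```

Under these rules Take(20).Skip(5) folds to skip 5, limit 15, matching the example above, and Take(10).Skip(20).Take(50) folds to skip 20, limit 0. A resulting limit of 0 means the chain can never yield anything. The sketch does not guard against integer overflow when many large Skips are summed.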

Comment by Igor Udovichenko [ 15/Jun/12 ]

Interesting. That seems weird to me; it fails persistence ignorance here.

Comment by Robert Stam [ 15/Jun/12 ]

In LINQ to objects, yes.

But in the MongoDB query language the Skip is always done before the Take. So for example, we get the same response from the server whether we specify the limit or the skip first:

> db.test.find().limit(20).skip(5)
{ "_id" : ObjectId("4fdbc0b784f8a93d3b5ea520"), "x" : 6 }
{ "_id" : ObjectId("4fdbc0bb84f8a93d3b5ea521"), "x" : 7 }
{ "_id" : ObjectId("4fdbc0bd84f8a93d3b5ea522"), "x" : 8 }
{ "_id" : ObjectId("4fdbc0bf84f8a93d3b5ea523"), "x" : 9 }
{ "_id" : ObjectId("4fdbc0c284f8a93d3b5ea524"), "x" : 10 }
{ "_id" : ObjectId("4fdbc0c484f8a93d3b5ea525"), "x" : 11 }
{ "_id" : ObjectId("4fdbc0c584f8a93d3b5ea526"), "x" : 12 }
{ "_id" : ObjectId("4fdbc0c884f8a93d3b5ea527"), "x" : 13 }
{ "_id" : ObjectId("4fdbc0ca84f8a93d3b5ea528"), "x" : 14 }
{ "_id" : ObjectId("4fdbc19184f8a93d3b5ea529"), "x" : 15 }
{ "_id" : ObjectId("4fdbc19484f8a93d3b5ea52a"), "x" : 16 }
{ "_id" : ObjectId("4fdbc19884f8a93d3b5ea52b"), "x" : 17 }
{ "_id" : ObjectId("4fdbc19b84f8a93d3b5ea52c"), "x" : 18 }
{ "_id" : ObjectId("4fdbc19d84f8a93d3b5ea52d"), "x" : 19 }
{ "_id" : ObjectId("4fdbc1a284f8a93d3b5ea52e"), "x" : 20 }
{ "_id" : ObjectId("4fdbc1a484f8a93d3b5ea52f"), "x" : 21 }
{ "_id" : ObjectId("4fdbc1a684f8a93d3b5ea530"), "x" : 22 }
{ "_id" : ObjectId("4fdbc1a984f8a93d3b5ea531"), "x" : 23 }
{ "_id" : ObjectId("4fdbc1ab84f8a93d3b5ea532"), "x" : 24 }
{ "_id" : ObjectId("4fdbc1ae84f8a93d3b5ea533"), "x" : 25 }
> db.test.find().skip(5).limit(20)
{ "_id" : ObjectId("4fdbc0b784f8a93d3b5ea520"), "x" : 6 }
{ "_id" : ObjectId("4fdbc0bb84f8a93d3b5ea521"), "x" : 7 }
{ "_id" : ObjectId("4fdbc0bd84f8a93d3b5ea522"), "x" : 8 }
{ "_id" : ObjectId("4fdbc0bf84f8a93d3b5ea523"), "x" : 9 }
{ "_id" : ObjectId("4fdbc0c284f8a93d3b5ea524"), "x" : 10 }
{ "_id" : ObjectId("4fdbc0c484f8a93d3b5ea525"), "x" : 11 }
{ "_id" : ObjectId("4fdbc0c584f8a93d3b5ea526"), "x" : 12 }
{ "_id" : ObjectId("4fdbc0c884f8a93d3b5ea527"), "x" : 13 }
{ "_id" : ObjectId("4fdbc0ca84f8a93d3b5ea528"), "x" : 14 }
{ "_id" : ObjectId("4fdbc19184f8a93d3b5ea529"), "x" : 15 }
{ "_id" : ObjectId("4fdbc19484f8a93d3b5ea52a"), "x" : 16 }
{ "_id" : ObjectId("4fdbc19884f8a93d3b5ea52b"), "x" : 17 }
{ "_id" : ObjectId("4fdbc19b84f8a93d3b5ea52c"), "x" : 18 }
{ "_id" : ObjectId("4fdbc19d84f8a93d3b5ea52d"), "x" : 19 }
{ "_id" : ObjectId("4fdbc1a284f8a93d3b5ea52e"), "x" : 20 }
{ "_id" : ObjectId("4fdbc1a484f8a93d3b5ea52f"), "x" : 21 }
{ "_id" : ObjectId("4fdbc1a684f8a93d3b5ea530"), "x" : 22 }
{ "_id" : ObjectId("4fdbc1a984f8a93d3b5ea531"), "x" : 23 }
{ "_id" : ObjectId("4fdbc1ab84f8a93d3b5ea532"), "x" : 24 }
{ "_id" : ObjectId("4fdbc1ae84f8a93d3b5ea533"), "x" : 25 }
>

Comment by Igor Udovichenko [ 15/Jun/12 ]

Regarding Skip and Take:
Take(20).Skip(5) should return items 6-20, shouldn't it?

Comment by Robert Stam [ 15/Jun/12 ]

Regarding Skip and Take, we probably should require that Skip be before Take, since that's the only combination of Skip and Take the server supports.

I don't really have a problem limiting Skip and Take to combinations that are valid for MongoDB queries. Everything in LINQ to MongoDB is basically limited by what the MongoDB query language supports.

Comment by Robert Stam [ 15/Jun/12 ]

I think LINQ to objects doesn't combine the Takes in any way. As data flows through the pipeline all the Takes are executed, with the net result that the smallest Take ends up determining the final result. For example, the output for:

var source = new int[] { 1, 2, 3, 4, 5, 6, 7, 8 };
Console.WriteLine("Take 2 then 4: {0}", string.Join(", ", source.Take(2).Take(4).Select(n => n.ToString())));
Console.WriteLine("Take 4 then 2: {0}", string.Join(", ", source.Take(4).Take(2).Select(n => n.ToString())));

is

Take 2 then 4: 1, 2
Take 4 then 2: 1, 2

Comment by Craig Wilson [ 15/Jun/12 ]

Throwing an exception would not make things work for our reporter either. However, what we do is already different from what LINQ to Objects does. For instance, we allow Take(20).Skip(20) and send those over the wire as skip() and limit(), not taking the order into account. So, while we would return items 20-40 from MongoDB, LINQ to Objects would return 0 items. So taking just the lowest number is probably just fine since we are already a little off. Perhaps we should enforce that Skip cannot come after a Take?
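
To make the difference concrete, here is a small LINQ to Objects sketch of the two orderings. `TakeSkipOrder` and its method names are hypothetical; the second method only imitates the skip-before-limit order the server applies and does not talk to MongoDB.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TakeSkipOrder
{
    // Order as written in the query: Take admits at most `take` items,
    // then Skip discards up to `skip` of those.
    public static List<int> TakeThenSkip(IEnumerable<int> source, int take, int skip)
        => source.Take(take).Skip(skip).ToList();

    // Order the server effectively applies: skip first, then limit.
    public static List<int> SkipThenTake(IEnumerable<int> source, int skip, int take)
        => source.Skip(skip).Take(take).ToList();
}
```

Given a source of the integers 1 to 100, TakeThenSkip(source, 20, 20) is empty, while SkipThenTake(source, 20, 20) contains the 20 values from 21 to 40.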

Comment by Igor Udovichenko [ 15/Jun/12 ]

Yes, I prefer number 2, not only because LINQ to Objects resolves chained Takes this way, but because collection.Where.Where has similar behavior.

Comment by Robert Stam [ 15/Jun/12 ]

If it gets too complicated I would vote for 1) instead, and throw an exception.

Comment by Craig Wilson [ 15/Jun/12 ]

I like that. We need to be careful about stuff in the middle... Take(40).Skip(20).Take(20) is not the same as Take(20).Skip(20).
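
One way to check cases like these is to run the chain through LINQ to Objects and use the result as a reference. The helper below is a sketch under that assumption; `ChainOracle` and the string-based operator encoding are hypothetical and exist only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainOracle
{
    // Applies each ("Skip" | "Take", count) step in the order given,
    // using the LINQ to Objects operators as the reference semantics.
    public static List<T> Apply<T>(
        IEnumerable<T> source,
        params (string Op, int Count)[] chain)
    {
        IEnumerable<T> current = source;

        foreach (var (op, count) in chain)
        {
            current = op == "Skip" ? current.Skip(count)
                    : op == "Take" ? current.Take(count)
                    : throw new ArgumentException($"Unknown operator: {op}");
        }

        return current.ToList();
    }
}
```

For a source of the integers 1 to 100, Apply(source, ("Take", 40), ("Skip", 20), ("Take", 20)) yields the values 21 to 40, whereas Apply(source, ("Take", 20), ("Skip", 20)) yields nothing, so the operators in the middle cannot simply be dropped.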

Comment by Robert Stam [ 15/Jun/12 ]

Wouldn't chaining multiple Takes in LINQ to objects result in the smallest number being returned?

If so, perhaps the right fix is:

4) Use the smallest of the multiple Takes

Comment by Craig Wilson [ 15/Jun/12 ]

Also sounds like a bug in whatever client is generating your OData query... Regardless, we need to fix this on our side as well.

These are the options. Which one makes the most sense?

1) Throw an exception if multiple takes are specified
2) Use only the first Take
3) Use the last Take <-- this is what is currently being done

I assume you want us to use #2, which is what linq to objects does, not because it checks for this, but rather because it just happens that way.

Comment by Igor Udovichenko [ 15/Jun/12 ]

Sure, it was posted by accident.
The LINQ driver does not support chaining .Take() operations:
MongoCollection collection;
....
(for example, the collection contains 20 elements in the database)

collection.AsQueryable().Take(1).Take(50) - the first Take is ignored, and the resulting collection contains 20 elements instead of one,
which leads to wrong behavior in C# Web API:
http://stackoverflow.com/questions/11005686/odata-top-doesnt-work-with-mongodb

Comment by Craig Wilson [ 15/Jun/12 ]

Igor, could you complete your statement title and add a description? I'm not sure what you are reporting...
