[CSHARP-283] Ambiguity in MongoDB.Driver.Builders.Update.AddToSetEach methods Created: 25/Jul/11  Updated: 02/Apr/15  Resolved: 05/Aug/11

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: 1.1
Fix Version/s: 1.2

Type: Improvement Priority: Minor - P4
Reporter: Roman Kuzmin Assignee: Robert Stam
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

Visual Studio 2010, Target Framework .NET 3.5 (driver 1.1 - both official API and built from the latest sources).

This code cannot be compiled:

void TestCompilerError()

{ var bsonArray = new MongoDB.Bson.BsonArray(); var update = MongoDB.Driver.Builders.Update.AddToSetEach("Name", bsonArray); }

Error: The call is ambiguous between the following methods or properties: 'MongoDB.Driver.Builders.Update.AddToSetEach(string, System.Collections.Generic.IEnumerable<MongoDB.Bson.BsonValue>)' and 'MongoDB.Driver.Builders.Update.AddToSetEach(string, params MongoDB.Bson.BsonValue[])'

The couple of methods were apparently added for convenience in some cases. But it also makes problems in other cases. Perhaps the issue was known and the choice was done. If not, perhaps it makes sense to review and revise this API.

(N.B. Just in case, this case of ambiguity is different from https://jira.mongodb.org/browse/CSHARP-280)



 Comments   
Comment by Robert Stam [ 05/Aug/11 ]

Added overloads for BsonArray to resolve ambiguity of BsonArray argument values with IEnumerable<BsonValue> and params BsonValue[] overloads.

Comment by Aristarkh Zagorodnikov [ 01/Aug/11 ]

I would also like to note, that adding the overload would be better for performance reasons too, since current mechanism is creating another BsonArray internally from the passed IEnumerable<BsonValue>.

Comment by Aristarkh Zagorodnikov [ 01/Aug/11 ]

Before:
var cursor = collection.Find(Query.In("_id", BsonArray.Create(ids)));
After:
var cursor = collection.Find(Query.In("_id", (IEnumerable<BsonValue>)BsonArray.Create(ids)));

BsonArray overload would be very welcome, since it will remove unnecessary clunkiness.
As for "params" – it might be useful to leave them as they are.

Comment by Robert Stam [ 29/Jul/11 ]

No problem. I think it has been a helpful discussion.

Not sure how many people are reading this. Anyone else want to add their opinion on whether the params overloads are helpful or not?

Comment by Roman Kuzmin [ 29/Jul/11 ]

Your proposal of custom overloads with BsonArray as a parameter sounds good to me. At least this way should be considered. As for `params` overloads... you decide .

Thank you for your patience and will to discuss and think of all that.

Comment by Roman Kuzmin [ 29/Jul/11 ]

> Your proposal to remove the params BsonValue[] overloads would result in the BsonArray being interpreted as IEnumerable<BsonValue> (because that would the only overload left).
This is not a problem. This is going to be almost always what we actually want to do. And if we want an array to be treated as a single value then we just call our X(IEnumerable<BsonValue>) like:
X(new BsonValue[]

{ array }

);

It seems easy. And not much worse than this (besides, a user at first has to get a compiler error and find out this way with .AsBsonValue):
X(array.AsBsonValue);

> You seem to be suggesting that most commonly the BsonArray would be used in the same way as a List<BsonValue>.
Yes, that is what I foresee. Data comes from the database as BsonArray. Then it is also effective in many cases to deal with BsonArray internally in order to avoid excessive conversions/copies/etc.

This particular ticket is a minor issue. But it raises some questions about alternative API approaches. But whatever approach we choose it should be consistent. If we use `params` in some methods then users may expect such overloads everywhere in similar cases. And if they do not find them then they get disappointed. Thus, introducing `params` as a concept is a burden. I would not do that until I am really requested and convinced in practical benefits. Especially in cases where such overloads introduces ambiguity in API.

Just a side note, it does not prove or disprove anything but it is some experience. For 5+ years I am an owner of the open source project FarNet: http://code.google.com/p/farnet/. The project is all about scrupulous .NET API for a file manager with rich UI and tons of features. Normally I do not add `params` without really good reasons. I have just counted only 3 methods with `params` out of 2000+ other methods (raw estimation). And for all that time no user ever asked me to add any overload with `params` notation.

Comment by Robert Stam [ 29/Jul/11 ]

Your proposal to remove the params BsonValue[] overloads would result in the BsonArray being interpreted as IEnumerable<BsonValue> (because that would the only overload left).

We could resolve the ambiguity without removing the params BsonValue[] overloads and achieve the same result by adding overloads for BsonArray that do whatever we want them to do (interpret it as either a BsonValue or a list of BsonValues).

You seem to be suggesting that most commonly the BsonArray would be used in the same way as a List<BsonValue>. So we could choose to make that the default interpretation.

Adding custom overloads for BsonArray would result in:

1. No compile time error (and no WTF emotions...)
2. Default interpretation will be correct most of the time (if we choose default correctly)
3. A cast is only required when the opposite interpretation is desired

Comment by Roman Kuzmin [ 29/Jul/11 ]

Replacement of X(BsonArray) with X(IEnumerable<BsonValue>) is a good move in any case. It lets to avoid redundant conversions from IEnumerable<BsonValue> to BsonArray and it definitely reduces chances of ambiguity in cases when a user has IEnumerable<BsonValue> other than BsonArray. So that it is a good change.

But I afraid that cases with BsonArray are not quite rare and such an ambiguous call is still going to happen. Even for an experienced user who knows how to resolve this (casting or use of array.Values) it might be not pleasant (with WTF emotions sometimes, for sure). And for a not experienced user it can be really puzzling.

System.Linq.Expressions.Expression is a perfect example of wise and cleaver use of params overloads. In the realm of building expression manually numbers of arguments are often known beforehand. Thus, without params overloads code like X(..., new Y[]

{ a, b, c }

) would be frequent and inconvenient. Params overloads improve ease of coding significantly. Besides, in their case params overloads do not introduce ambiguity like in our case. This approach has only pluses.

Our case is quite different. Calls like X("Monday", "Friday") can normally be found only in test and demo code. In real applications those "Monday", "Friday" are never hardcoded. Moreover, their number is not known beforehand (why not to add "Sunday"?).

The question:
Should our X(..., params BsonArray[]) be removed or should they stay?

The answer would be simple if we can practically weigh these two inconveniences:
1 Without params: necessity to write X(..., new BsonValue[]

{ a, b }

) in some cases when a and b are known beforehand;
2 With params: having BsonArray necessity to write X((IEnumerable<BsonValues>)array) or X(array.Values) (and necessity to be aware of this issue and ways to resolve).

But without proper investigation and statistics of use cases the answer can be only subjective. From my experience and intuitive evaluation of our domain 1 is better than 2. In particular because BsonArray as a collection to deal with is not going to be rare at all, it is the basic and the only native driver collection of BsonValues.

""""" CAUTION """""

Having said that, I have to add the following important note (always keeping PowerShell in mind). With new X(IEnumerable<BsonValue>) instead of X(BsonArray) we should keep another X(params BsonValue[]) overload. The latter is now the only way to call the X() relatively easy from PowerShell. X(IEnumerable<BsonValue>) can be called too but not so easy.

Comment by Robert Stam [ 29/Jul/11 ]

This JIRA ticket will remain open a few days to allow sufficient time for further comments.

Comment by Robert Stam [ 29/Jul/11 ]

Pushed some changes that potentially resolve this JIRA ticket to github, although the changes don't affect AddToSetEach.

Proposed resolution: the ambiguity between overloads is intentional, reflecting the inherent dual nature of a BsonArray (it is both a BsonValue and an IEnumerable<BsonValue>). The user can either use a cast to resolve the ambiguity, or use a different data type like List<BsonValue> or BsonValue[] which doesn't have the ambiguity. Also changed Query.All, .In and .NotIn to match AddToSetEach (might break some code but easily fixed).

Sample code:

BsonArray bsonArray;
Update.AddToSetEach("name", bsonArray); // compiler error: ambiguous, matches two overloads
Update.AddToSetEach("name", (BsonValue) bsonArray); // treat bsonArray as a single BsonValue
Update.AddToSetEach("name", (IEnumerable<BsonValue>) bsonArray); // treat bsonArray as a list of BsonValues

Similar code that doesn't use casts directly but has the same effect is:

Update.AddToSetEach("name", bsonArray.AsBsonValue); // treat bsonArray as a single BsonValue
Update.AddToSetEach("name", bsonArray,Values); // treat bsonArray as a list of BsonValues

An even better solution is to not use BsonArray as a data type. For example:

List<BsonValue> values;
Update.AddToSetEach("name", values); // no ambiguity

Comment by Robert Stam [ 28/Jul/11 ]

As a general rule I don't think there is anything wrong with having both of the following overloads:

public void F(IEnumerable<X> values);
public void F(params X[] values);

They usually coexist very happily. While it is true that this combination occurs very rarely in the Framework APIs, I did find at least one class (System.Linq.Expressions.Expression) that has 12 methods that have a similar combination of overloads.

The only reason that we are having a problem is that we have a subclass Y of X that also implements IEnumerable<X>. So we definitely have an issue to address in some yet to be determined way.

I don't understand why you dismiss the the params form of AddToSetEach so lightly. I think there are many use cases where I would want to take advantage of that overload. Some examples:

Update.AddToSetEach("friends", "Tom", "Dick", "Harry")
Update.AddToSetEach('holidays", 3, 15)
Update.AddToSetEach("open", "Monday", "Wednesday", "Thursday")
Update.AddToSetEach("availableColors", "red", "blue")

And even if you don't think it's useful in the AddToSetEach case, there are other places where it seems likely to be used. Some examples:

Query.All("open", "Monday", "Tuesday")
Query.In("status", "pending", "rescheduled")

So whatever action we take needs to cover all places where a "list" of BsonValues is provided.

We could even leave the ambiguity in and let the user tell us what they mean by providing a cast one way or the other. Keep in mind that this ambiguity only occurs in the rather unusual case where the variable is defined as a BsonArray. In the more likely case where the variable is either BsonValue[] or List<BsonValue> no ambiguity occurs.

Comment by Roman Kuzmin [ 28/Jul/11 ]

As for the first example provided, with AddToSetEach(params T[]) it is not only compiler puzzled, me too. Looking at the code:
var update1 = Update.AddToSetEach("x", array);

.. I do not really know what user wants to do (to add each item of the array or to add the array as a single item).

Without AddToSetEach(params T[]) the answer is clear – to add each item of the array. Otherwise, we should simply use AddToSet:
var update1 = Update.AddToSet("x", array);

The first example is typical in practice and it is going to be perfectly resolved by removing AddToSetEach(params T[]).

As for the second example without AddToSetEach(params T[]) we should be slightly more verbose, than now:
var update1 = Update.AddToSetEach("x", new BsonValue[]

{ array1, array2 }

);

But as it was said in the previous post, the second example is not typical, it is rare. The first typical example should be prioritized. The first example should be easy to use, not the second.

Comment by Roman Kuzmin [ 28/Jul/11 ]

I would remove the overload AddToSetEach(string, params MongoDB.Bson.BsonValue[]) completely. From my experience in such cases it does not add much value. It should be used sparingly when it is practically proven to be useful). Compare with other well designed APIs (.NET framework itself) – if we have a X(IEnumerable<T>) in a class then how often do we have the X(params T[]) pair in addition? Not often at all.

At first it looks tempting to be able to do AddToSetEach("name", value1, value2) instead of AddToSetEach("name", new BsonValue[]

{ value1, value2 }

). But the point is that this is not a real life scenario. It might happen in practice, yes, but so rare that the second notation does not actually hurt much. The real life scenario for AddToSetEach is that there is already some IEnumerable<T> at hands or we have to collect items to be processed:

var list = new List<BsonValue>() // or even BsonArray that makes issues now
for/foreach(...)

{ ... list.Add(value) ... }

// collect items for AddToSetEach
AddToSetEach("name", list) // even if list contains 1, 2, few items – we do not know that, so we do not use X(params T[]) version

So that in practice we would rare use X(params T[]) facility anyway. That is what I think. I can be wrong though. Other users may have other real life scenarios and may want X(params T[]) to stay. In that case (it is really the case) I would introduce a method a different name. It is not perfect but it is much better than introducing this ambiguity.

Summary:
Either remove overloads X(params T[]) completely (I would go this way) or give them different names (still remove overloads but let such method to stay with different names).

P.S. Presumably there are other cases like this. AddToSetEach is only an example I stumbled over.

Comment by Robert Stam [ 27/Jul/11 ]

Here's one way to look at this. Give the following two statements:

var update1 = Update.AddToSetEach("x", array);
var update2 = Update.AddToSetEach("x", array, array);

The first is ambiguous but the second compiles without error and adds two new values to the set (the values happen to be arrays).

We could just decide that we want to resolve the ambiguity by saying that the first statement should be treated the same as the second. We can achieve this by having a special overload for BsonArray that resolves the ambiguity and is implemented so that it does what we want it to do.

Any comments?

Comment by Robert Stam [ 27/Jul/11 ]

The ambiguity arises because BsonArray both:

1. Derives from BsonValue
2. Implements IEnumerable<BsonValue>

In other words, it can either be thought of as a single BsonValue or as a list of BsonValues.

It's almost a good thing that this ambiguity arises. We need the user to tell us whether they want:

1. To add a single value that happens to be of type BsonArray
2. Add each of the values of the BsonArray to the set individually

They will have to use a cast to tell us which they want:

1. Update.AddToSetEach("Name", (BsonValue) bsonArray)
2. Update.AddToSetEach("Name", (IEnumerable<BsonValue>) bsonArray)

Do you think there is any way to not have this ambiguity? Of course, if they use a regular array instead of a BsonArray I don't think the ambiguity exists.

Generated at Wed Feb 07 21:36:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.