[CSHARP-893] MongoDB.Bson.Serialization.Serializers.DictionarySerializer do not support empty string as key when CheckElementNames is true Created: 17/Jan/14  Updated: 02/Apr/16  Resolved: 28/Jan/14

Status: Closed
Project: C# Driver
Component/s: Serialization
Affects Version/s: 1.8.3
Fix Version/s: 1.9

Type: Bug Priority: Blocker - P1
Reporter: Blake Niemyjski Assignee: Robert Stam
Resolution: Done Votes: 1
Labels: driver
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by CSHARP-880 System.IndexOutOfRange Exception in C... Closed
Related
related to SERVER-2051 Disallow empty string keys Open
Backwards Compatibility: Minor Change

 Description   

We have been running into a pretty bad bug where we noticed some objects were failing to be inserted into mongo. We were getting the following stack trace:

System.IndexOutOfRangeException
Index was outside the bounds of the array.
at MongoDB.Bson.IO.BsonWriter.CheckElementName(String name)
at MongoDB.Bson.IO.BsonWriter.WriteName(String name)
at MongoDB.Bson.Serialization.Serializers.DictionarySerializer`2.Serialize(BsonWriter bsonWriter, Type nominalType, Object value, IBsonSerializationOptions options)
at MongoDB.Bson.Serialization.BsonClassMapSerializer.SerializeMember(BsonWriter bsonWriter, Object obj, BsonMemberMap memberMap)
at MongoDB.Bson.Serialization.BsonClassMapSerializer.Serialize(BsonWriter bsonWriter, Type nominalType, Object value, IBsonSerializationOptions options)
at MongoDB.Bson.Serialization.BsonClassMapSerializer.SerializeMember(BsonWriter bsonWriter, Object obj, BsonMemberMap memberMap)
at MongoDB.Bson.Serialization.BsonClassMapSerializer.Serialize(BsonWriter bsonWriter, Type nominalType, Object value, IBsonSerializationOptions options)
at MongoDB.Driver.Internal.MongoInsertMessage.AddDocument(BsonBuffer buffer, Type nominalType, Object document)
at MongoDB.Driver.Operations.InsertOperation.Execute(MongoConnection connection)
at MongoDB.Driver.MongoCollection.InsertBatch(Type nominalType, IEnumerable documents, MongoInsertOptions options)
at MongoDB.Driver.MongoCollection.Insert(Type nominalType, Object document, MongoInsertOptions options)

When researching this issue we came across this: https://jira.mongodb.org/browse/CSHARP-624, which seemed like it was the issue but we were still experiencing it...

Upon looking at the code... We tracked this down to the CheckElementName name is not checking for an empty string (just null), it's just trying to access the first character (which there are none). The following unit test reproduces this issue in the smallest amount of code possible.

[Test]
public void TestEmptyDictionaryKeySerialization() {
var dictionary = new Dictionary<string, string>() {

{ "k", "v" }

,

{ "", "emptyKey" }

};

using (var buffer = new BsonBuffer()) {
using (var bsonWriter = BsonWriter.Create(buffer, new BsonBinaryWriterSettings()))

{ bsonWriter.CheckElementNames = true; var serializer = new Bson.Serialization.Serializers.DictionarySerializer(); serializer.Serialize(bsonWriter, dictionary.GetType(), dictionary, serializer.DefaultSerializationOptions); }

}
}



 Comments   
Comment by Craig Wilson [ 19/Feb/14 ]

There are no intended breaking changes. BTW: We've released rc0 as of today...

Comment by Blake Niemyjski [ 19/Feb/14 ]

Thanks! Are there any breaking changes in 1.9?

Comment by Robert Stam [ 12/Feb/14 ]

We are planning a beta or RC0 within a week or so. The final version of 1.9 will ship at around the same time as server 2.6.

Comment by Blake Niemyjski [ 12/Feb/14 ]

Do you have any idea when 1.9 will ship a beta or rc or rtm? We have ~14,500 occurrences for this bug logged in our system as of right now.

Comment by Githook User [ 28/Jan/14 ]

Author:

{u'username': u'rstam', u'name': u'rstam', u'email': u'robert@10gen.com'}

Message: CSHARP-893: Code review changes.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/f58e6f679ba14a86b4cd360fb3a323556d818e55

Comment by Githook User [ 28/Jan/14 ]

Author:

{u'username': u'rstam', u'name': u'rstam', u'email': u'robert@10gen.com'}

Message: CSHARP-893: Fix handling of empty string element names and dictionary key values.
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/ea48045a3675242803b754e7bd3ef83ed2330562

Comment by Craig Wilson [ 27/Jan/14 ]

Yes. Beta or RC.

Comment by Blake Niemyjski [ 27/Jan/14 ]

Will you guys have a beta build of 1.9 on nuget?

Comment by Blake Niemyjski [ 20/Jan/14 ]

Thanks for the clarification. We will keep that in mind and try to add checks where possible.

Comment by Robert Stam [ 20/Jan/14 ]

You shouldn't have to change your application. If we decide that empty strings are not valid element names the driver will automatically switch to ArrayOfArrays representation if any of the key values is an empty string.

The only way you would be affected by this is if you index your dictionary elements or write queries against them. In that case you will want to make sure that your application configures the serialization so as to get a consistent representation in the database.

Note that while the server currently does allow empty strings as element names, it's not certain that will always be case in the future. See:

https://jira.mongodb.org/browse/SERVER-2051

Comment by Blake Niemyjski [ 20/Jan/14 ]

We already have gb's of data in our database. I'm not sure how much of a pita it would be to change the representation now or the tons of places we'd have to change it in. I'd rather just wait/help contrib a fix.

Comment by Robert Stam [ 20/Jan/14 ]

Dictionaries can be serialized in several ways. This is controlled by the DictionaryRepresentation enum:

    public enum DictionaryRepresentation
    {
        Dynamic,
        Document,
        ArrayOfArrays,
        ArrayOfDocuments
    }

The default representation (probably the one preferred by most users) is Document:

// DictionaryRepresentation Document
{ dictionary: { key1: value1, key2: value2, ... }, ... }

This representation is only possible when all the keys are strings, and the values of the strings are legal element names. Whether an empty string is a legal element name is somewhat ambiguous, but it's probably safer to assume that it is not.

There are two alternative representations:

// DictionaryRepresentation ArrayOfArrays
{ dictionary: [[key1, value1], [key2, value2], ... ], ... }

and

// DictionaryRepresentation ArrayOfDocuments
{ dictionary: [{ k: key1, v: value1 }, { k: key2, v: value2 }, ... ], ... }

The ArrayOfArrays representation was the first representation we used for keys that are not valid element names. But the ArrayOfDocuments representation is sometimes better when you want to index by just the key or just the value.

The DictionaryRepresentation Dynamic means look at the values of the keys and choose the Document representation if all the keys are valid element names, and choose the ArrayOfArrays representation if not.

I think there are two issues to address here:

1. Decide if "" is a legal element name or not, and make sure DictionarySerializer handles keys equal to "" accordingly
2. CheckElementName should either allow "" as valid or throw a more appropriate exception if not

Thanks for reporting this.

Comment by Blake Niemyjski [ 20/Jan/14 ]

Please let me know if there is anything I can do..

Comment by Craig Wilson [ 20/Jan/14 ]

As you stated, they are legal everywhere even if we don't recommend them. My colleague and I need to discuss the right direction here, but we'll likely fix this as a bug.

Comment by Blake Niemyjski [ 20/Jan/14 ]

What do you think? The previous issue (https://jira.mongodb.org/browse/CSHARP-624) says it's supported but in this scenario it's not. I'd rather have the issues of a dealing with an empty key in a collection then having collection.add(object) fail.

Comment by Blake Niemyjski [ 19/Jan/14 ]

I think it should play nicely with them. It's not always easy to detect
them before inserting and if you have them in your db it's your issue there
there. If .net supports empty keys and mongo does as well so should the
client


Blake Niemyjski
Software Development Engineer
CodeSmith Tools, LLC
http://www.codesmithtools.com

Comment by Craig Wilson [ 18/Jan/14 ]

Hi Blake,
Thanks for the test reproduction. As a follow-up question, what are you expecting to happen when we encounter an empty string? Are you expecting it to be valid and should be let through, or should we still throw an exception, just a better exception than IndexOutOfRange. Off the top of my head, MongoDB doesn't really play nice with empty keys like this...

Generated at Wed Feb 07 21:38:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.