[CSHARP-765] DOCS: Convention for ignoring all empty collection types in serialization? Created: 27/Jun/13  Updated: 27/Feb/18  Resolved: 02/Feb/18

Status: Closed
Project: C# Driver
Component/s: Serialization
Affects Version/s: 1.8.1
Fix Version/s: None

Type: Improvement
Priority: Minor - P4
Reporter: Curt Mayers
Assignee: Robert Stam
Resolution: Done
Votes: 2
Labels: c#, collections, driver, lists, serialization, serializers
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Windows

Issue Links:
Related
related to CSHARP-767 Add an overload of SetDefaultValue th... Closed

 Description   

Serializing complex documents that have many optional lists results in the explicit serialization of an empty collection (i.e. []), which is a waste of space, and makes indexes unnecessarily large and clumpy.

I have littered my classes with a bunch of "ShouldSerializeXXX" methods, which does resolve the issue, at the cost of some additional typing. But I think it would be useful to implement this as a convention. That way, it could be established as the default, and would provide a more robust way of implementing this option globally (for all collection or IEnumerable types).

It may already be possible to do this, but I haven't figured out how. I suspect that this option might be widely useful.



 Comments   
Comment by Frédéric Barrière [ 27/Feb/18 ]

I think that this issue should not be closed.
For me it does not work as expected.
It works fine when a full document is retrieved from the collection, but it does not work when we retrieve a partial document through a projection.
Example:
Let's configure a document mapping so that a property is not serialized when it is empty, and so that a default (empty) value is set when the property is not present in the document. Here is the mapping for the Comments property of a Photo document:

cm
  .MapMember(x => x.Comments)
  .SetDefaultValue(() =>
  {
    return new Comment[0];
  })
  .SetShouldSerializeMethod(x => ((Photo)x).Comments.Length > 0);

If we retrieve a document that was serialized without the Comments field (because it was empty), all is fine. The C# instance has an empty array for the Comments property (the default value factory was called as expected).

Photo photo = await this.collection
  .Find(x => x.Id == id)
  .SingleOrDefaultAsync(cancellationToken);
int count = photo.Comments.Length;   // 0 : OK

But if we retrieve only part of this document, the default value is not used (the default value factory is not called) and the Comments property is set to null (I confirmed this with a breakpoint in the property setter):

var photo = await this.collection
  .Find(x => x.Id == id)
  .Project(x => new { Id = x.Id, Comments = x.Comments }) 
  .SingleOrDefaultAsync(cancellationToken);
int count = photo.Comments.Length;   // KO: NullReferenceException

Comment by Curt Mayers [ 04/Feb/18 ]

Really? You consider this a closed issue now?

I really believe that this should become core (and supported) functionality of MongoDB. The proposed solution, which I haven't actually tested and implemented yet, is far from straightforward, and really needs to be regression tested against a broad variety of use cases before it can really be considered a genuine solution.

The desired behavior, a global option or default to not serialize empty collections, is one that would be broadly useful and widely used. Many of us have to serialize large, complex objects that have variant behavior and may include many empty collections that it simply makes no sense to serialize.

Comment by Robert Stam [ 02/Feb/18 ]

Since there seems to be a way to do this with existing hooks, I'm closing this as Works as Designed.

Comment by Brian Buvinghausen [ 02/Nov/16 ]

OK, so I put together what I thought was the ultimate solution based on Craig's and Robert's samples. At the end of the day ICollection is nearly useless, so I left that out, but the tenet here is that IEnumerable works as the global catch-all as long as you short-circuit for strings, which of course are an IEnumerable of chars.

IgnoreEmptyArraysConvention.cs

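using System;
using System.Collections;
using System.Collections.Concurrent;
using System.Collections.Generic;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Conventions;

public class IgnoreEmptyArraysConvention : ConventionBase, IMemberMapConvention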
{
	//List<> implements the majority of the common generic interfaces IEnumerable<T>, ICollection<T>, etc. so it should be the default concrete implementation to use
	private static readonly Type DefaultType = typeof(List<>);
 
	//Set up mapping dictionary to go from interface type to concrete type for the interfaces that List<> doesn't implement
	private static readonly IReadOnlyDictionary<Type, Type> InterfaceToConcreteMap = new Dictionary<Type, Type>
	{
		{ typeof(ISet<>), typeof(HashSet<>) },
		{ typeof(IProducerConsumerCollection<>), typeof(ConcurrentBag<>) }
	};
 
	public void Apply(BsonMemberMap memberMap)
	{
		if (!typeof(IEnumerable).IsAssignableFrom(memberMap.MemberType) || //Allow IEnumerable
			typeof(string) == memberMap.MemberType || //But not String
			typeof(IDictionary).IsAssignableFrom(memberMap.MemberType)) //Or Dictionary (concrete classes only see below)
			return;
 
		//NOTE: Microsoft was too stupid to make the generic dictionary interfaces implement IDictionary even though every single concrete class does.
		//      They were also too stupid to make generic IDictionary implement IReadOnlyDictionary even though every single concrete class does. I believe this check should catch them all.
		if (memberMap.MemberType.IsGenericType && memberMap.MemberType.IsInterface)
		{
			var genericType = memberMap.MemberType.GetGenericTypeDefinition();
			if (genericType == typeof(IDictionary<,>) || genericType == typeof(IReadOnlyDictionary<,>))
				return;
		}
 
		if (memberMap.MemberType.IsArray) //Load Empty Array
		{
			memberMap.SetDefaultValue(() => Array.CreateInstance(memberMap.MemberType.GetElementType(), 0));
		}
		else if (!memberMap.MemberType.IsInterface) //Create ConcreteType directly
		{
			memberMap.SetDefaultValue(() => Activator.CreateInstance(memberMap.MemberType));
		}
		else if (memberMap.MemberType.IsGenericType) //Generic Interface type
		{
			var interfaceType = memberMap.MemberType.GetGenericTypeDefinition();
			var concreteType = InterfaceToConcreteMap.ContainsKey(interfaceType)
				? InterfaceToConcreteMap[interfaceType]
				: DefaultType;
			memberMap.SetDefaultValue(() => Activator.CreateInstance(concreteType.MakeGenericType(memberMap.MemberType.GetGenericArguments())));
		}
		else //This should just be the antique non-generic interfaces like ICollection, IEnumerable, etc.
		{
			memberMap.SetDefaultValue(() => Activator.CreateInstance(typeof(List<object>)));
		}
		memberMap.SetShouldSerializeMethod(instance =>
		{
			var value = (IEnumerable)memberMap.Getter(instance);
			return value?.GetEnumerator().MoveNext() ?? false;
		});
	}
}

Comment by Frédéric Barrière [ 27/Dec/15 ]

I tried a similar solution for a specific collection (not by convention) and I think that this solution can actually cause errors.

  cm
    .MapMember(x => x.Friends)
    .SetDefaultValue(() => new List<Friend>())
    .SetShouldSerializeMethod(x => ((User)x).Friends.Count > 0);

It works fine when I retrieve my document from my collection, but fails when I apply a projection in the Find query.

// It is ok.
var user = await this.collection
  .Find(filter)
  .FirstOrDefaultAsync();
return this.projection.Compile()(user);
 
// It is ko !!
return await this.collection
  .Find(filter)
  .Project(this.projection)
  .FirstOrDefaultAsync();

My projection looks like this:

Expression<Func<User, UserDto>> projection = user => new UserDto
{
  Id = user.Id,
  ...,
  Friends = user.Friends.Select(x => new {.....})
}

The problem seems to be that the Friends field is not present in the stored document (it was not serialized because it was empty), which causes the ArgumentNullException.

Here is the error that occurs when the projection is applied in the query:

ArgumentNullException: Value cannot be null. Parameter name: source
 
    at System.Linq.Enumerable.Select[TSource,TResult](IEnumerable`1 source, Func`2 selector)
    at lambda_method(Closure , ProjectedObject )
    at MongoDB.Bson.Serialization.Serializers.ProjectingDeserializer`2.Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
    at MongoDB.Bson.Serialization.IBsonSerializerExtensions.Deserialize[TValue](IBsonSerializer`1 serializer, BsonDeserializationContext context)
    at MongoDB.Driver.Core.Operations.CursorBatchDeserializationHelper.DeserializeBatch[TDocument](RawBsonArray batch, IBsonSerializer`1 documentSerializer, MessageEncoderSettings messageEncoderSettings)
    at MongoDB.Driver.Core.Operations.FindCommandOperation`1.CreateCursorBatch(BsonDocument result)
    at MongoDB.Driver.Core.Operations.FindCommandOperation`1.<ExecuteProtocolAsync>d__109.MoveNext()
    ......
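
The top frames of this trace are consistent with Enumerable.Select receiving a null source: when the Friends field is absent from the stored document, the compiled projection lambda has nothing to enumerate. Below is a minimal sketch of that failure mode and of a null guard for projections that run client-side, using only the base class library. The names NullSafeProjection, SelectOrEmpty and ParameterNameOfNullSelect are hypothetical, and whether such a guard can be placed inside the expression passed to Project has not been verified against the driver.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullSafeProjection
{
    // Substitutes an empty sequence when the source collection is missing.
    public static IEnumerable<TResult> SelectOrEmpty<TSource, TResult>(
        IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        return (source ?? Enumerable.Empty<TSource>()).Select(selector);
    }

    // Reproduces the failure: Select validates its arguments eagerly and
    // rejects a null source before any element is enumerated.
    public static string ParameterNameOfNullSelect()
    {
        List<string> friends = null; // stands in for a field absent from the document
        try
        {
            friends.Select(name => name.Length);
            return null; // not reached when Select throws
        }
        catch (ArgumentNullException ex)
        {
            return ex.ParamName;
        }
    }
}
```

SelectOrEmpty returns an empty sequence for a null source instead of throwing, which mirrors what the default value factory supplies when a full document is deserialized.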

Comment by Curt Mayers [ 01/Jul/13 ]

That is a really elegant solution. I suspect it will be useful to many people.

Thanks.

Comment by Robert Stam [ 01/Jul/13 ]

With the addition of a new overload of SetDefaultValue that lets you provide a delegate to create new instances of your default value (see CSHARP-767) you can now safely use an empty collection as the default value when deserializing documents where the elements for an empty collection have been omitted. You could modify your new convention to also configure a default value:

private static void RegisterConventionToNotSerializeEmptyLists()
{
    var pack = new ConventionPack();
    pack.AddMemberMapConvention("Do not serialize empty lists", m =>
    {
        if (typeof(ICollection).IsAssignableFrom(m.MemberType))
        {
            if (m.MemberType.IsArray)
            {
                var elementType = m.MemberType.GetElementType();
                m.SetDefaultValue(() => Array.CreateInstance(elementType, 0));
            }
            else
            {
                m.SetDefaultValue(() => Activator.CreateInstance(m.MemberType));
            }
            m.SetShouldSerializeMethod(instance =>
            {
                var value = (ICollection)m.Getter(instance);
                return value != null && value.Count > 0;
            });
        }
    });
    ConventionRegistry.Register("Do not serialize empty lists", pack, t => true);
}

Note that arrays have to be created differently, and that we have made the assumption that all other collection types have a no-argument constructor that can be called to create an empty collection.
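
The no-argument constructor assumption can be checked up front rather than discovered at deserialization time. Below is a minimal sketch, using only the base class library, of a helper that returns a factory delegate for a collection type or null when no empty instance can be created this way. EmptyCollectionFactory and TryCreate are hypothetical names, not part of the driver.

```csharp
using System;
using System.Collections.Generic;

public static class EmptyCollectionFactory
{
    // Returns a delegate that creates a fresh empty instance of the given
    // collection type, or null when the type cannot be created this way.
    public static Func<object> TryCreate(Type collectionType)
    {
        if (collectionType.IsArray)
        {
            if (collectionType.GetArrayRank() != 1)
            {
                return null; // multi-dimensional arrays are out of scope for this sketch
            }

            // Arrays have no constructor to call; build a zero-length array instead.
            var elementType = collectionType.GetElementType();
            return () => Array.CreateInstance(elementType, 0);
        }

        // Interfaces and abstract classes cannot be instantiated directly.
        if (collectionType.IsAbstract || collectionType.IsInterface)
        {
            return null;
        }

        // Activator.CreateInstance(Type) needs a public parameterless constructor.
        if (collectionType.GetConstructor(Type.EmptyTypes) == null)
        {
            return null;
        }

        return () => Activator.CreateInstance(collectionType);
    }
}
```

A convention could call such a helper and only set a default value (and a ShouldSerialize method) when the result is not null. For example, ReadOnlyCollection<T> has no parameterless constructor, so TryCreate returns null for it.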

Comment by Robert Stam [ 28/Jun/13 ]

When a convention omits empty collections from the serialized form, the property will end up being null when the document is later deserialized.

Writing a convention that supplies an empty collection as the default value for missing elements requires the implementation of CSHARP-767 first.

Comment by Curt Mayers [ 28/Jun/13 ]

I owe you a beer.

Comment by Robert Stam [ 28/Jun/13 ]

The following worked for me for various collection types:

private static void RegisterConventionToNotSerializeEmptyLists()
{
    var pack = new ConventionPack();
    pack.AddMemberMapConvention("Do not serialize empty lists", m =>
    {
        if (typeof(ICollection).IsAssignableFrom(m.MemberType))
        {
            m.SetShouldSerializeMethod(instance =>
            {
                var value = (ICollection)m.Getter(instance);
                return value != null && value.Count > 0;
            });
        }
    });
    ConventionRegistry.Register("Do not serialize empty lists", pack, t => true);
}

I didn't have to test for ICollection<T> separately because, at least for all the classes I looked at, any class that implements ICollection<T> also implements ICollection.

I tested with array, ArrayList and List.
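
One caveat outside the types tested here: HashSet<T> implements ICollection<T> but not the non-generic ICollection, so a test based on ICollection alone skips it. Below is a minimal sketch of a broader check based on IEnumerable, with strings excluded because a string is an IEnumerable of chars. EmptyCollectionChecks, IsCollectionType and HasElements are hypothetical names. Note that dictionaries also pass this check.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class EmptyCollectionChecks
{
    // Treats any non-string IEnumerable type as a collection candidate.
    public static bool IsCollectionType(Type type)
    {
        return type != typeof(string)
            && typeof(IEnumerable).IsAssignableFrom(type);
    }

    // Returns true when the value is a non-string collection with at least one element.
    public static bool HasElements(object value)
    {
        var collection = value as ICollection;
        if (collection != null)
        {
            return collection.Count > 0; // fast path, no enumeration needed
        }

        var sequence = value as IEnumerable;
        if (sequence == null || value is string)
        {
            return false;
        }

        // Fall back to asking the enumerator for a first element.
        var enumerator = sequence.GetEnumerator();
        try
        {
            return enumerator.MoveNext();
        }
        finally
        {
            (enumerator as IDisposable)?.Dispose();
        }
    }
}
```

Inside a member map convention, IsCollectionType would replace the ICollection test on m.MemberType, and HasElements would be the body of the ShouldSerialize delegate.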

Comment by Robert Stam [ 28/Jun/13 ]

I'll experiment with writing the convention in a more general way.

One thing for you to think about is that if we suppress serialization of empty lists, when you read them back in the value of your property will be null instead of an empty list of the appropriate type. Is that OK?

Comment by Curt Mayers [ 28/Jun/13 ]

I was hoping for a robust convention that would apply to ALL collection types (arrays, dictionaries, lists, etc.), even if they were subclassed. Otherwise, it wouldn't be much improvement over writing a bunch of "ShouldSerialize" methods.

Nevertheless, I thank you for your help.

Is there any way of detecting whether a member implements IList or ICollection?

The object models I'm working with incorporate loads of lists, but I was thinking that a predefined convention that would prevent serialization of empty lists of all types would be generally useful.

Comment by Craig Wilson [ 28/Jun/13 ]

That just means that the if condition isn't right. Your next step is to figure out how to make that if condition trigger so that you can set the ShouldSerializeMethod. When you want this to trigger, what is the value of m.MemberType, and can you write a very specific test for it?

Just an FYI, m.MemberType will never equal typeof(List<>). typeof(List<>) is an open generic type and a member type will never equal that. So you need something more specific. Notice how I tested to see if the MemberType was a generic type and, if so, got the GenericTypeDefinition() to compare it. Not sure why this didn't trigger for you, but I don't have your code so there are a couple of reasons, the first one being that none of your types use a List<T> as a property, but maybe instead use IList<T> or IEnumerable<T> or something. Like I said, I don't have your code, so I can't help you get that if statement correct.
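
The open versus constructed generic type distinction can be seen without the driver. Below is a minimal sketch using only the base class library; GenericTypeChecks, its methods and the TagList subclass are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class GenericTypeChecks
{
    // True only when the type is a constructed List<T>, e.g. List<int>.
    public static bool IsGenericList(Type type)
    {
        return type.IsGenericType
            && type.GetGenericTypeDefinition() == typeof(List<>);
    }

    // Walks the base-class chain so subclasses of List<T> are matched as well.
    public static bool IsOrDerivesFromGenericList(Type type)
    {
        for (var current = type; current != null; current = current.BaseType)
        {
            if (IsGenericList(current))
            {
                return true;
            }
        }
        return false;
    }
}

// Hypothetical subclass, used only for the demonstration.
public class TagList : List<string> { }
```

Here typeof(List<int>) == typeof(List<>) is false, and so is typeof(TagList).IsSubclassOf(typeof(List<>)), because the base type of TagList is the constructed List<string> rather than the open List<>. IsGenericList(typeof(List<int>)) and IsOrDerivesFromGenericList(typeof(TagList)) both return true.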

Comment by Curt Mayers [ 28/Jun/13 ]

It doesn't actually work: empty lists continue to be serialized as "[]" as before.

I also tried

if ((m.MemberType == typeof(List<>)) || (m.MemberType.IsSubclassOf(typeof(List<>))))

to see if that would pick up the types correctly, but it doesn't.

Running the code in the debugger, I can see that nothing passes the "if" statement and makes it to the inner block, despite the fact that my class has 6 lists in it.

Comment by Craig Wilson [ 27/Jun/13 ]

Documentation for this is here: http://docs.mongodb.org/ecosystem/tutorial/serialize-documents-with-the-csharp-driver/#ignoring-a-member-based-on-a-shouldserializexyz-method.

Comment by Craig Wilson [ 27/Jun/13 ]

EDIT: changed the code a little. The instance variable wasn't what I thought it was.

The below will work for any properties that are of type List<T>. You'll need to tweak this for your needs. The take-away here is that the "ShouldSerialize" method is settable via a convention and doesn't need to be defined anywhere in particular... Because of all the variations of this, I'm not sure there is an easy way to create a reusable convention/attribute that can be used to say "don't serialize this thing when it's empty." Anyways, hope this is what you are looking for.

var pack = new ConventionPack();
pack.AddMemberMapConvention("Don't serialize empty lists...", m =>
{
    if(m.MemberType.IsGenericType && m.MemberType.GetGenericTypeDefinition() == typeof(List<>))
    {
        m.SetShouldSerializeMethod(instance =>
        {
            var value = (ICollection)m.Getter(instance);
            return value != null && value.Count > 0;
        });
    }
});

Generated at Wed Feb 07 21:37:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.