[CSHARP-1999] Support for readonly properties/fields that contain instances Created: 13/Jun/17  Updated: 08/Feb/23  Resolved: 28/Oct/20

Status: Closed
Project: C# Driver
Component/s: Serialization
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Daniel Hegener Assignee: James Kovacs
Resolution: Won't Fix Votes: 4
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by CSHARP-169 Deserialize Arrays to Properties with... Closed
Case:

 Description   

We are dealing with lots of entities that create instances for contained properties/fields as part of their instantiation like this:

public List<int> Foo { get; } = new List<int>();

or more generic

public SomeType Bar { get; } = new SomeType();

The C# driver is able to serialize these properties nicely once we map them explicitly. However, they are considered readonly so deserialization gets skipped and our entities end up with not populated instances.

In order to solve this, we would suggest to enhance the driver in a way that - similar to SetIgnoreIfNull - one can select to skip the instantiation of a type and populate an existing instance on the target object.

Let me know your thoughts. I'd be happy to implement that right away if you agree.



 Comments   
Comment by Daniel Hegener [ 08/Dec/20 ]

Thanks for eventually looking into my suggestion. I do see the gentle increase in complexity, too, of course, so I can accept your negative decision based on this. I would like to point out three things, though:

  1. Your suggestion above shows how to deal with readonly instances. It does not, however, help to reduce the amount of pointless allocations that happen in the example above during deserialization: The "new List<int>()" and "new SomeType()" constructors will be called during deserialization and do their magic (potentially instantiating lots of other stuff) but the created instances will be thrown out and GC'ed later in the deserialization process. My PR avoids this behaviour.
  2. What I called a "side-product" in the last paragraph of my PR actually has some interesting potential beyond the two cases that I already mentioned in the PR itself: By leveraging incrementally populated instances which the PR enables, people could overcome the 16MB document hurdle pretty easily (for the loading/deserialization part at least). You could e.g. store a massive JSON document (>16MB), possibly in multiple collections using the same _id, and then have the driver reassemble that document very efficiently by deserializing all relevant documents into the very same instance. We had to do some rather nasty magic in order to deal with some large contained lists which could be done more easily and also faster with the suggested change.
  3. To sell my idea even further, another cool feature that would be possible with my PR would be to store incremental changes to versioned documents. We develop a system that stores sizable documents representing financial positions that slowly change over time. For each revision, we currently store the full version of the latest document, even in cases where only one field has changed. This results in quite some wasted processing and transmission time as well as disk space, I presume (I don't know to what extent e.g. compression alleviates this). So with my PR we could just store "the diffs" and load all of them sequentially into the same instance which would ultimately give us the desired result. It's not quite as easy due to e.g. indexing but it's just one example where I could see benefits. Another use case would be perhaps sensor data where you get a set of data coming in over time and instead of storing all information every time, you might just want to store the difference to the previous one and then produce a fully "merged" latest instance based on a replay of all changes up to a certain point in time as and when required. Or so...
Comment by James Kovacs [ 27/Oct/20 ]

We apologize for letting this PR sit for 3 years without public comment. While we appreciate the effort that it took to implement the feature, it makes significant changes to how serialization works internally.

At a high level, the Deserialize method is responsible for creating and populating the object itself and only itself. If there are child documents or arrays, responsibility for instantiating and deserializing them is left to the child's deserializer. This is why readonly properties and fields are ignored. Because the owning object has no way of setting these properties/fields.

The provided PR changes deserialization to optionally allow a parent object to provide a child instance for the child deserializer to populate. This changes the semantics of deserialization from strictly bottom-up building to a combination of bottom-up and top-down. This increases the complexity of the deserialization code and may result in new subtle bugs.

We do not feel that the increase in complexity is warranted as we can already support readonly properties and fields using a private [BsonConstuctor] as follows. (Note the constructor doesn't have to be private, but can be if you don't want it callable from your code.)

public class ContainingClass {
    public ContainingClass() {}
 
    [BsonConstructor]
    private ContainingClass(List<int> foo, SomeType bar) {
        Foo.AddRange(foo);
        Bar.Name = bar.Name;
    }
 
    public ObjectId Id { get; private set; }
    [BsonElement]
    public List<int> Foo { get; } = new List<int>();
    [BsonElement]
    public SomeType Bar { get; } = new SomeType();
 
    public override string ToString() {
        return $"{Id} / {string.Join(",", Foo)} / {Bar}";
    }
}
 
public class SomeType {
    public string Name { get; set; }
 
    public override string ToString() {
        return $"{Name}";
    }
}

As you can see, we declare a private constructor marked with [BsonConstuctor], which takes the deserialized child instances and can either set the readonly properties/fields or use the instances to populate the existing instances. Note that Foo and Bar must be explicitly attributed with [BsonElement] as readonly fields/properties are not automatically added to class maps.

If you do not want to modify your POCOs with a private [BsonConstructor] and [BsonElement] attributes, the same effect can be achieved using a BsonClassMap configured using MapProperty and MapCreator:

BsonClassMap.RegisterClassMap<ContainingClass>(cm => {
    cm.AutoMap();
    cm.MapProperty(c => c.Foo);
    cm.MapProperty(c => c.Bar);
    cm.MapCreator((Func<List<int>,SomeType, ContainingClass>)((foo,bar) => {
        var newInstance = new ContainingClass();
        newInstance.Foo.AddRange(foo);
        newInstance.Bar.Name = bar.Name;
        return newInstance;
    }), "Foo", "Bar");
});

Comment by Daniel Hegener [ 24/Nov/17 ]

PR: https://github.com/mongodb/mongo-csharp-driver/pull/303

Generated at Wed Feb 07 21:41:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.