[CSHARP-1545] IdGenerator possibly not honoured on ReplaceOne upsert? (Works on InsertOne) Created: 26/Jan/16  Updated: 27/Jan/16  Resolved: 27/Jan/16

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: 2.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kevin Versfeld Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

Hopefully I'm not misunderstanding something simple, but I have noticed I get different _id values generated in the database depending on whether I use InsertOne or ReplaceOne.
I try to avoid any dependency on MongoDB libraries etc., so the Id field in my objects is always a string. However, I want to use the ObjectId generation pattern. In much older versions of the driver I was using the StringObjectIdGenerator, and I am trying to do so now too.
I'm using custom conventions to do my mapping (nothing strange in it, just that I use my own attributes to identify, for example, the ID field), and the important code looks like this:
var cmm = memberMap.ClassMap;
cmm.MapIdMember(memberMap.MemberInfo)
    .SetSerializer(new StringSerializer(BsonType.ObjectId).WithRepresentation(BsonType.String))
    .SetIgnoreIfDefault(true)
    .SetIdGenerator(StringObjectIdGenerator.Instance);

If I call InsertOne, I get this:
"_id" : "56a75e69e73e572dbc1b75c4"

and if I call ReplaceOne (with a new document/no id set) I get this:
"_id" : ObjectId("56a75e6a87dc53ded31754e8")



 Comments   
Comment by Kevin Versfeld [ 27/Jan/16 ]

Thanks for the comprehensive help, Craig. At this point I'm going to try the very last suggestion you make, which is to avoid upserts (i.e. ReplaceOne). I think it should be OK; hopefully I haven't missed anything. Please go ahead and close this issue as "By Design".

Comment by Craig Wilson [ 26/Jan/16 ]

1. To keep them as strings in your C# class, but stored as an ObjectId in the database, you'd flip your BsonType designations. In the case of the StringSerializer, the default is BsonType.String. So, this is telling us to use a string in my classes, but represent it as an ObjectId in the database.

.SetSerializer(new StringSerializer().WithRepresentation(BsonType.ObjectId))
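
For context, here is a minimal sketch of how that serializer line might sit in a complete class map. The Person class and its property names are hypothetical and not from this ticket; the reporter's real mapping lives in a custom IMemberMapConvention. Serializing to a BsonDocument lets you inspect the stored type of _id without a running server.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.IdGenerators;
using MongoDB.Bson.Serialization.Serializers;

// Hypothetical document class: the Id stays a plain string in C#.
public class Person
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Register the map once, before Person is first serialized.
        BsonClassMap.RegisterClassMap<Person>(cm =>
        {
            cm.AutoMap();
            cm.MapIdMember(p => p.Id)
                // string in the class, ObjectId in the database
                .SetSerializer(new StringSerializer().WithRepresentation(BsonType.ObjectId))
                // still used by InsertOne/InsertMany when Id is empty
                .SetIdGenerator(StringObjectIdGenerator.Instance);
        });

        // ToBsonDocument does not run the id generator, so set an Id explicitly.
        var person = new Person { Id = "56a75e69e73e572dbc1b75c4", Name = "example" };
        BsonDocument doc = person.ToBsonDocument();

        // Print how _id was actually written.
        Console.WriteLine(doc["_id"].BsonType);
    }
}
```

If the mapping is picked up, the printed type should be ObjectId rather than String.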

2. Yeah, this is always a rough one. You can't update these documents because the _id field is immutable. Hence, what you'd actually be doing is deleting and re-inserting. There are a number of ways of doing this. What I would do is write a little C# console app that is repeatable.

  • Make a backup.
  • Use an IMongoCollection<BsonDocument>.
  • Iterate over every document in the collection whose _id type is a string (BSON type 2) -> {_id: { $type: 2 }}
  • Change the _id of the document by parsing the current _id into an ObjectId. -> doc["_id"] = ObjectId.Parse((string)doc["_id"])
  • Insert the new document. (Now we have the same document twice, but this is important so that we can ensure we haven't lost any data, missed any, etc.)
  • Verify that we have exactly twice as many documents as we started with. (This only works if the system isn't still live; if it is, you may have to repeat this process.)
  • Delete all the documents whose _ids are strings -> {_id: { $type: 2 }}
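
The steps above could be sketched roughly as follows. This is an illustration under assumptions, not a tested migration tool: the IdMigration class is a hypothetical name, the code assumes every string _id is a valid 24-character hex value, and it assumes nothing else writes to the collection while it runs. CountDocuments is the name in recent 2.x drivers; older releases such as 2.2 expose Count instead.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class IdMigration
{
    // Take a backup before calling this.
    public static void Run(IMongoCollection<BsonDocument> collection)
    {
        var all = FilterDefinition<BsonDocument>.Empty;
        // Matches documents whose _id is stored as a BSON string ($type: 2).
        var stringIds = Builders<BsonDocument>.Filter.Type("_id", BsonType.String);

        long totalBefore = collection.CountDocuments(all);
        long toMigrate = collection.CountDocuments(stringIds);

        // Materialize the matches first so we are not reading from a cursor
        // on the same collection we insert into. For very large collections
        // this would need batching instead.
        foreach (BsonDocument doc in collection.Find(stringIds).ToList())
        {
            // ObjectId.Parse throws if the string is not a valid ObjectId.
            doc["_id"] = ObjectId.Parse(doc["_id"].AsString);
            // Fails with a duplicate key error if a previous partial run
            // already inserted this ObjectId.
            collection.InsertOne(doc);
        }

        // Each string-keyed document should now exist twice.
        long totalAfter = collection.CountDocuments(all);
        if (totalAfter != totalBefore + toMigrate)
        {
            throw new InvalidOperationException(
                "Document count mismatch; not deleting the old documents.");
        }

        // Only now remove the originals with string _ids.
        collection.DeleteMany(stringIds);
    }
}
```

It would be called with something like IdMigration.Run(database.GetCollection<BsonDocument>("people")), where the collection name is a placeholder.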

There are 2 more alternatives. (a) Just leave the old data. The .NET serializers will handle reading the data fine. However, it could cause issues with querying if you use LINQ and query on the Id field. (b) Don't use ReplaceOne, so that the only way new data gets into your system is through InsertOne.
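
As a sketch of alternative (b), a save method could route new documents through InsertOne and use ReplaceOne only for documents that already have an id, without upsert. The Person and PersonStore names are hypothetical. It assumes an id generator such as StringObjectIdGenerator is registered for Person.Id, as in the mapping from the description.

```csharp
using System;
using MongoDB.Driver;

// Hypothetical document class with a string Id.
public class Person
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public static class PersonStore
{
    public static void Save(IMongoCollection<Person> collection, Person person)
    {
        if (string.IsNullOrEmpty(person.Id))
        {
            // New document: InsertOne runs the registered id generator,
            // so the _id is created client-side in the mapped representation.
            collection.InsertOne(person);
            return;
        }

        // Existing document: replace by id, with no upsert, so the server
        // never has to invent an _id of its own.
        var filter = Builders<Person>.Filter.Eq(p => p.Id, person.Id);
        ReplaceOneResult result = collection.ReplaceOne(filter, person);

        if (result.MatchedCount == 0)
        {
            throw new InvalidOperationException(
                "No document with id " + person.Id + " exists to replace.");
        }
    }
}
```

The trade-off is that the caller must know whether a document is new, which is exactly the information an upsert would otherwise leave to the server.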

Hope that helps.

Comment by Kevin Versfeld [ 26/Jan/16 ]

Oh, and: if I were to change to ObjectIds in the DB, er, is there a way to update existing data to change the format? (We have quite a bit...)

Comment by Kevin Versfeld [ 26/Jan/16 ]

Thanks for the response Craig, it makes a lot of sense actually. If you don't mind helping me out with a noob question, how would I change what I've done to store ObjectIds, but keep them as strings in my code? Is it just the Serializer and Representation bits?

Comment by Craig Wilson [ 26/Jan/16 ]

If you want a string represented in the database, you are doing the right thing. However, ReplaceOne will never have identifiers generated client-side. This is simply because we have no idea whether or not there is a match for the filter, and therefore whether the replacement document needs an identifier. So, the only methods that use an id generator are InsertOne and InsertMany.

Thanks for the report and I wish we could do something, but there really isn't anything we can actually do. Depending on the trouble, it might make sense to store ObjectIds in the database instead of a string representation. They'll sort correctly and they'll take up a lot less space (12 bytes versus ~48 bytes) per document.

Let me know if you have any other questions. Otherwise, I'll close this as Works as Designed.
Craig

Comment by Kevin Versfeld [ 26/Jan/16 ]

Quick clarifications.
1) My class has a string Id property, and I would prefer a string representation in the database too (purely because that's how all our current data is stored, and I don't want to have to change it.....)
2) "memberMap" in code above is the BsonMemberMap parameter of the Apply method in my custom IMemberMapConvention implementation.

Generated at Wed Feb 07 21:39:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.