[JAVA-1924] When using upsert=true with a custom codec, encodercontext returns isEncodingCollectibleDocument = false Created: 14/Aug/15 Updated: 31/Aug/15 Resolved: 27/Aug/15 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | API, BSON, Codecs |
| Affects Version/s: | 3.0.0, 3.0.1, 3.0.2, 3.0.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Guillermo Campelo | Assignee: | Unassigned |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | Bug | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Description |
|
when using : gameCollection.findOneAndUpdate(filter, update, new FindOneAndUpdateOptions().upsert(true)); SomeEntityCodec's encode method is called with a context that returns encodingCollectibleDocument=false See: bson.toBsonDocument(documentClass, codecRegistry) neither RawBsonDocument(wrapped, codec), nor BsonDocumentWrapper(wrappedDocument, encoder) takes into account the upsert flag(if it is true, it SHOULD be a encodingCollectibleDocument |
| Comments |
| Comment by Ross Lawley [ 27/Aug/15 ] |
|
Hi guicamest,
This is definitely is not true and not the case. Lets look at the Document class. It has a DocumentCodec that is an implementation of the CollectibleCodec<Document>, when inserting, we generate the _id and mutate the document - allowing the user to know the _id of the inserted document. Should encode always include an _id? No! because Documents are used for all sorts of things, queries, sorts, index definitions etc... Making encode always include the _id in most circumstances would be wrong. Remember as jeff.yemin mentioned previously, isEncodingCollectibleDocument is for the cases when you are representing a BSON Document as it would be stored by MongoDB. In these scenarios _id if present comes first. That is all that flag is for. Should isEncodingCollectibleDocument = true always be the case?This is nuanced and depends on the situation. For example take the following compound index: {price: 1, _id: 1}. If isEncodingCollectibleDocument was always true that would make the index mean something totally different. When should a CollectibleCodec encode a _id into the document?In what circumstances do we know that we are actually representing a document as it will be stored in MongoDB? The answer is only when doing an insert. So only in those circumstances can we call generateIdIfAbsentFromDocument. (This may or may not add an _id, depending on implementation, if it doesn't then MongoDB will generate the _id). What about Upserts?Update queries can take multiple keys, in the Bson representation of the update eg : $set $setOnInsert. Does the Bson type representing an update, represent a Document as it is to be stored by MongoDB? The answer is no, but the $set part could represent the document or it could equally be updates to fragments of the Document that is stored in MongoDB. – I hope that clarifies the current design and the reasons it acts as such. As the Bson update provided to an upsert doesn't necessarily represent a Document as it will be stored, I'm closing this ticket as "Won't Fix". Currently, you will manually have to ensure _id is in either the query or in the $setOnInsert part of the update. However, I have added Ross |
| Comment by Guillermo Campelo [ 27/Aug/15 ] |
|
Hi Ross, thanks for the depth of the explanation! IMHO, the parameter should be renamed, as a CollectibleCodec is directly related to ids and isEncodingCollectibleDocument gives the idea of whether the id should be generated or not. May I suggest the isInsert name? Just to be clear, any implementation of CollectibleCodec must include the _id field in it's encode method, independently of it is an insert or an update? If that's OK, please close this issue as not a bug. Thanks a lot for the clarification about this issue and keep up with the good work! |
| Comment by Ross Lawley [ 24/Aug/15 ] |
|
Hi guicamest, I can understand the confusion, however, the driver does not know if any given upsert operation will result in an update or an insert. Only the MongoDB server can tell. If the operation results in an insert and the _id isn't supplied, the MongoDB server automatically generates one. For that reason we can't automatically generate an _id for upserts that are inserts rather than updates. I hope that clarifies the reasons why isEncodingCollectibleDocument is false for upserts. Ross |
| Comment by Guillermo Campelo [ 19/Aug/15 ] |
|
Hi Jeff, any updates? |
| Comment by Guillermo Campelo [ 15/Aug/15 ] |
|
It's kind of confusing, CollectibleCodec states A Codec that generates complete BSON documents for storage in a MongoDB collection and has methods for checking/generating ids, but as you say, encodingCollectibleDocument has nothing to do with the _id field except for the order. Besides that, how is a custom CollectibleCodec supposed to know when to generate an _id field or not? If it is an upsert=false, then it shouldn't generate the _id field, but if it is an upsert=true and the document is to be inserted, it should generate it. |
| Comment by Jeffrey Yemin [ 14/Aug/15 ] |
|
No, that's not what the flag indicates. While one could potentially use it for other purposes, It's currently used in the driver only to encode the fields in such a way that the _id field is placed first in the BSON representation of the document. |
| Comment by Guillermo Campelo [ 14/Aug/15 ] |
|
Well, the flag indicates whether or not include the "_id" field in the serialization, correct? If it is an upsert=true, it might be an insert, in which case the _id should be serialized. |
| Comment by Jeffrey Yemin [ 14/Aug/15 ] |
|
Yes, that's the expected behavior, and it is not governed at all by the upsert flag. The encodingCollectibleDocument property will be true only when encoding a document that is going to be inserted in a collection, but never for filters or updates (which use $ operators like $set). You can see it's intended usage here, for example. What are you trying to use the flag for? |