[SERVER-41578] findOneAndReplace should ignore "_id" when upserting Created: 07/Jun/19  Updated: 13/Sep/23  Resolved: 30/Jun/19

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Thomas Tutko Assignee: Asya Kamsky
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Currently, executing the following operations in the shell will produce error code 66, "After applying the update, the (immutable) field '_id' was found to have been altered to _id..."

var tm = {
    "_id": ObjectId("571fc8d290f5282a50778960"),
    "parentId": null,
    "name": "test",
    "containerType" : "IP"
};
db.Containers.findOneAndReplace({"parentId": null, "containerType": "IP"}, tm, { "upsert": true, "returnNewDocument": true } );var tm2 = {
    "_id": ObjectId("571fc8d290f5282a50778961"),
    "parentId": null,
    "name": "test2",
    "containerType" : "IP"
};
db.Containers.findOneAndReplace({"parentId": null, "containerType": "IP"}, tm2, { "upsert": true, "returnNewDocument": true } );

I understand, based on the documentation, that this is by design as the documentation states "The <replacement> document cannot specify an _id value that differs from the replaced document." as well as "MongoDB will add the _id field to the replacement document if it is not specified in either the filter or replacement documents. If _id is present in both, the values must be equal."

However, this is flawed in my opinion. Why not have the "_id" property ignored on the replacement document?

The use case for this that based on the example above, we don't want to change the "_id" but instead what we want is only one entry where the containerType is "IP" and the parentId is null. However, due to the source of these commands, when we call the command we will have an assigned "_id" from a javascript web client.

Expected Result:
The first call in which there is no document matching

{ "parentId": null, "containerType": "IP" }

inserts the document with the specified _id and returns the new document (returnNewDocument = true).

Subsequent calls that may come in with different _id's should replace all other fields except for the _id field and return the new document with the existing _id instead of the replacement document's _id.

Workaround(s):
1. Query first to see if a document exists that matches

{"parentId": null, "containerType": "IP" }

and if it does, return it, otherwise insert the new document. This, however, would not be atomic without using a transaction which would not be preferable.
2. Somehow avoid sending the _id. While this works, we want the _id to be known/specified ahead of time in the case of an insert.
3. Use standard update with upsert using setOnInsert for _id then query for the resulting document. This is essentially a reverse of workaround #1 and is not atomic.

This is by no means a deal breaker but to me it's a little bit counter-intuitive that when upserting in a situation where you do not use _id in the filter criteria that _id would be assumed to be something the user is trying to replace.

Thank you for your time and consideration!



 Comments   
Comment by Jim Paton [ 13/Sep/23 ]

I just found this issue while experiencing a similar problem using FindOneAndReplace in the .NET client. If I understand correctly, the Upsert option is meaningless in the context of the FindOneAndReplace call.

If the supplied filter does not match an existing document, you need to know this in advance in order to be able to supply a new, unique value for the id in the document because one will not be auto-generated; however if it does match an existing document, you need to supply the id value of the existing document in the replacement document in order to update.

There is therefore no way to do an actual upsert (create-or-replace) without knowing in advance whether the document that matches your filter already exists or not. This is completely counter-intuitive.

Comment by Asya Kamsky [ 25/Jan/21 ]

> due to the source of these commands, when we call the command we will have an assigned "_id" from a javascript web client

Why not fix the web client here? The _id shouldn't be unconditionally added, in fact, it should only be added if it's not already present and when it's an insert (and MongoDB driver or server will already do that).

Comment by Calvin Lobo [ 15/Jan/21 ]

You can use findAndModify (findOneAndModify rather than findOneAndReplace), with upsert option (and option new:true),and with $setOnInsert and it will return the document that's updated/created. That's atomic. You don't need to switch to using update.

Turns out that this does not perform an upsert in the case where the document exists because $setOnInsert does not execute when the document exists. Furthermore I cannot use both $set and $setOnInsert together without getting path conflict errors.

EDIT: I was able to get it working by using $setOnInsert to only include the _id field awhile the rest of the updates go in $set.

This is incredibly frustrating and i'm really surprised that there is not a simpler way to do things.

The only thing I can think of now is to check if the document exists before hand and then choose either an insert or an upsert operation.

I would really like to see this ticket opened again with Thomas' flag suggestion. It wouldn't break backwards compatibility and would simplify what I think is a pretty common use-case of performing a replace on a document that has a user-defined _id.

Comment by Calvin Lobo [ 13/Jan/21 ]

This is by no means a deal breaker but to me it's a little bit counter-intuitive that when upserting in a situation where you do not use _id in the filter criteria that _id would be assumed to be something the user is trying to replace. 

I too would love to see a simpler way to do this. I feel like it's a pretty common scenario to have a user-defined _id and want to do a replace on the document.

While i'm on mongo 3.6 now, I'm hoping to move to 4+. Perhaps with transactions I can implement a delete first and then insert.

Comment by Thomas Tutko [ 12/Jul/19 ]

Thank you for your reply.

In regards to the change not being backwards compatible, my thought was that it would be a part of the options to explicitly say ignore _id. For example:

db.Containers.findOneAndReplace({"parentId": null, "containerType": "IP"}, tm2, { "upsert": true, "returnNewDocument": true, "allowIdChange": true } );

That being said, I will give your suggestion a try with findOneAndModify. My biggest gripe with that approach is that while yes, it is atomic, if the document has many fields (as my actual use case does), I have to build up a large set of setOnInsert statements instead of just serializing the whole c# object (or even full json object) that I have. Unless I am unaware of a better way of doing that, which is certainly possible, it's not really a good solution in my opinion.

Comment by Asya Kamsky [ 30/Jun/19 ]

I have a couple of questions/follow-up comments in context of the fact that we will not implement the suggestion on the server as it would be backwards breaking to all applications that rely on _id not being ignored (and in cases where it conflicts, do NOT want the new value to incorrectly overwrite existing value).

Regarding mentioned workarounds 3:
> Use standard update with upsert using setOnInsert for _id then query for the resulting document. This is essentially a reverse of workaround #1 and is not atomic.

You can use findAndModify (findOneAndModify rather than findOneAndReplace), with upsert option (and option new:true),and with $setOnInsert and it will return the document that's updated/created. That's atomic. You don't need to switch to using update.

You can then compare $setOnInsert _id you sent to the returned _id from the updated document (though you can just use returned _id and check nUpserted and nModified values from the result of the command).

Comment by Danny Hatcher (Inactive) [ 10/Jun/19 ]

Hello Tom,

Thank you for your detailed description of your expectations as to how upserts should work. I will pass this along to our Query team for them to provide feedback.

Danny

Generated at Thu Feb 08 04:58:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.