[SERVER-18076] Upsert rejected with error if _id assigned to in $setOnInsert and _id equality match specified in query predicate Created: 15/Apr/15  Updated: 06/Dec/22  Resolved: 29/Oct/18

Status: Closed
Project: Core Server
Component/s: Write Ops
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: J Rassi Assignee: Backlog - Query Team (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-5289 _id field not taken from query during... Closed
is related to SERVER-6913 update allows upsert to "modify" _id Closed
Assigned Teams:
Query
Participants:

 Description   

Upsert are rejected with an error if the update meets both conditions:

  • _id is assigned to from $setOnInsert
  • an equality match on _id is specified in query predicate

See the following reproduction:

> db.foo.update({_id: 0}, {$setOnInsert: {_id: 1}, $set: {x: 1}}, {upsert: true})
WriteResult({
	"nMatched" : 0,
	"nUpserted" : 0,
	"nModified" : 0,
	"writeError" : {
		"code" : 66,
		"errmsg" : "After applying the update to the document {_id: 0.0 , ...}, the (immutable) field '_id' was found to have been altered to _id: 1.0"
	}
})

And, note that this does not generate an error for other fields:

> db.foo.update({z: 0}, {$setOnInsert: {z: 1}, $set: {x: 1}}, {upsert: true})
WriteResult({
	"nMatched" : 0,
	"nUpserted" : 1,
	"nModified" : 0,
	"_id" : ObjectId("552eca4fb15badb258938496")
})
> db.foo.find()
{ "_id" : ObjectId("552eca4fb15badb258938496"), "z" : 1, "x" : 1 }
>



 Comments   
Comment by Asya Kamsky [ 07/Jul/21 ]

adrian.blandin@sovo-tech.com (this isn't the best place to discuss workarounds but FYI a way to get this to work without duplicating info is by doing a range find on compound _id field:

db.example.update( {
        "_id"    : {$gte: { owner: ObjectId("60e39aedac911ac328d34112"), this:MinKey}, 
                        $lte:{ owner: ObjectId("60e39aedac911ac328d34112"), this: MaxKey}}, 
       "chunk-array"  : { $not: { $size: 1000 } }
 }, {  
       $setOnInsert:  {  "_id" :  {  "owner" : ObjectId("60e39aedac911ac328d34112"),  "this"  : ObjectId() }  }, 
       $push: { "chunk-array" : { "value" : "test1" } }  
}, 
{upsert: true})
WriteResult({
	"nMatched" : 0,
	"nUpserted" : 1,
	"nModified" : 0,
	"_id" : {
		"owner" : ObjectId("60e39aedac911ac328d34112"),
		"this" : ObjectId("60e5ff145d414c5469ae7480")
	}
})

Note however that this approach is not thread-safe in the sense that two threads could create a new object with a single chunk each if this field is arbitrarily assigned at insert.

Comment by Adrian B [ 07/Jul/21 ]

Here is an use case : 

Having a collection storing arrays of any size (i.e. greater than 16 MB) for another document ("owner"), where the array can be splitted into chunks.
When the "current chunk" reachs its maximum number of elements, another one is created. To do this atomically, we use an upsert.
If we want to be able to watch for changes on the array for a specific "owner", without returning the full document, we need to have the owner's OID in the "_id" field.

That would look like this :

 

update(
{                                        
  "_id.owner"    : ObjectId("60e39aedac911ac328d34112"),    
  "chunk-array"  : { $not: { $size: 1000 } }                                      
},
{                                                   
  $setOnInsert:                                     
  { 
    "_id" : 
    {
      "owner" : ObjectId("60e39aedac911ac328d34112"),
      "this"  : ObjectId()
    }                  
  },
  $push: { "chunk-array" : { "value" : "test1" } } 
},
{upsert: true})

The code above would indicate an error saying that the field "_id" is immutable.

 

Without allowing to set the "_id" in an upsert, we can still do the above by duplicating the information :

update(
{                                        
  "owner": ObjectId("60e39aedac911ac328d34112"),
  "chunk-array" : { $not: { $size: 1000 } }                                     
},
{                                                   
  $setOnInsert:                                     
  {  
    "_id" : { "owner": ObjectId("60e39aedac911ac328d34112"), "this": ObjectId()},
    "owner": ObjectId("60e39aedac911ac328d34112")
  },
  $push: { "chunk-array" : { "value" : "test1" } } 
},
{upsert: true})

 

So, it's not a major issue, but we could expect the first code above to work.

 

Comment by Asya Kamsky [ 29/Oct/18 ]

The behavior of disallowing changing _id when matching a document based on _id is consistent with how we treat immutable fields and was implemented this way deliberately.

 

Comment by Asya Kamsky [ 29/Jan/18 ]

This seems works-as-designed/intended.

If you are searching for _id 1 and don't find it, then on upsert it doesn't make sense to set _id to anything but 1, at least I can't come up with a reasonable use case.

In the example given, the issue was querying with null _id which seems like a logic error somewhere, unless I'm missing something.

If it's the case that the only condition on which to determine that a new record should be inserted is if the object is null for query-check but populated for setOnInsert portion of the update, then it sounds like the issue is in the Java/Morphia layer.

Comment by Rajesh Sharma [ 16/Apr/15 ]

Let me explain why this was useful to use.

In our Java based application Client was sending us a Java Object and we were inserting and updating based on Object id. The sample code was as follow(Java Morphia).

Query q = ds.createQuery(CustomBenchmarkVO.class).field("_id").equal(customBenchmarkVO.getId());
UpdateResults r = ds.updateFirst(q, customBenchmarkVO, true);

So in 2.4 when the getId was null the server was inserting that Object as a new document with valid _id key. That make sense as it was a new document so Object id was null for us. Later in 2.6 this production code started breaking. In 2.6 the same code was creating document where _id as null. I was confused why _id is null. Later any call where getId was null was updating the same _id:null document and new document creation stopped. We lost so much data as same object was changed instead of new object creation behavior of 2.4.

Comment by J Rassi [ 16/Apr/15 ]

The reporter of SERVER-18075 was exercising this use case. rajeshks117, could you briefly outline why this behavior was useful to your application?

Comment by Scott Hernandez (Inactive) [ 16/Apr/15 ]

What is the practical use case? What application or system would need this
behavior?

Comment by J Rassi [ 16/Apr/15 ]

Are you asking to change the _id field on the newly inserted document, to make it different than the value in the query?

Yep.

We currently do not allow a query + update/insert where immutable field values differ between the two: this is true for the _id field or shard key field(s).

Indeed. This is submitted as a feature request for supporting _id for this use case.

Comment by Scott Hernandez (Inactive) [ 16/Apr/15 ]

Are you asking to change the _id field on the newly inserted document, to make it different than the value in the query?

We currently do not allow a query + update/insert where immutable field values differ between the two: this is true for the _id field or shard key field(s).

Generated at Thu Feb 08 03:46:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.