[SERVER-44827] Aggregate with $merge fails with command error, not write error Created: 25/Nov/19  Updated: 27/Oct/23  Resolved: 11/Dec/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Patrick Freed Assignee: Anton Korshunov
Resolution: Works as Designed Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-37191 Add support for write concerns for ne... Closed
Sprint: Query 2019-12-16
Participants:

 Description   

An aggregate with $merge will fail with a command error where other operations would typically fail with ok: 1 plus a write error (e.g. duplicate key). Is this the intended behavior? In the design doc for this stage, a duplicate key error is shown to be returned as a command error, but the doc doesn't state any rationale for it being in that format.

This leads to a weird situation where the aggregate failed with ok: 0, but documents were actually inserted into the into collection with no indication that they were. 

e.g.

MongoDB Enterprise repl0:PRIMARY> db.test.find()
{ "_id" : 1 }
{ "_id" : ObjectId("5ddc238fb9cf2f059ec3ba5a") }
MongoDB Enterprise repl0:PRIMARY> db.other.find()
{ "_id" : 1, "x" : 1 }
MongoDB Enterprise repl0:PRIMARY> db.runCommand({aggregate: "test", pipeline: [{"$merge": {into: "other", whenMatched: "fail"}}], cursor: {}})
{
	"ok" : 0,
	"errmsg" : "E11000 duplicate key error collection: test.other index: _id_ dup key: { _id: 1.0 }",
	"code" : 11000,
	"codeName" : "DuplicateKey",
	"keyPattern" : {
		"_id" : 1
	},
	"keyValue" : {
		"_id" : 1
	}
}
MongoDB Enterprise repl0:PRIMARY> db.other.find()
{ "_id" : 1, "x" : 1 }
{ "_id" : ObjectId("5ddc238fb9cf2f059ec3ba5a") }

 

It also presents a problem for statically typed drivers. If the server returns a command error and a write concern error in the same response, those drivers will simply ignore the write concern error, since write concern errors are only generally expected and parsed in the context of write errors (which occur if ok: 1). 

MongoDB Enterprise repl0:PRIMARY> db.runCommand({aggregate: "test", pipeline: [{"$merge": {into: "other", whenMatched: "fail"}}], cursor: {}, writeConcern: {w: 2, wtimeout: 10}})
{
	"writeConcernError" : {
		"code" : 64,
		"codeName" : "WriteConcernFailed",
		"errmsg" : "waiting for replication timed out",
		"errInfo" : {
			"wtimeout" : true
		}
	},
	"ok" : 0,
	"errmsg" : "E11000 duplicate key error collection: test.other index: _id_ dup key: { _id: 1.0 }",
	"code" : 11000,
	"codeName" : "DuplicateKey",
	"keyPattern" : {
		"_id" : 1
	},
	"keyValue" : {
		"_id" : 1
	}
}

e.g. this would just throw/return a CommandError/Exception driver side, with no mention of a write concern error.

Additionally, if the server only returns a WriteConcernError, the driver will throw a WriteException, as per the drivers' CRUD specification.

MongoDB Enterprise repl0:PRIMARY> db.runCommand({aggregate: "test", pipeline: [{"$merge": {into: "other"}}], cursor: {}, writeConcern: {w: 2, wtimeout: 10}})
{
	"cursor" : {
		"firstBatch" : [ ],
		"id" : NumberLong(0),
		"ns" : "test.test"
	},
	"writeConcernError" : {
		"code" : 64,
		"codeName" : "WriteConcernFailed",
		"errmsg" : "waiting for replication timed out",
		"errInfo" : {
			"wtimeout" : true
		}
	},
	"ok" : 1,
}

e.g. this would throw a WriteException(WriteError: null, WriteConcernError: ...).



 Comments   
Comment by Anton Korshunov [ 09/Dec/19 ]

So, I looked at the code and tracked it back to when it was added first (although this code had largely evolved since then, the error handling mechanism wasn't drastically changed).

Firstly, Charlie is right that we upconvert the first write error to a command error. It is done here.

However, I'm not sure that it was motivated by the 16MB bulk errors limit and a graceful way out. Likely we just plumbed in writing logic into the existing error handling mechanism in the aggregation pipeline. There was some discussion around reporting batch results, but it was slightly different topic.

So, for non-aggregation write operations we simply return a serialized WriteResult object back to the client, possibly appending a writeConcernError info to the reply.

For aggregation, the error handling mechanism is different and all aggregate command errors are returned as command errors. So, be it a non-OK status or an exception, it will be returned as a command error. This was so since day one when $out was added in v2.6.

This does makes sense to me, as a write operation (in case of $merge or $out pipelines) is executed in the context of an outer operation (which is a pipeline command), and so any error should be reported as a command error, like in case of any other read-only pipeline.

Note that reporting the writeConcernError is different from the normal codepath. That is, a command may succeed on the current node (and reported as "ok: 1") but the writeConcernError may still be added to the reply if we detect such an error. This is demonstrated in examples 2 and 3 from the original description. 

That said, this behaviour seems working as designed to me.

Comment by Charlie Swanson [ 25/Nov/19 ]

Passing to Query execution to triage. At a glance, I think this is by design because with $merge there may be large amounts of data being written, the bulk errors for which could easily exceed 16MB, and we wouldn't have a graceful way out at that point. Instead, we chose to upconvert the first write error to a command error.

Generated at Thu Feb 08 05:07:06 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.