[SERVER-11484] Interrupted commands by maxTimeMS on mongos should propagate the error code to client Created: 30/Oct/13  Updated: 11/Jul/16  Resolved: 17/Dec/13

Status: Closed
Project: Core Server
Component/s: MapReduce, Sharding
Affects Version/s: 2.5.3
Fix Version/s: 2.5.5

Type: Bug Priority: Major - P3
Reporter: Siyuan Zhou Assignee: J Rassi
Resolution: Done Votes: 0
Labels: 26qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL

 Description   

Drivers depend on the error code to throw an exception on a maxTimeMS timeout, so commands interrupted on mongos should propagate the error code to the client.

The following commands have this issue.
+ aggregate: returns the wrong error code, 17022; 50 is expected (known bug for the JavaScript interruption error code).

The others lack a top-level error code:
+ count
+ mapreduce
+ text
+ moveChunk
+ collStats

Commands output:

----
Testing aggregate
----
 
{
     "code" : 17022,
     "ok" : 0,
     "errmsg" : "exception: sharded pipeline failed on shard shard0001: { ok: 0.0, errmsg: \"operation exceeded time limit\", code: 50 }"
}
 
----
Testing count
----
 
{
     "shards" : {
         
     },
     "cause" : {
          "ok" : 0,
          "errmsg" : "operation exceeded time limit",
          "code" : 50
     },
     "ok" : 0,
     "errmsg" : "failed on : shard0000"
}
 
----
Testing mapreduce
----
 
{
     "ok" : 0,
     "errmsg" : "MR parallel processing failed: { ok: 0.0, errmsg: \"operation exceeded time limit\", code: 50 }"
}
 
----
Testing text
----
 
{
     "rawresult" : {
          "ok" : 0,
          "errmsg" : "operation exceeded time limit",
          "code" : 50
     },
     "ok" : 0,
     "errmsg" : "failure on shard: shard0000:localhost:30000: errmsg: \"operation exceeded time limit\""
}
 
----
Testing moveChunk
----
{
     "cause" : {
          "ok" : 0,
          "errmsg" : "operation exceeded time limit",
          "code" : 50
     },
     "ok" : 0,
     "errmsg" : "move failed"
}
 
----
Testing collStats
----
{
     "sharded" : true,
     "ok" : 0,
     "errmsg" : "failed on shard: { ok: 0.0, errmsg: \"operation exceeded time limit\", code: 50 }"
}
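
The effect on a client can be sketched as follows. This is a minimal illustration, not driver code: isTopLevelTimeout, EXCEEDED_TIME_LIMIT, and the results object are hypothetical names, and the assumption is that a driver inspects only the top-level "code" field. The aggregate and count responses are copied from the output above.

```javascript
// Hypothetical helper: true only when the response carries the maxTimeMS
// error code (50) at the top level, which is where a driver is assumed to look.
var EXCEEDED_TIME_LIMIT = 50;

function isTopLevelTimeout(res) {
    return res.ok === 0 && res.code === EXCEEDED_TIME_LIMIT;
}

// Responses copied from the mongos output above.
var aggregateRes = {
    "code" : 17022,
    "ok" : 0,
    "errmsg" : "exception: sharded pipeline failed on shard shard0001: { ok: 0.0, errmsg: \"operation exceeded time limit\", code: 50 }"
};
var countRes = {
    "shards" : {},
    "cause" : { "ok" : 0, "errmsg" : "operation exceeded time limit", "code" : 50 },
    "ok" : 0,
    "errmsg" : "failed on : shard0000"
};

// The nested "cause" document is the shard's own response.
var shardRes = countRes.cause;

var results = {
    aggregate : isTopLevelTimeout(aggregateRes), // wrong top-level code (17022)
    count : isTopLevelTimeout(countRes),         // no top-level code at all
    shard : isTopLevelTimeout(shardRes)          // code 50 is present here
};
```

Under that assumption, only the shard's own response is recognized as a timeout; both mongos responses are missed even though code 50 appears somewhere inside them.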

To reproduce this issue, run map-reduce against a sharded collection with a small maxTimeMS. It returns the following error.

mongos> var mapReduceArg = {
...     mapreduce: "foo",
...     map: function() { emit(this.i, 1); },
...     reduce: function(key, value) { return value.length },
...     out: { replace: "mapReduceOut" },
...     maxTimeMS: 10,
... };
mongos> db.runCommand(mapReduceArg)
{
	"ok" : 0,
	"errmsg" : "MR parallel processing failed: { errmsg: \"exception: JavaScript execution terminated\", code: 13475, ok: 0.0 }"
}



 Comments   
Comment by J Rassi [ 17/Dec/13 ]

The deprecated "text" command is left unchanged; all the other listed commands are fixed.

Comment by Githook User [ 17/Dec/13 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-11484 mongos bubbles up "code" field for some command failures

Affects commands aggregate, collStats, count, mapReduce, and
moveChunk. If a failure on one or more shards causes the command to
fail, mongos will add a "code" field to the response if the error was
common to all shards that failed.
Branch: master
https://github.com/mongodb/mongo/commit/3050f432fe144a16516f7de0a346c0586f650ac6
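
The rule described in the commit message can be sketched roughly as below. This is an illustration in shell-style JavaScript, not the mongos source: commonErrorCode and buildFailure are hypothetical names, and treating a failed response without a numeric "code" as "no common code" is an assumption.

```javascript
// Hypothetical sketch: returns the error code shared by every failed shard
// response, or undefined when the failures disagree, a failed response has
// no numeric code, or nothing failed.
function commonErrorCode(shardResponses) {
    var common;
    var seenFailure = false;
    for (var i = 0; i < shardResponses.length; i++) {
        var res = shardResponses[i];
        if (res.ok) {
            continue; // only failed shards are considered
        }
        if (typeof res.code !== "number") {
            return undefined;
        }
        if (!seenFailure) {
            common = res.code;
            seenFailure = true;
        } else if (res.code !== common) {
            return undefined;
        }
    }
    return common;
}

// Builds the failure response and adds "code" only when one is common
// to all failed shards.
function buildFailure(errmsg, shardResponses) {
    var out = { ok : 0, errmsg : errmsg };
    var code = commonErrorCode(shardResponses);
    if (code !== undefined) {
        out.code = code;
    }
    return out;
}
```

For example, buildFailure("failed on : shard0000", [{ ok : 0, errmsg : "operation exceeded time limit", code : 50 }]) yields a response with a top-level code of 50, while a mix of codes 50 and 13475 yields a response without a "code" field.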

Generated at Thu Feb 08 03:25:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.