[SERVER-33542] Using maxTime() on MongoDB 3.4 and 3.6 does not yield the same error code Created: 28/Feb/18  Updated: 29/Oct/23  Resolved: 16/Mar/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.6.0, 3.6.3
Fix Version/s: 3.6.4, 3.7.4

Type: Bug Priority: Major - P3
Reporter: Charles Sarrazin (Inactive) Assignee: Charlie Swanson
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-32208 Remove retrying of OperationFailed in... Backlog
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.6
Steps To Reproduce:

use test
db.test.insert({a: 1})
db.test.find({$where: "sleep(101)"}).maxTimeMS(100)

Sprint: Query 2018-03-12, Query 2018-03-26
Participants:

 Description   

Using maxTime() does not yield the same exception when accessing a MongoDB deployment. It seems that timeouts now throw an error code 96 instead of 50 in recent versions of the MongoDB server.

In the Java driver it makes the driver throw a MongoQueryException instead of a MongoExecutionTimeoutException.

Sample Java code:

MongoDatabase db = (new MongoClient()).getDatabase("foo");
MongoCollection<Document> coll = db.getCollection("bar");
coll.insertOne(new Document());
coll.find(Filters.eq("$where", "sleep(100)")).maxTime(100, TimeUnit.MILLISECONDS).first();



 Comments   
Comment by Githook User [ 16/Mar/18 ]

Author:

{'email': 'charlie.swanson@mongodb.com', 'name': 'Charlie Swanson', 'username': 'cswanson310'}

Message: SERVER-33542 Avoid timing out cursor between batches

This will provide a guarantee that if a cursor times out, the error
returned to the client will be ExceededTimeLimit.
Branch: master
https://github.com/mongodb/mongo/commit/9eea82ee2a0e37da90cbb549d55c5eac8aa69a55

Comment by Githook User [ 15/Mar/18 ]

Author:

{'email': 'charlie.swanson@mongodb.com', 'name': 'Charlie Swanson', 'username': 'cswanson310'}

Message: SERVER-33542 Avoid swallowing error codes from PlanExecutor errors

(cherry picked from commit 3f723beb10caf6bd809b7df05c34b6a97402dadb)
Branch: v3.6
https://github.com/mongodb/mongo/commit/332f23a46c9f28e2c60cac7c19a93abac0c8359c

Comment by Githook User [ 12/Mar/18 ]

Author:

{'email': 'charlie.swanson@mongodb.com', 'name': 'Charlie Swanson', 'username': 'cswanson310'}

Message: SERVER-33542 Avoid swallowing error codes from PlanExecutor errors
Branch: master
https://github.com/mongodb/mongo/commit/3f723beb10caf6bd809b7df05c34b6a97402dadb

Comment by Charlie Swanson [ 07/Mar/18 ]

I have confirmed this is a SERVER problem and put a fix into code review. It seems this was introduced during a refactor for SERVER-30410 where we converted from throwing an exception to returning a status to the caller. It looks like some callers are ignoring the error code of that status and replacing it with OperationFailed.

Comment by Charles Sarrazin (Inactive) [ 05/Mar/18 ]

Also, this issue could be reproduced with both the Java and JS drivers. I suspect that all driver flavours are affected.

Comment by Charles Sarrazin (Inactive) [ 28/Feb/18 ]

And here in debug mode (log level 6 for all components):

2018-02-28T17:50:49.252+0100 D COMMAND  [conn8] run command dialo_dev.$cmd { find: "event", filter: { $where: "sleep(10)" }, maxTimeMS: 10.0, $db: "dialo_dev" }
2018-02-28T17:50:49.252+0100 D QUERY    [js] SMScope 0x7f8efb4c9c00 registered for op 8133
2018-02-28T17:50:49.252+0100 D STORAGE  [conn8] NamespaceUUIDCache: registered namespace dialo_dev.event with UUID c2727c1e-6b28-42f5-bc43-913029cf0d3d
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Beginning planning...
=============================
Options = INDEX_INTERSECTION SPLIT_LIMITED_SORT CANNOT_TRIM_IXISECT
Canonical query:
ns=dialo_dev.eventTree: $where
    code: sleep(10)
    scope: {}
Sort: {}
Proj: {}
=============================
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Index 0 is kp: { _id: 1 } unique name: '_id_' io: { v: 2, key: { _id: 1 }, name: "_id_", ns: "dialo_dev.event" }
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Index 1 is kp: { _id.type: 1, computedData.idEvent: 1, computedData.bodyType: 1, computedData.engineLevel: 1 } name: 'event' io: { v: 2, key: { _id.type: 1, computedData.idEvent: 1, computedData.bodyType: 1, computedData.engineLevel: 1 }, name: "event", ns: "dialo_dev.event" }
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Index 2 is kp: { _id.functionalId: 1 } name: '_id.functionalId_1' io: { v: 2, key: { _id.functionalId: 1 }, name: "_id.functionalId_1", ns: "dialo_dev.event" }
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Index 3 is kp: { type: 1.0, computedData.idEvent: 1.0, computedData.bodyType: 1.0, computedData.engineLevel: 1.0 } name: 'type_1_computedData.idEvent_1_computedData.bodyType_1_computedData.engineLevel_1' io: { v: 2, key: { type: 1.0, computedData.idEvent: 1.0, computedData.bodyType: 1.0, computedData.engineLevel: 1.0 }, name: "type_1_computedData.idEvent_1_computedData.bodyType_1_computedData.engineLevel_1", ns: "dialo_dev.event" }
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Rated tree:
$where
    code: sleep(10)
    scope: {}
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Planner: outputted 0 indexed solutions.
2018-02-28T17:50:49.252+0100 D QUERY    [js] SMScope 0x7f8f98d4a400 registered for op 8133
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Planner: outputting a collscan:
COLLSCAN
---ns = dialo_dev.event
---filter = $where
    code: sleep(10)
    scope: {}
---fetched = 1
---sortedByDiskLoc = 0
---getSort = []
2018-02-28T17:50:49.252+0100 D QUERY    [conn8] Only one plan is available; it will be run but will not be cached. query: { $where: "sleep(10)" } sort: {} projection: {}, planSummary: COLLSCAN
2018-02-28T17:50:49.252+0100 D STORAGE  [conn8] WT begin_transaction for snapshot id 199717
2018-02-28T17:50:49.264+0100 D STORAGE  [conn8] WT rollback_transaction for snapshot id 199717
2018-02-28T17:50:49.264+0100 D STORAGE  [conn8] WT begin_transaction for snapshot id 199718
2018-02-28T17:50:49.274+0100 D STORAGE  [WTJournalFlusher] flushed journal
2018-02-28T17:50:49.289+0100 E QUERY    [conn8] Plan executor error during find command: DEAD, stats: { stage: "COLLSCAN", filter: { $where: CodeWScope( sleep(10), {}) }, nReturned: 0, executionTimeMillisEstimate: 34, works: 4, advanced: 0, needTime: 4, needYield: 0, saveState: 1, restoreState: 1, isEOF: 0, invalidates: 0, direction: "forward", docsExamined: 3 }
2018-02-28T17:50:49.289+0100 D QUERY    [js] ImplScope 0x7f902eead640 unregistered for op 8133
2018-02-28T17:50:49.289+0100 D QUERY    [js] ImplScope 0x7f902eead640 unregistered for op 8133
2018-02-28T17:50:49.289+0100 I QUERY    [js] MozJS GC prologue heap stats -  total: 1970175 limit: 1153433600
2018-02-28T17:50:49.290+0100 I QUERY    [js] MozJS GC epilogue heap stats -  total: 352887 limit: 1153433600
2018-02-28T17:50:49.290+0100 I QUERY    [js] MozJS GC prologue heap stats -  total: 244591 limit: 1153433600
2018-02-28T17:50:49.291+0100 I QUERY    [js] MozJS GC epilogue heap stats -  total: 61415 limit: 1153433600
2018-02-28T17:50:49.291+0100 D STORAGE  [conn8] WT rollback_transaction for snapshot id 199718
2018-02-28T17:50:49.291+0100 I COMMAND  [conn8] command dialo_dev.event appName: "MongoDB Shell" command: find { find: "event", filter: { $where: "sleep(10)" }, maxTimeMS: 10.0, $db: "dialo_dev" } planSummary: COLLSCAN numYields:1 reslen:159 locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 } }, Collection: { acquireCount: { r: 2 } } } protocol:op_msg 39ms
2018-02-28T17:50:49.293+0100 D COMMAND  [conn8] run command dialo_dev.$cmd { isMaster: 1.0, forShell: 1.0, $db: "dialo_dev" }
2018-02-28T17:50:49.293+0100 D NETWORK  [conn8] Starting server-side compression negotiation
2018-02-28T17:50:49.302+0100 D NETWORK  [conn8] Compression negotiation not requested by client
2018-02-28T17:50:49.302+0100 I COMMAND  [conn8] command dialo_dev.$cmd appName: "MongoDB Shell" command: isMaster { isMaster: 1.0, forShell: 1.0, $db: "dialo_dev" } numYields:0 reslen:208 locks:{} protocol:op_msg 8ms
2018-02-28T17:50:49.377+0100 D STORAGE  [WTJournalFlusher] flushed journal
2018-02-28T17:50:49.481+0100 D STORAGE  [WTJournalFlusher] flushed journal
2018-02-28T17:50:49.582+0100 D STORAGE  [WTJournalFlusher] flushed journal
2018-02-28T17:50:49.686+0100 D STORAGE  [WTJournalFlusher] flushed journal
2018-02-28T17:50:49.792+0100 D STORAGE  [WTJournalFlusher] flushed journal
2018-02-28T17:50:49.796+0100 D COMMAND  [conn8] run command dialo_dev.$cmd { find: "event", filter: { $where: "sleep(10)" }, maxTimeMS: 10.0, $db: "dialo_dev" }
2018-02-28T17:50:49.796+0100 D QUERY    [js] SMScope 0x7f8f98d4a400 registered for op 8135
2018-02-28T17:50:49.796+0100 D STORAGE  [conn8] NamespaceUUIDCache: registered namespace dialo_dev.event with UUID c2727c1e-6b28-42f5-bc43-913029cf0d3d
2018-02-28T17:50:49.796+0100 D QUERY    [conn8] Beginning planning...
=============================
Options = INDEX_INTERSECTION SPLIT_LIMITED_SORT CANNOT_TRIM_IXISECT
Canonical query:
ns=dialo_dev.eventTree: $where
    code: sleep(10)
    scope: {}
Sort: {}
Proj: {}
=============================
2018-02-28T17:50:49.796+0100 D QUERY    [conn8] Index 0 is kp: { _id: 1 } unique name: '_id_' io: { v: 2, key: { _id: 1 }, name: "_id_", ns: "dialo_dev.event" }
2018-02-28T17:50:49.796+0100 D QUERY    [conn8] Index 1 is kp: { _id.type: 1, computedData.idEvent: 1, computedData.bodyType: 1, computedData.engineLevel: 1 } name: 'event' io: { v: 2, key: { _id.type: 1, computedData.idEvent: 1, computedData.bodyType: 1, computedData.engineLevel: 1 }, name: "event", ns: "dialo_dev.event" }
2018-02-28T17:50:49.796+0100 D QUERY    [conn8] Index 2 is kp: { _id.functionalId: 1 } name: '_id.functionalId_1' io: { v: 2, key: { _id.functionalId: 1 }, name: "_id.functionalId_1", ns: "dialo_dev.event" }
2018-02-28T17:50:49.796+0100 D QUERY    [conn8] Index 3 is kp: { type: 1.0, computedData.idEvent: 1.0, computedData.bodyType: 1.0, computedData.engineLevel: 1.0 } name: 'type_1_computedData.idEvent_1_computedData.bodyType_1_computedData.engineLevel_1' io: { v: 2, key: { type: 1.0, computedData.idEvent: 1.0, computedData.bodyType: 1.0, computedData.engineLevel: 1.0 }, name: "type_1_computedData.idEvent_1_computedData.bodyType_1_computedData.engineLevel_1", ns: "dialo_dev.event" }
2018-02-28T17:50:49.796+0100 D QUERY    [conn8] Rated tree:
$where
    code: sleep(10)
    scope: {}
2018-02-28T17:50:49.796+0100 D QUERY    [conn8] Planner: outputted 0 indexed solutions.
2018-02-28T17:50:49.843+0100 D QUERY    [js] SMScope 0x7f8ed9924c00 registered for op 8135
2018-02-28T17:50:49.843+0100 D COMMAND  [conn8] run command dialo_dev.$cmd { find: "system.js", $readPreference: { mode: "secondaryPreferred" }, $db: "dialo_dev" }
2018-02-28T17:50:49.843+0100 D -        [conn8] User Assertion: 50:operation exceeded time limit src/mongo/db/operation_context.cpp 174
2018-02-28T17:50:49.843+0100 D COMMAND  [conn8] assertion while executing command 'find' on database 'dialo_dev' with arguments '{ find: "system.js", $readPreference: { mode: "secondaryPreferred" }, $db: "dialo_dev" }': ExceededTimeLimit: operation exceeded time limit
2018-02-28T17:50:49.843+0100 I COMMAND  [conn8] command dialo_dev.$cmd appName: "MongoDB Shell" command: find { find: "system.js", $readPreference: { mode: "secondaryPreferred" }, $db: "dialo_dev" } exception: operation exceeded time limit code:ExceededTimeLimit numYields:0 reslen:106 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_msg 0ms
2018-02-28T17:50:49.844+0100 D -        [conn8] User Assertion: 50:operation exceeded time limit src/mongo/client/dbclientcursor.cpp 289
2018-02-28T17:50:49.844+0100 D QUERY    [js] ImplScope 0x7f902eead640 unregistered for op 8135
2018-02-28T17:50:49.844+0100 D -        [conn8] User Assertion: 50:operation exceeded time limit src/mongo/db/matcher/expression_where.cpp 117
2018-02-28T17:50:49.844+0100 D QUERY    [js] ImplScope 0x7f902eead640 unregistered for op 8135
2018-02-28T17:50:49.844+0100 I QUERY    [js] MozJS GC prologue heap stats -  total: 1991428 limit: 1153433600
2018-02-28T17:50:49.845+0100 I QUERY    [js] MozJS GC epilogue heap stats -  total: 369068 limit: 1153433600
2018-02-28T17:50:49.845+0100 I QUERY    [js] MozJS GC prologue heap stats -  total: 260772 limit: 1153433600
2018-02-28T17:50:49.845+0100 I QUERY    [js] MozJS GC epilogue heap stats -  total: 77596 limit: 1153433600
2018-02-28T17:50:49.845+0100 D COMMAND  [conn8] assertion while executing command 'find' on database 'dialo_dev' with arguments '{ find: "event", filter: { $where: "sleep(10)" }, maxTimeMS: 10.0, $db: "dialo_dev" }': ExceededTimeLimit: operation exceeded time limit
2018-02-28T17:50:49.845+0100 I COMMAND  [conn8] command dialo_dev.event appName: "MongoDB Shell" command: find { find: "event", filter: { $where: "sleep(10)" }, maxTimeMS: 10.0, $db: "dialo_dev" } exception: operation exceeded time limit code:ExceededTimeLimit numYields:0 reslen:106 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_msg 49ms

Comment by Charles Sarrazin (Inactive) [ 28/Feb/18 ]

Only one of the two commands appear in the log lines:

2018-02-28T17:48:18.325+0100 E QUERY    [conn8] Plan executor error during find command: DEAD, stats: { stage: "COLLSCAN", filter: { $where: CodeWScope( sleep(10), {}) }, nReturned: 0, executionTimeMillisEstimate: 21, works: 3, advanced: 0, needTime: 3, needYield: 0, saveState: 0, restoreState: 0, isEOF: 0, invalidates: 0, direction: "forward", docsExamined: 2 }

Comment by Charles Sarrazin (Inactive) [ 28/Feb/18 ]

When running the queries using the JS shell:

MongoDB Enterprise > db.event.find({$where: "sleep(10)" }).maxTimeMS(10)
Error: error: {
	"ok" : 0,
	"errmsg" : "operation exceeded time limit",
	"code" : 50,
	"codeName" : "ExceededTimeLimit"
}
MongoDB Enterprise > db.event.find({$where: "sleep(10)" }).maxTimeMS(10)
Error: error: {
	"ok" : 0,
	"errmsg" : "Executor error during find command: ExceededTimeLimit: operation exceeded time limit",
	"code" : 96,
	"codeName" : "OperationFailed"
}

The behaviour in the JavaScript shell is also inconsistent. The same query run twice seems to sometimes return a code 50, sometimes a code 96 (this seem to happen randomly).

Comment by Ross Lawley [ 28/Feb/18 ]

Moved to SERVER for review.

One thing to note is it looks like the original error message may have been wrapped: Executor error during find command: ExceededTimeLimit: operation exceeded time limit

Comment by Charles Sarrazin (Inactive) [ 28/Feb/18 ]

Good point. Indeed, we have the following full error: Query failed with error code 96 and error message 'Executor error during find command: ExceededTimeLimit: operation exceeded time limit' on server 127.0.0.5:53993'

So I guess the issue with the code is server-side.

Comment by Ross Lawley [ 28/Feb/18 ]

Hi charles.sarrazin,

I don't think this is a driver bug - error code 96: is a generic Operation Failed Exception https://github.com/mongodb/mongo/blob/r3.6.3/src/mongo/base/error_codes.err#L98

So we are reporting as per the servers description. It's strange that they don't use code: 50 here.

Perhaps move to Server?

Ross

Generated at Thu Feb 08 04:33:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.