[SERVER-22349] Potential hang in javascript if killOp() occurs while loading system.js functions
Created: 29/Jan/16  Updated: 18/Nov/16  Resolved: 08/Feb/16

Status: Closed
Project: Core Server
Component/s: JavaScript, MapReduce
Affects Version/s: 3.1.7
Fix Version/s: 3.2.3, 3.3.2

Type: Bug Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Mira Carey
Resolution: Done Votes: 0
Labels: code-only
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Sprint: Platforms 10 (02/19/16)
Participants:
Linked BF Score: 0

 Description   

After injecting the functions from the system.js collection into the global scope of the JavaScript context, map-reduce acquires a lock on the database in MODE_S. The operation context is checked for interruption only once every 100 times the plan executor is ADVANCED. If the mapper function takes a long time to execute, operations that need a lock on the database in MODE_X or MODE_IX will block behind the JavaScript execution. It therefore seems undesirable to start executing potentially long-running JavaScript (e.g. the mapper or reducer) when the map-reduce operation has already been interrupted during Scope::loadStored().
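The periodic check described above can be sketched as follows. This is a minimal, self-contained illustration and not the server's code: FakeOpContext, runMapPhase, and the interval parameter are hypothetical names, and the real plan executor loop differs in detail.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the server's operation context.
struct FakeOpContext {
    std::atomic<bool> killed{false};

    // Throws if a kill was requested, loosely mirroring checkForInterrupt().
    void checkForInterrupt() const {
        if (killed.load()) {
            throw std::runtime_error("operation was interrupted");
        }
    }
};

// Runs `mapper` once per document and polls for interruption before every
// `interval`-th document. `processed` counts completed mapper calls so the
// caller can see how far the loop got before an interrupt surfaced.
void runMapPhase(const FakeOpContext& opCtx,
                 std::size_t totalDocs,
                 const std::function<void(std::size_t)>& mapper,
                 std::size_t& processed,
                 std::size_t interval = 100) {
    assert(interval > 0 && "interval must be positive");
    processed = 0;
    for (std::size_t i = 0; i < totalDocs; ++i) {
        if (i % interval == 0) {
            opCtx.checkForInterrupt();
        }
        mapper(i);
        ++processed;
    }
}
```

In this sketch a kill requested while document 150 is being mapped is only noticed at document 200. If the mapper never returns, the next poll is never reached, so an interrupt that was lost before the loop started cannot be recovered here.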

Some of the test cases in mr_killop.js use non-terminating mapper and reducer functions. This has been the source of build failures when running mr_killop.js as part of the parallel suite (i.e. jstests/parallel/basic.js and jstests/parallel/basicPlus.js) because the killOp() occurs while loading the stored functions.

I have only been able to reproduce this issue when running with SpiderMonkey (the default JavaScript engine since 3.1.7), but I'm not familiar enough with the V8 integration to know for certain that it isn't also affected.



 Comments   
Comment by Githook User [ 10/Feb/16 ]

Author: Jason Carey (hanumantmk) <jcarey@argv.me>

Message: SERVER-22349 Throw interruptions from loadStored

The JS engine's loadStored eats exceptions that occur while it's loading
functions from system.js. This also eats interruption exceptions, which
can lead to a situation where a map reduce job is killed during
loadStored, but the interrupt is lost. For tests where the map or reduce
stages are long or non-terminating, and we rely on killing them, this
can lead to hangs.

Re-throwing interrupts from the try/catch block around loadStored fixes
this behavior.

(cherry picked from commit 93f767caeebda5ffd295f935e734e0bf02da3356)
Branch: v3.2
https://github.com/mongodb/mongo/commit/ef07be66578231cc7bb7d2958ccb4fd4ad4dc391

Comment by Githook User [ 08/Feb/16 ]

Author: Jason Carey (hanumantmk) <jcarey@argv.me>

Message: SERVER-22349 Throw interruptions from loadStored

The JS engine's loadStored eats exceptions that occur while it's loading
functions from system.js. This also eats interruption exceptions, which
can lead to a situation where a map reduce job is killed during
loadStored, but the interrupt is lost. For tests where the map or reduce
stages are long or non-terminating, and we rely on killing them, this
can lead to hangs.

Re-throwing interrupts from the try/catch block around loadStored fixes
this behavior.
Branch: master
https://github.com/mongodb/mongo/commit/93f767caeebda5ffd295f935e734e0bf02da3356
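The behavior described in this commit message, swallowing ordinary load errors while letting interruptions propagate, might look roughly like the sketch below. It is an illustration under assumptions, not the committed change: InterruptedError and loadStoredSketch are hypothetical names, and each stored function is modeled as a plain callable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical exception type standing in for an interruption status.
struct InterruptedError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Loads each stored function in turn and returns how many loaded cleanly.
// Ordinary failures are ignored, as before the fix; interruptions are
// re-thrown so the caller sees that the operation was killed.
std::size_t loadStoredSketch(const std::vector<std::function<void()>>& loaders) {
    std::size_t loaded = 0;
    for (const auto& load : loaders) {
        assert(load && "each loader must be callable");
        try {
            load();
            ++loaded;
        } catch (const InterruptedError&) {
            throw;  // never swallow a kill request
        } catch (const std::exception&) {
            // A broken stored function should not abort the whole load.
        }
    }
    return loaded;
}
```

The InterruptedError handler has to come before the generic std::exception handler, because handlers are tried in order and the derived type would otherwise be caught and discarded by the base-class handler.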

Comment by Githook User [ 05/Feb/16 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: Revert "SERVER-22349 Throw exceptions from loadStored"

This reverts commit dfc320fe9c8a5227b08c77a87f52996cf40b0206.
Branch: master
https://github.com/mongodb/mongo/commit/22c3c3d13f8d2557c4be9c2060bcd8026023ebf8

Comment by Githook User [ 05/Feb/16 ]

Author: Jason Carey (hanumantmk) <jcarey@argv.me>

Message: SERVER-22349 Throw exceptions from loadStored

The JS engine's loadStored eats exceptions that occur while it's loading
functions from system.js. This also eats interruption exceptions, which
can lead to a situation where a map reduce job is killed during
loadStored, but the interrupt is lost. For tests where the map or reduce
stages are long or non-terminating, and we rely on killing them, this
can lead to hangs.

Removing the try/catch block around loadStored fixes this behavior.
Branch: master
https://github.com/mongodb/mongo/commit/dfc320fe9c8a5227b08c77a87f52996cf40b0206

Comment by Mira Carey [ 05/Feb/16 ]

This can also occur in group and $where.

Comment by Max Hirschhorn [ 29/Jan/16 ]

The following patch seems to resolve the issue (in map-reduce at least). It's unclear whether this affects other places where JavaScript is evaluated on the server-side, e.g. $where and db.eval().

diff --git a/src/mongo/db/commands/mr.cpp b/src/mongo/db/commands/mr.cpp
index ee7ece9..d93b748 100644
--- a/src/mongo/db/commands/mr.cpp
+++ b/src/mongo/db/commands/mr.cpp
@@ -1360,6 +1360,7 @@ public:
 
         try {
             state.init();
+            txn->checkForInterrupt();
             state.prepTempCollection();
             ON_BLOCK_EXIT_OBJ(state, &State::dropTempCollections);
 

We may want to do one or both of the following:

  • Stop suppressing interrupted statuses in Scope::loadStored().
  • Call MozJSImplScope::_checkErrorState() prior to evaluating/executing JavaScript. It's possible that the _status member has already been set to a non-OK status and the operation has been unregistered from the engine. It seems odd that we can still use the MozJSImplScope to invoke more JavaScript under these circumstances.
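The second suggestion, refusing to run more JavaScript once the scope has recorded a non-OK status, can be sketched as below. ScopeSketch, kill(), and invoke() are hypothetical names used for illustration only; the real MozJSImplScope tracks its status and engine registration differently.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical scope that remembers a sticky error state.
class ScopeSketch {
public:
    // Records that the operation owning this scope was killed.
    void kill(std::string reason) {
        _ok = false;
        _reason = std::move(reason);
    }

    // Checks the sticky state first, so no user code runs after a kill.
    int invoke(const std::function<int()>& fn) {
        assert(fn && "fn must be callable");
        checkErrorState();
        return fn();
    }

private:
    // Throws if an earlier kill left the scope in a non-OK state.
    void checkErrorState() const {
        if (!_ok) {
            throw std::runtime_error(_reason);
        }
    }

    bool _ok = true;
    std::string _reason;
};
```

With a guard like this, a kill that arrives while stored functions are loading still prevents the mapper or reducer from starting, even if the original exception was swallowed along the way.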