[SERVER-22279] SubplanStage fails to register its MultiPlanStage Created: 22/Jan/16  Updated: 06/Sep/18  Resolved: 26/Jan/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.1, 3.3.0
Fix Version/s: 3.2.3, 3.3.1

Type: Bug Priority: Major - P3
Reporter: Marco Biscaro Assignee: Mathias Stearn
Resolution: Done Votes: 0
Labels: code-only
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File dump-bug_test.tar.gz    
Issue Links:
Depends
Duplicate
is duplicated by SERVER-22288 Performance degradation after upgrade... Closed
is duplicated by SERVER-22482 Cache growing to 100% followed by crash Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Steps To Reproduce:

1. Restore the attached dump on an instance using WiredTiger
2. Run the following query and verify it works fine

var userId = ObjectId('558c7b28ccb061d504e0692c');
var queryStart = ISODate('2016-01-14T09:00:00.000-0200');
var queryEnd = ISODate('2016-01-14T12:00:00.000-0200');
 
db.bug_test.count({ $or: [
    { userId: userId, end: { $gte: queryStart, $lt: queryEnd } },
    { userId: userId, start: { $gte: queryStart, $lt: queryEnd } },
    { userId: userId, start: { $lt: queryStart }, end: { $gt: queryEnd } }
]});

3. Create the following indexes:

db.bug_test.createIndex({ userId: 1, start: 1 });
db.bug_test.createIndex({ userId: 1, end: 1 });

4. Run the query from step 2 again; the bug will be triggered (see the sketch after these steps for one way to observe the effect)
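
As a way to observe the effect beyond the log messages (a sketch, not part of the original report), the same predicate can be run under explain with executionStats verbosity, and executionTimeMillis compared before and after the indexes from step 3 exist:

// Reuses the userId/queryStart/queryEnd variables defined in step 2
db.bug_test.find({ $or: [
    { userId: userId, end: { $gte: queryStart, $lt: queryEnd } },
    { userId: userId, start: { $gte: queryStart, $lt: queryEnd } },
    { userId: userId, start: { $lt: queryStart }, end: { $gt: queryEnd } }
]}).explain("executionStats");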

Sprint: Integration F (02/01/16)
Participants:
Case:

 Description   
Issue Status as of Jan 27, 2016

ISSUE SUMMARY
Operations where the predicate is a rooted $or use a special execution path in the query optimizer to perform query plan selection. A rooted $or is a predicate that consists exclusively of a top-level $or statement (although its subexpressions may be arbitrarily complex). For example,

{$or: [{a: 1}, {b: 1}]}

is a rooted $or. By contrast,

{$or: [{a: 1}, {b: 1}], c: 1}

is not a rooted $or due to the implicit AND with the predicate on the c field.
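As a quick check (a sketch, not part of the original advisory), explain() shows whether a query takes this subplanning path: with a suitable index for each $or branch, the winning plan is rooted at a SUBPLAN stage, whereas the non-rooted form is planned normally. The collection name coll and the indexes on a and b below are hypothetical, and the stage name should be verified against your server version:

// Hypothetical setup: an index for each $or branch
db.coll.createIndex({ a: 1 });
db.coll.createIndex({ b: 1 });

// Rooted $or: the winning plan is expected to show a top-level SUBPLAN stage
printjson(db.coll.find({ $or: [{ a: 1 }, { b: 1 }] }).explain().queryPlanner.winningPlan);

// Not a rooted $or (implicit AND with c): planned through the normal path
printjson(db.coll.find({ $or: [{ a: 1 }, { b: 1 }], c: 1 }).explain().queryPlanner.winningPlan);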

A bug in the plan selection path for such predicates may cause the server to improperly save index cursor state in preparation for yielding locks or to improperly restore index cursor state when re-acquiring locks after a yield.

USER IMPACT
The following commands, if using a rooted $or predicate, are affected:

  • aggregate
  • count
  • delete
  • distinct
  • find
  • findAndModify
  • group
  • mapReduce
  • update

For servers running the WiredTiger storage engine, this issue could cause poor performance due to suboptimal plan selection for a rooted $or query. The following log message is indicative of this issue:

WTIndex::updatePosition -- the new key (...) is less than the previous key (...), which is a bug

Servers running MMAPv1 may experience the following symptoms:

  • Poor performance due to suboptimal plan selection for a rooted $or query.
  • Failed rooted $or queries due to breaking internal invariants.
  • In rare cases, reads from freed memory or other undefined behavior may cause the server to abort.

WORKAROUNDS
The special execution path for rooted $or queries can be disabled at startup or at runtime via the internalQueryPlanOrChildrenIndependently setParameter. You can disable this parameter from the shell as shown below:

db.adminCommand({setParameter: 1, internalQueryPlanOrChildrenIndependently: false});

This configuration option should be set with care, as all rooted $or queries will be diverted from their typical execution path, potentially leading to a loss in performance for such queries.
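For reference (a sketch, not part of the original notice), the current value can be inspected with getParameter, and the parameter can be turned back on once a fixed version is running; at startup the equivalent is the --setParameter command-line option:

// Inspect the current value of the parameter
db.adminCommand({ getParameter: 1, internalQueryPlanOrChildrenIndependently: 1 });

// Re-enable the subplanning path after upgrading to a fixed version
db.adminCommand({ setParameter: 1, internalQueryPlanOrChildrenIndependently: true });

// At startup, the equivalent is (assumed syntax):
//   mongod --setParameter internalQueryPlanOrChildrenIndependently=false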

AFFECTED VERSIONS
MongoDB 3.2.0 and 3.2.1.

FIX VERSION
The fix is included in the 3.2.3 release.

Original description

This causes calls such as save(), restore(), and invalidate() to fail to propagate to the subplans if a yield occurs while a plan is being chosen for the subqueries. This can result in incorrect query execution.

This was introduced by https://github.com/mongodb/mongo/commit/c832bc75#diff-d16091f48e3ffb1cc4c4cc6fd6319c2bR326.

Original Title and Description:

WTIndex::updatePosition - the new key is less than the previous key

When I run a query that uses compound indexes, the following message is shown several times per second in the log:

2016-01-22T15:21:16.865-0200 I STORAGE  [conn2] WTIndex::updatePosition -- the new key ( 64558C7B28CCB061D504E068AB78800001523FD86A6C042087F1) is less than the previous key (64558C7B28CCB061D504E0692C78800001523FE4DFA40420EED9), which is a bug.

A dump to reproduce the bug is attached.

If I run the query without creating indexes, it works fine on both MMAPv1 and WiredTiger:

var userId = ObjectId('558c7b28ccb061d504e0692c');
var queryStart = ISODate('2016-01-14T09:00:00.000-0200');
var queryEnd = ISODate('2016-01-14T12:00:00.000-0200');
 
db.bug_test.count({ $or: [
    { userId: userId, end: { $gte: queryStart, $lt: queryEnd } },
    { userId: userId, start: { $gte: queryStart, $lt: queryEnd } },
    { userId: userId, start: { $lt: queryStart }, end: { $gt: queryEnd } }
]});

If I create the following indexes, it works fine on MMAPv1 but triggers the bug on WiredTiger:

db.bug_test.createIndex({ userId: 1, start: 1 });
db.bug_test.createIndex({ userId: 1, end: 1 });



 Comments   
Comment by Githook User [ 27/Jan/16 ]

Author:

Mathias Stearn (RedBeard0531) <mathias@10gen.com>

Message: SERVER-22279 SubplanStage must register its MultiPlanStage in _children

Failure to do this caused save() and restore() not to be passed through to
the MultiPlanStage (and ultimately the WiredTigerIndexCursor) when
MultiPlanStage::pickBestPlan yielded.

(cherry picked from commit 27e452997f8be0ee0f513f9f55e1ba4fad4bc027)
Branch: v3.2
https://github.com/mongodb/mongo/commit/b5e9e4ab04a25424c5c3fffac62756d9d0aaaa83

Comment by Githook User [ 26/Jan/16 ]

Author:

Mathias Stearn (RedBeard0531) <mathias@10gen.com>

Message: SERVER-22279 SubplanStage must register its MultiPlanStage in _children

Failure to do this caused save() and restore() not to be passed through to
the MultiPlanStage (and ultimately the WiredTigerIndexCursor) when
MultiPlanStage::pickBestPlan yielded.
Branch: master
https://github.com/mongodb/mongo/commit/27e452997f8be0ee0f513f9f55e1ba4fad4bc027

Comment by Marco Biscaro [ 22/Jan/16 ]

Curious... I hadn't noticed the huge number of yields. With MMAPv1, this number is pretty low, even on a "cold" query (the first query, right after booting the OS):

// cold query, numYields = 72
2016-01-22T17:40:10.341-0200 I COMMAND  [conn1] command test.bug_test command: count { count: "bug_test", query: { $or: [ { userId: ObjectId('558c7b28ccb061d504e0692c'), end: { $gte: new Date(1452769200000), $lt: new Date(1452780000000) } }, { userId: ObjectId('558c7b28ccb061d504e0692c'), start: { $gte: new Date(1452769200000), $lt: new Date(1452780000000) } }, { userId: ObjectId('558c7b28ccb061d504e0692c'), start: { $lt: new Date(1452769200000) }, end: { $gt: new Date(1452780000000) } } ] }, fields: {} } planSummary: IXSCAN { userId: 1.0, start: 1.0 }, IXSCAN { userId: 1.0, start: 1.0 }, IXSCAN { userId: 1.0, end: 1.0 } keyUpdates:0 writeConflicts:0 numYields:72 reslen:47 locks:{ Global: { acquireCount: { r: 146 } }, MMAPV1Journal: { acquireCount: { r: 73 } }, Database: { acquireCount: { r: 73 } }, Collection: { acquireCount: { R: 73 } } } protocol:op_command 342ms
 
// warm query, numYields = 21
2016-01-22T17:40:12.461-0200 I COMMAND  [conn1] command test.bug_test command: count { count: "bug_test", query: { $or: [ { userId: ObjectId('558c7b28ccb061d504e0692c'), end: { $gte: new Date(1452769200000), $lt: new Date(1452780000000) } }, { userId: ObjectId('558c7b28ccb061d504e0692c'), start: { $gte: new Date(1452769200000), $lt: new Date(1452780000000) } }, { userId: ObjectId('558c7b28ccb061d504e0692c'), start: { $lt: new Date(1452769200000) }, end: { $gt: new Date(1452780000000) } } ] }, fields: {} } planSummary: IXSCAN { userId: 1.0, start: 1.0 }, IXSCAN { userId: 1.0, start: 1.0 }, IXSCAN { userId: 1.0, end: 1.0 } keyUpdates:0 writeConflicts:0 numYields:21 reslen:47 locks:{ Global: { acquireCount: { r: 44 } }, MMAPV1Journal: { acquireCount: { r: 22 } }, Database: { acquireCount: { r: 22 } }, Collection: { acquireCount: { R: 22 } } } protocol:op_command 9ms

Comment by Scott Hernandez (Inactive) [ 22/Jan/16 ]

Marco, I see that when these log messages happen (probably one for each yield/write conflict) the operation also slows down, due to the index seek/next being restarted:

// Bad run
2016-01-22T13:19:46.418-0500 I COMMAND  [conn1] command test.bug_test command: count { count: "bug_test", query: { $or: [ { userId: ObjectId('558c7b28ccb061d504e0692c'), end: { $gte: new Date(1452769200000), $lt: new Date(1452780000000) } }, { userId: ObjectId('558c7b28ccb061d504e0692c'), start: { $gte: new Date(1452769200000), $lt: new Date(1452780000000) } }, { userId: ObjectId('558c7b28ccb061d504e0692c'), start: { $lt: new Date(1452769200000) }, end: { $gt: new Date(1452780000000) } } ] }, fields: {} } planSummary: IXSCAN { userId: 1.0, start: 1.0 }, IXSCAN { userId: 1.0, end: 1.0 }, IXSCAN { userId: 1.0, end: 1.0 } keyUpdates:0 writeConflicts:0 numYields:59687 reslen:47 locks:{ Global: { acquireCount: { r: 119376 } }, Database: { acquireCount: { r: 59688 } }, Collection: { acquireCount: { r: 59688 } } } protocol:op_command 1606ms
 
//Good run, immediately after bad run
2016-01-22T13:24:03.105-0500 I COMMAND  [conn1] command test.bug_test command: count { count: "bug_test", query: { $or: [ { userId: ObjectId('558c7b28ccb061d504e0692c'), end: { $gte: new Date(1452769200000), $lt: new Date(1452780000000) } }, { userId: ObjectId('558c7b28ccb061d504e0692c'), start: { $gte: new Date(1452769200000), $lt: new Date(1452780000000) } }, { userId: ObjectId('558c7b28ccb061d504e0692c'), start: { $lt: new Date(1452769200000) }, end: { $gt: new Date(1452780000000) } } ] }, fields: {} } planSummary: IXSCAN { userId: 1.0, start: 1.0 }, IXSCAN { userId: 1.0, end: 1.0 }, IXSCAN { userId: 1.0, end: 1.0 } keyUpdates:0 writeConflicts:0 numYields:20 reslen:47 locks:{ Global: { acquireCount: { r: 42 } }, Database: { acquireCount: { r: 21 } }, Collection: { acquireCount: { r: 21 } } } protocol:op_command 14ms

Also interesting: after the first run, it is fast and doesn't log any messages. And when the server is restarted, it is slow twice (not just once, as after the index is created) before it becomes fast again.

Comment by Marco Biscaro [ 22/Jan/16 ]

Hi Scott.

The only evidence of the problem I could find is the messages in the log, but the query hangs until a large number of messages have been printed (I guess it prints this message for each document in the collection).

When I run this query on a collection with millions of documents, the server hangs for minutes while the messages are printed. Eventually, the correct result is returned.

Comment by Scott Hernandez (Inactive) [ 22/Jan/16 ]

Marco, we will try to reproduce this as well. Thanks for the reproduction steps and data.

Other than the informational message in the logs, is there any problem affecting your client or application?
