Type: Bug
Resolution: Done
Priority: Critical - P2
Fix Version/s: 3.0.5
Affects Version/s: 3.0.1, 3.0.4
Component/s: Text Search, WiredTiger
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Quint Iteration 5
Confidence Status: None
Work Order: 0
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Issue Status as of Jul 14, 2015

ISSUE SUMMARY
Long-running queries in MongoDB periodically yield their locks. As part of the yield-preparation procedure, intermediate results buffered in memory by the storage engine may need to be processed in order to ensure that there are no references into the storage layer that may become invalid when locks are relinquished.

A bug in this procedure may cause long-running $text and geoNear (i.e. $near or $nearSphere) queries, which buffer intermediate query results, to execute slowly. In particular, if such a query yields y times and buffers d documents, the overall time spent in yield-preparation is O(yd). With the fix, the time complexity is reduced to O(y).
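
To make the O(yd) cost concrete, here is a toy model in plain JavaScript. It is only a sketch under stated assumptions, not the server's code or its actual fix: `simulate`, `yieldEvery`, and `scanAll` are hypothetical names, and the "incremental" strategy is just one plausible way to avoid rescanning the buffer. It counts how many buffered entries yield-preparation visits when the whole buffer is walked on every yield, compared with walking only the entries added since the previous yield.

```javascript
// Toy model only: counts how many buffered entries yield-preparation visits.
// All names here are hypothetical and do not correspond to server internals.
function simulate(totalDocs, yieldEvery, scanAll) {
    var buffered = 0;   // entries currently buffered by the query
    var processed = 0;  // entries already handled by an earlier yield
    var visits = 0;     // total entries touched during yield-preparation
    var yields = 0;     // number of yields performed

    for (var i = 1; i <= totalDocs; i++) {
        buffered++;
        if (i % yieldEvery === 0) {
            yields++;
            if (scanAll) {
                // Walk the entire buffer again on every yield.
                visits += buffered;
            } else {
                // Walk only the entries added since the last yield.
                visits += buffered - processed;
                processed = buffered;
            }
        }
    }
    return {yields: yields, visits: visits};
}

var fullScan = simulate(10000, 100, true);
var incremental = simulate(10000, 100, false);
console.log(JSON.stringify(fullScan));
console.log(JSON.stringify(incremental));
```

With 10,000 buffered documents and a yield every 100 documents, both runs yield 100 times, but the full-scan strategy visits 505,000 entries while the incremental one visits 10,000. The gap widens as either the buffer or the number of yields grows.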

This issue only appears when queries are performed against a mongod instance running with the WiredTiger storage engine. Instances running the MMAPv1 storage engine are not affected.

USER IMPACT
On MongoDB systems using the WiredTiger storage engine, queries using $text or geoNear (i.e. $near or $nearSphere) may perform poorly. The performance impact is most severe for "large" $text or geoNear queries, i.e. queries that return a lot of results or require examining a large number of index keys or documents.

AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.4 running with the WiredTiger storage engine.

FIX VERSION
The fix is included in the 3.0.5 production release.

Original description

Created db with 5M docs full-text indexed as follows:

words = [
    " when", " in", " the", " course", " of", " human", " events", " it", " becomes", " necessary",
    " four", " score", " and", " seven", " years", " ago",
    " ask", " not", " what", " your", " country", " can", " do", " for", " you",
    " that's", " one", " small", " step", " for", " a", " man",
]

function sentence() {
    // Build a random sentence of up to 20 words drawn from `words`.
    var l = Math.random() * 20
    var s = " "
    for (var i=0; i<l; i++) {
        var w = Math.floor(Math.random() * words.length)
        s = s + words[w]
    }
    return s
}

function init() {
    db.c.drop()
    db.c.createIndex({x: "text"})
}

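function create() {
    // Insert `count` documents in unordered bulk batches of `every`.
    var count = 5000000
    var every = 10000
    for (var i=0; i<count; ) {
        var bulk = db.c.initializeUnorderedBulkOp();
        for (var j=0; j<every; j++, i++)
            bulk.insert({x:sentence()})
        bulk.execute();
        print(i)  // progress: documents inserted so far
    }
}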

Then do a full-text search:

db.c.find({$text: {$search: "necessary"}}).itcount()

This finishes in about 10 seconds under mmapv1, but ran for several minutes without finishing under WT before I killed it. It also uses more memory (possibly a lot more) under WT than under mmapv1.

It seems to be spending all its time in forceFetchAllLocs (in the function itself, not its callees), called from yield. This path is only exercised if the storage engine has document-level locking, which explains why WT behaves differently than mmapv1.


stacks.png
191 kB
Jun 11 2015 04:40:48 PM UTC

is related to
SERVER-18961 Avoid iterating the entire working set on every yield (Closed)

related to
SERVER-19489 Assertion failure and segfault in WorkingSet::free in 3.0.5-rc0 (Closed)
SERVER-26534 Text search uses excessive memory (Closed)

Assignee: David Storch
Reporter: Bruce Lucas (Inactive)
Participants: Bruce Lucas, Dave Withers, David Storch, Githook User, J Rassi
Votes: 0
Watchers: 57

Created: Jun 11 2015 04:36:08 PM UTC
Updated: Oct 08 2016 01:34:20 PM UTC
Resolved: Jun 17 2015 04:42:45 PM UTC
