[SERVER-36087] Executing $text statements in conjunction with 'count' or 'sort' provokes Out Of Memory Created: 12/Jul/18  Updated: 27/Jul/23  Resolved: 27/Aug/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.6.6
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Andreas Estermann Assignee: David Storch
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File diagnostics.tar.gz     File metrics.2018-08-13T09-48-27Z-00000     File metrics.2018-08-13T13-11-05Z-00000     File metrics.2018-08-13T14-39-37Z-00000     File metrics.interim     File metrics.interim     Text File mongod.log     Text File mongod.log    
Issue Links:
Duplicate
duplicates SERVER-24375 Deduping in OR, SORT_MERGE, and IXSCA... Backlog
is duplicated by SERVER-79244 Text search with relevance sort consu... Waiting For User Input
Related
related to SERVER-36794 Non-blocking $text plans with just on... Closed
is related to SERVER-26534 Text search uses excessive memory Backlog
Operating System: ALL
Steps To Reproduce:

1) Create a database with 1 collection

2) Insert 20 million documents in that collection

3) documents contain:

  • ID field
  • Text field (with a text index) 'text'
  • Date field (with an index) 'start'

4) Text field has the value 'andreas.estermann@dynatrace.com' for all documents. Date field values are distributed over time.

5) Issue the following queries in the mongodb shell:

db.collection.find({"text": "andreas.estermann@dynatrace.com"})
db.collection.find({"text": "andreas.estermann@dynatrace.com"}).count()
db.collection.find({"text": "andreas.estermann@dynatrace.com"}).limit(25)
db.collection.find({"text": "andreas.estermann@dynatrace.com"}).limit(25).sort({"start": -1})
db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }})
db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).count()
db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25)
db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25).sort({"start": -1})

6) Find the following results for the queries

db.collection.find({"text": "andreas.estermann@dynatrace.com"})

-> executes within milliseconds, result correct, expected

db.collection.find({"text": "andreas.estermann@dynatrace.com"}).count()

-> executes within about 1 minute, result correct, expected

db.collection.find({"text": "andreas.estermann@dynatrace.com"}).limit(25)

-> executes within milliseconds, result correct, expected

db.collection.find({"text": "andreas.estermann@dynatrace.com"}).limit(25).sort({"start": -1})

-> executes within milliseconds, result correct, expected

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }})

-> executes within milliseconds, result correct, expected

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).count()

-> crashes server because of excessive memory consumption, not expected

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25)

-> executes within milliseconds, result correct, expected

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25).sort({"start": -1})

-> crashes server because of excessive memory consumption, not expected

Participants:

 Description   

Executing a full text search and then executing additional functions on the result set seems to result in MongoDB using all the server memory which ultimately leads to a crash.

The machine used to reproduce the issue only has 4 GB of memory. However, the same behavior can be observed on our production machines that have a lot more memory. It just takes longer (up to 40 minutes) until the memory is used up and the server crashes.



 Comments   
Comment by David Storch [ 27/Aug/18 ]

Thanks estermann@qumram.com. The additional data you have provided confirms that this is a duplicate of SERVER-24375. Please watch and vote for this ticket, as well as SERVER-36794 which may help reduce memory consumption for some $text queries.

The reason that some of your queries provoke a crash due to OOM, but not others, is that only some of these example queries require the full set of matching index keys to be scanned and deduplicated. The queries without the $text predicate are executing as collection scans, so no deduping is necessary. That explains why only $text queries are crashing due to memory consumption. The $text query with a .limit() but not a .sort() can stop scanning the index as soon as the limit is satisfied. This partial scan means that the set of in-memory RecordIds doesn't grow very large, so the query can complete successfully. The $text query with no limit and no sort, on the other hand, has to scan the entire index, and thus its in-memory set of RecordIds becomes too large. The same is true for the .count() variant and the .sort({start: -1}).limit(25) variant, as the sort requires us to look at all matching documents in order to be sure that we are returning the 25 smallest.

I am closing this ticket as a duplicate of SERVER-24375. Please let me know if you have any further questions.

Best,
Dave

Comment by Andreas Estermann [ 24/Aug/18 ]

Here are the results I have been able to get:

1)

Can you confirm that

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).itcount();

does not crash the server, and provide the resulting count? Please note that this is .itcount() as opposed to .count().

No, this query does crash the server as well

2)

Please provide the output of

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).count();

-> Crash

3)

For each of the queries you describe above:


db.collections.find({"text": "andreas.estermann@dynatrace.com"}).explain("allPlansExecution")

{
 "queryPlanner" : {
 "plannerVersion" : 1,
 "namespace" : "db.collections",
 "indexFilterSet" : false,
 "parsedQuery" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "winningPlan" : {
 "stage" : "COLLSCAN",
 "filter" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "direction" : "forward"
 },
 "rejectedPlans" : [ ]
 },
 "executionStats" : {
 "executionSuccess" : true,
 "nReturned" : 20000008,
 "executionTimeMillis" : 310389,
 "totalKeysExamined" : 0,
 "totalDocsExamined" : 20000008,
 "executionStages" : {
 "stage" : "COLLSCAN",
 "filter" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "nReturned" : 20000008,
 "executionTimeMillisEstimate" : 304871,
 "works" : 20000010,
 "advanced" : 20000008,
 "needTime" : 1,
 "needYield" : 0,
 "saveState" : 158238,
 "restoreState" : 158238,
 "isEOF" : 1,
 "invalidates" : 0,
 "direction" : "forward",
 "docsExamined" : 20000008
 },
 "allPlansExecution" : [ ]
 },
 "serverInfo" : {
 "host" : "ubuntu-demo-mongodb.qinfra.ch",
 "port" : 27017,
 "version" : "3.6.6",
 "gitVersion" : "6405d65b1d6432e138b44c13085d0c2fe235d6bd"
 },
 "ok" : 1
}

 

db.collections.find({"text": "andreas.estermann@dynatrace.com"}).limit(25).explain("allPlansExecution")

{
 "queryPlanner" : {
 "plannerVersion" : 1,
 "namespace" : "db.collections",
 "indexFilterSet" : false,
 "parsedQuery" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "winningPlan" : {
 "stage" : "LIMIT",
 "limitAmount" : 25,
 "inputStage" : {
 "stage" : "COLLSCAN",
 "filter" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "direction" : "forward"
 }
 },
 "rejectedPlans" : [ ]
 },
 "executionStats" : {
 "executionSuccess" : true,
 "nReturned" : 25,
 "executionTimeMillis" : 0,
 "totalKeysExamined" : 0,
 "totalDocsExamined" : 25,
 "executionStages" : {
 "stage" : "LIMIT",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 0,
 "works" : 27,
 "advanced" : 25,
 "needTime" : 1,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 1,
 "invalidates" : 0,
 "limitAmount" : 25,
 "inputStage" : {
 "stage" : "COLLSCAN",
 "filter" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 0,
 "works" : 26,
 "advanced" : 25,
 "needTime" : 1,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "direction" : "forward",
 "docsExamined" : 25
 }
 },
 "allPlansExecution" : [ ]
 },
 "serverInfo" : {
 "host" : "ubuntu-demo-mongodb.qinfra.ch",
 "port" : 27017,
 "version" : "3.6.6",
 "gitVersion" : "6405d65b1d6432e138b44c13085d0c2fe235d6bd"
 },
 "ok" : 1
}

db.collections.find({"text": "andreas.estermann@dynatrace.com"}).limit(25).sort({"start": -1}).explain("allPlansExecution")

{
 "queryPlanner" : {
 "plannerVersion" : 1,
 "namespace" : "db.collections",
 "indexFilterSet" : false,
 "parsedQuery" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "winningPlan" : {
 "stage" : "LIMIT",
 "limitAmount" : 25,
 "inputStage" : {
 "stage" : "FETCH",
 "filter" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "inputStage" : {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "start" : 1
 },
 "indexName" : "start_1",
 "isMultiKey" : false,
 "multiKeyPaths" : {
 "start" : [ ]
 },
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 "start" : [
 "[MaxKey, MinKey]"
 ]
 }
 }
 }
 },
 "rejectedPlans" : [ ]
 },
 "executionStats" : {
 "executionSuccess" : true,
 "nReturned" : 25,
 "executionTimeMillis" : 56,
 "totalKeysExamined" : 25,
 "totalDocsExamined" : 25,
 "executionStages" : {
 "stage" : "LIMIT",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 55,
 "works" : 26,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 1,
 "restoreState" : 1,
 "isEOF" : 1,
 "invalidates" : 0,
 "limitAmount" : 25,
 "inputStage" : {
 "stage" : "FETCH",
 "filter" : {
 "text" : {
 "$eq" : "andreas.estermann@dynatrace.com"
 }
 },
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 55,
 "works" : 25,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 1,
 "restoreState" : 1,
 "isEOF" : 0,
 "invalidates" : 0,
 "docsExamined" : 25,
 "alreadyHasObj" : 0,
 "inputStage" : {
 "stage" : "IXSCAN",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 55,
 "works" : 25,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 1,
 "restoreState" : 1,
 "isEOF" : 0,
 "invalidates" : 0,
 "keyPattern" : {
 "start" : 1
 },
 "indexName" : "start_1",
 "isMultiKey" : false,
 "multiKeyPaths" : {
 "start" : [ ]
 },
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 "start" : [
 "[MaxKey, MinKey]"
 ]
 },
 "keysExamined" : 25,
 "seeks" : 1,
 "dupsTested" : 0,
 "dupsDropped" : 0,
 "seenInvalidated" : 0
 }
 }
 },
 "allPlansExecution" : [ ]
 },
 "serverInfo" : {
 "host" : "ubuntu-demo-mongodb.qinfra.ch",
 "port" : 27017,
 "version" : "3.6.6",
 "gitVersion" : "6405d65b1d6432e138b44c13085d0c2fe235d6bd"
 },
 "ok" : 1
}

db.collections.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).explain("allPlansExecution")

-> Crash

db.collections.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).explain("queryPlanner")

{
 "queryPlanner" : {
 "plannerVersion" : 1,
 "namespace" : "db.collections",
 "indexFilterSet" : false,
 "parsedQuery" : {
 "$text" : {
 "$search" : "\"andreas.estermann@dynatrace.com\"",
 "$language" : "english",
 "$caseSensitive" : false,
 "$diacriticSensitive" : false
 }
 },
 "winningPlan" : {
 "stage" : "TEXT",
 "indexPrefix" : {
 
},
 "indexName" : "ix_full_text",
 "parsedTextQuery" : {
 "terms" : [
 "andrea",
 "com",
 "dynatrac",
 "estermann"
 ],
 "negatedTerms" : [ ],
 "phrases" : [
 "andreas.estermann@dynatrace.com"
 ],
 "negatedPhrases" : [ ]
 },
 "textIndexVersion" : 3,
 "inputStage" : {
 "stage" : "TEXT_MATCH",
 "inputStage" : {
 "stage" : "FETCH",
 "inputStage" : {
 "stage" : "OR",
 "inputStages" : [
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 }
 ]
 }
 }
 }
 },
 "rejectedPlans" : [ ]
 },
 "serverInfo" : {
 "host" : "ubuntu-demo-mongodb.qinfra.ch",
 "port" : 27017,
 "version" : "3.6.6",
 "gitVersion" : "6405d65b1d6432e138b44c13085d0c2fe235d6bd"
 },
 "ok" : 1
}

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25).explain("allPlansExecution")

{
 "queryPlanner" : {
 "plannerVersion" : 1,
 "namespace" : "db.collections",
 "indexFilterSet" : false,
 "parsedQuery" : {
 "$text" : {
 "$search" : "\"andreas.estermann@dynatrace.com\"",
 "$language" : "english",
 "$caseSensitive" : false,
 "$diacriticSensitive" : false
 }
 },
 "winningPlan" : {
 "stage" : "LIMIT",
 "limitAmount" : 25,
 "inputStage" : {
 "stage" : "TEXT",
 "indexPrefix" : {
 
},
 "indexName" : "ix_full_text",
 "parsedTextQuery" : {
 "terms" : [
 "andrea",
 "com",
 "dynatrac",
 "estermann"
 ],
 "negatedTerms" : [ ],
 "phrases" : [
 "andreas.estermann@dynatrace.com"
 ],
 "negatedPhrases" : [ ]
 },
 "textIndexVersion" : 3,
 "inputStage" : {
 "stage" : "TEXT_MATCH",
 "inputStage" : {
 "stage" : "FETCH",
 "inputStage" : {
 "stage" : "OR",
 "inputStages" : [
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 }
 ]
 }
 }
 }
 }
 },
 "rejectedPlans" : [ ]
 },
 "executionStats" : {
 "executionSuccess" : true,
 "nReturned" : 25,
 "executionTimeMillis" : 13,
 "totalKeysExamined" : 25,
 "totalDocsExamined" : 25,
 "executionStages" : {
 "stage" : "LIMIT",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 10,
 "works" : 26,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 1,
 "invalidates" : 0,
 "limitAmount" : 25,
 "inputStage" : {
 "stage" : "TEXT",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 10,
 "works" : 25,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "indexPrefix" : {
 
},
 "indexName" : "ix_full_text",
 "parsedTextQuery" : {
 "terms" : [
 "andrea",
 "com",
 "dynatrac",
 "estermann"
 ],
 "negatedTerms" : [ ],
 "phrases" : [
 "andreas.estermann@dynatrace.com"
 ],
 "negatedPhrases" : [ ]
 },
 "textIndexVersion" : 3,
 "inputStage" : {
 "stage" : "TEXT_MATCH",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 10,
 "works" : 25,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "docsRejected" : 0,
 "inputStage" : {
 "stage" : "FETCH",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 10,
 "works" : 25,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "docsExamined" : 25,
 "alreadyHasObj" : 0,
 "inputStage" : {
 "stage" : "OR",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 10,
 "works" : 25,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "dupsTested" : 25,
 "dupsDropped" : 0,
 "recordIdsForgotten" : 0,
 "inputStages" : [
 {
 "stage" : "IXSCAN",
 "nReturned" : 25,
 "executionTimeMillisEstimate" : 10,
 "works" : 25,
 "advanced" : 25,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
},
 "keysExamined" : 25,
 "seeks" : 1,
 "dupsTested" : 25,
 "dupsDropped" : 0,
 "seenInvalidated" : 0
 },
 {
 "stage" : "IXSCAN",
 "nReturned" : 0,
 "executionTimeMillisEstimate" : 0,
 "works" : 0,
 "advanced" : 0,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
},
 "keysExamined" : 0,
 "seeks" : 0,
 "dupsTested" : 0,
 "dupsDropped" : 0,
 "seenInvalidated" : 0
 },
 {
 "stage" : "IXSCAN",
 "nReturned" : 0,
 "executionTimeMillisEstimate" : 0,
 "works" : 0,
 "advanced" : 0,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
},
 "keysExamined" : 0,
 "seeks" : 0,
 "dupsTested" : 0,
 "dupsDropped" : 0,
 "seenInvalidated" : 0
 },
 {
 "stage" : "IXSCAN",
 "nReturned" : 0,
 "executionTimeMillisEstimate" : 0,
 "works" : 0,
 "advanced" : 0,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 0,
 "restoreState" : 0,
 "isEOF" : 0,
 "invalidates" : 0,
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
},
 "keysExamined" : 0,
 "seeks" : 0,
 "dupsTested" : 0,
 "dupsDropped" : 0,
 "seenInvalidated" : 0
 }
 ]
 }
 }
 }
 }
 },
 "allPlansExecution" : [ ]
 },
 "serverInfo" : {
 "host" : "ubuntu-demo-mongodb.qinfra.ch",
 "port" : 27017,
 "version" : "3.6.6",
 "gitVersion" : "6405d65b1d6432e138b44c13085d0c2fe235d6bd"
 },
 "ok" : 1
}

db.collections.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25).sort({"start": -1}).explain("queryPlanner")

{
 "queryPlanner" : {
 "plannerVersion" : 1,
 "namespace" : "db.collections",
 "indexFilterSet" : false,
 "parsedQuery" : {
 "$text" : {
 "$search" : "\"andreas.estermann@dynatrace.com\"",
 "$language" : "english",
 "$caseSensitive" : false,
 "$diacriticSensitive" : false
 }
 },
 "winningPlan" : {
 "stage" : "SORT",
 "sortPattern" : {
 "start" : -1
 },
 "limitAmount" : 25,
 "inputStage" : {
 "stage" : "SORT_KEY_GENERATOR",
 "inputStage" : {
 "stage" : "TEXT",
 "indexPrefix" : {
 
},
 "indexName" : "ix_full_text",
 "parsedTextQuery" : {
 "terms" : [
 "andrea",
 "com",
 "dynatrac",
 "estermann"
 ],
 "negatedTerms" : [ ],
 "phrases" : [
 "andreas.estermann@dynatrace.com"
 ],
 "negatedPhrases" : [ ]
 },
 "textIndexVersion" : 3,
 "inputStage" : {
 "stage" : "TEXT_MATCH",
 "inputStage" : {
 "stage" : "FETCH",
 "inputStage" : {
 "stage" : "OR",
 "inputStages" : [
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 },
 {
 "stage" : "IXSCAN",
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "ix_full_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
}
 }
 ]
 }
 }
 }
 }
 }
 },
 "rejectedPlans" : [ ]
 },
 "serverInfo" : {
 "host" : "ubuntu-demo-mongodb.qinfra.ch",
 "port" : 27017,
 "version" : "3.6.6",
 "gitVersion" : "6405d65b1d6432e138b44c13085d0c2fe235d6bd"
 },
 "ok" : 1
}

db.collections.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25).sort({"start": -1}).explain("allPlansExecution")

-> Crash

Comment by David Storch [ 21/Aug/18 ]

Hi estermann@qumram.com,

Thanks for your patience while we reviewed this case further. The heap profiles you provided suggest that this is a manifestation of SERVER-24375. Internally, the $text query has an OR stage and a multikey IXSCAN stage which are both keeping sets of RecordIds in memory for deduplication. This is necessary in order to prevent the system from returning the same document twice as part of the query result set. I observe, however, that the OR can actually be optimized away in your case: see SERVER-36794. Please watch and vote for SERVER-36794, as this will help to reduce the amount of memory consumed by your problematic queries.

There are still a few aspects of this case which I'd like to dig into further, so I'm interested in a few more pieces of information:

1) Can you confirm that

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).itcount();

does not crash the server, and provide the resulting count? Please note that this is .itcount() as opposed to .count().

2) Please provide the output of

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).count();

3) For each of the queries you describe above:

db.collection.find({"text": "andreas.estermann@dynatrace.com"})
 
-> executes within milliseconds, result correct, expected
 
db.collection.find({"text": "andreas.estermann@dynatrace.com"}).count()
 
-> executes within about 1 minute, result correct, expected
 
db.collection.find({"text": "andreas.estermann@dynatrace.com"}).limit(25)
 
-> executes within milliseconds, result correct, expected
 
db.collection.find({"text": "andreas.estermann@dynatrace.com"}).limit(25).sort({"start": -1})
 
-> executes within milliseconds, result correct, expected
 
db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }})
 
-> executes within milliseconds, result correct, expected
 
db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).count()
 
-> crashes server because of excessive memory consumption, not expected
 
db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25)
 
-> executes within milliseconds, result correct, expected
 
db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).limit(25).sort({"start": -1})
 
-> crashes server because of excessive memory consumption, not expected

can you provide the following:

Thanks, and let me know if you have any questions about the above.

Best,
Dave

Comment by Andreas Estermann [ 13/Aug/18 ]

I have attached the new files.

Comment by Bruce Lucas (Inactive) [ 13/Aug/18 ]

Hi Andreas,

Thanks for uploading the data. Unfortunately it appears that the heap profiler was not enabled during this run, confirmed by the startup options in the mongod log:

2018-08-13T11:47:21.645+0200 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "0.0.0.0", port: 27017 }, processManagement: { pidFilePath: "/var/run/mongodb/mongod.pid" }, storage: { dbPath: "/data/db", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }

Can you please try running again ensuring that --setParameter heapProfilingEnabled=true is specified on the command line or the following is added to your conf file:

setParameter:
    heapProfilingEnabled: true

Thanks,
Bruce

 

Comment by Andreas Estermann [ 13/Aug/18 ]

I have restarted mongod with the parameter you mentioned and ran the query 

db.collection.find({$text: {$search: "\"andreas.estermann@dynatrace.com\"" }}).count()

-> crashes server because of excessive memory consumption, not expected

You can find the requested files attached.

With regard to your other questions:

{
"ns" : "db.collection",
"size" : 7986674970,
"count" : 20000008,
"avgObjSize" : 399,
"storageSize" : 2026397696,
"capped" : false,
"wiredTiger" : {
"metadata" :

{ "formatVersion" : 1 }

,
"creationString" : "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=1),assert=(commit_timestamp=none,read_timestamp=none),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_custom=(prefix=,start_generation=0,suffix=),merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
"type" : "file",
"uri" : "statistics:table:collection-174--5368162830630569099",
"LSM" :

{ "bloom filter false positives" : 0, "bloom filter hits" : 0, "bloom filter misses" : 0, "bloom filter pages evicted from cache" : 0, "bloom filter pages read into cache" : 0, "bloom filters in the LSM tree" : 0, "chunks in the LSM tree" : 0, "highest merge generation in the LSM tree" : 0, "queries that could have benefited from a Bloom filter that did not exist" : 0, "sleep for LSM checkpoint throttle" : 0, "sleep for LSM merge throttle" : 0, "total size of bloom filters" : 0 }

,
"block-manager" :

{ "allocations requiring file extension" : 0, "blocks allocated" : 0, "blocks freed" : 0, "checkpoint size" : 2018783232, "file allocation unit size" : 4096, "file bytes available for reuse" : 7598080, "file magic number" : 120897, "file major version number" : 1, "file size in bytes" : 2026397696, "minor version number" : 0 }

,
"btree" :

{ "btree checkpoint generation" : 4, "column-store fixed-size leaf pages" : 0, "column-store internal pages" : 0, "column-store variable-size RLE encoded values" : 0, "column-store variable-size deleted values" : 0, "column-store variable-size leaf pages" : 0, "fixed-record size" : 0, "maximum internal page key size" : 368, "maximum internal page size" : 4096, "maximum leaf page key size" : 2867, "maximum leaf page size" : 32768, "maximum leaf page value size" : 67108864, "maximum tree depth" : 0, "number of key/value pairs" : 0, "overflow pages" : 0, "pages rewritten by compaction" : 0, "row-store internal pages" : 0, "row-store leaf pages" : 0 }

,
"cache" :

{ "bytes currently in the cache" : 48900, "bytes read into cache" : 23486, "bytes written from cache" : 0, "checkpoint blocked page eviction" : 0, "data source pages selected for eviction unable to be evicted" : 0, "eviction walk passes of a file" : 0, "eviction walk target pages histogram - 0-9" : 0, "eviction walk target pages histogram - 10-31" : 0, "eviction walk target pages histogram - 128 and higher" : 0, "eviction walk target pages histogram - 32-63" : 0, "eviction walk target pages histogram - 64-128" : 0, "eviction walks abandoned" : 0, "eviction walks gave up because they restarted their walk twice" : 0, "eviction walks gave up because they saw too many pages and found no candidates" : 0, "eviction walks gave up because they saw too many pages and found too few candidates" : 0, "eviction walks reached end of tree" : 0, "eviction walks started from root of tree" : 0, "eviction walks started from saved location in tree" : 0, "hazard pointer blocked page eviction" : 0, "in-memory page passed criteria to be split" : 0, "in-memory page splits" : 0, "internal pages evicted" : 0, "internal pages split during eviction" : 0, "leaf pages split during eviction" : 0, "modified pages evicted" : 0, "overflow pages read into cache" : 0, "page split during eviction deepened the tree" : 0, "page written requiring lookaside records" : 0, "pages read into cache" : 4, "pages read into cache after truncate" : 0, "pages read into cache after truncate in prepare state" : 0, "pages read into cache requiring lookaside entries" : 0, "pages requested from the cache" : 3, "pages seen by eviction walk" : 0, "pages written from cache" : 0, "pages written requiring in-memory restoration" : 0, "tracked dirty bytes in the cache" : 0, "unmodified pages evicted" : 0 }

,
"cache_walk" :

{ "Average difference between current eviction generation when the page was last considered" : 0, "Average on-disk page image size seen" : 0, "Average time in cache for pages that have been visited by the eviction server" : 0, "Average time in cache for pages that have not been visited by the eviction server" : 0, "Clean pages currently in cache" : 0, "Current eviction generation" : 0, "Dirty pages currently in cache" : 0, "Entries in the root page" : 0, "Internal pages currently in cache" : 0, "Leaf pages currently in cache" : 0, "Maximum difference between current eviction generation when the page was last considered" : 0, "Maximum page size seen" : 0, "Minimum on-disk page image size seen" : 0, "Number of pages never visited by eviction server" : 0, "On-disk page image sizes smaller than a single allocation unit" : 0, "Pages created in memory and never written" : 0, "Pages currently queued for eviction" : 0, "Pages that could not be queued for eviction" : 0, "Refs skipped during cache traversal" : 0, "Size of the root page" : 0, "Total number of pages currently in cache" : 0 }

,
"compression" :

{ "compressed pages read" : 1, "compressed pages written" : 0, "page written failed to compress" : 0, "page written was too small to compress" : 0, "raw compression call failed, additional data available" : 0, "raw compression call failed, no additional data available" : 0, "raw compression call succeeded" : 0 }

,
"cursor" :

{ "bulk-loaded cursor-insert calls" : 0, "create calls" : 1, "cursor-insert key and value bytes inserted" : 0, "cursor-remove key bytes removed" : 0, "cursor-update value bytes updated" : 0, "cursors cached on close" : 0, "cursors reused from cache" : 0, "insert calls" : 0, "modify calls" : 0, "next calls" : 0, "prev calls" : 1, "remove calls" : 0, "reserve calls" : 0, "reset calls" : 1, "restarted searches" : 0, "search calls" : 0, "search near calls" : 0, "truncate calls" : 0, "update calls" : 0 }

,
"reconciliation" :

{ "dictionary matches" : 0, "fast-path pages deleted" : 0, "internal page key bytes discarded using suffix compression" : 0, "internal page multi-block writes" : 0, "internal-page overflow keys" : 0, "leaf page key bytes discarded using prefix compression" : 0, "leaf page multi-block writes" : 0, "leaf-page overflow keys" : 0, "maximum blocks required for a page" : 0, "overflow values written" : 0, "page checksum matches" : 0, "page reconciliation calls" : 0, "page reconciliation calls for eviction" : 0, "pages deleted" : 0 }

,
"session" :

{ "cached cursor count" : 0, "object compaction" : 0, "open cursor count" : 0 }

,
"transaction" :

{ "update conflicts" : 0 }

},
"nindexes" : 15,
"totalIndexSize" : 2074185728,
"indexSizes" :

{ "_id_" : 221908992, "start_1" : 163266560, "end_1" : 163831808, "labels_1" : 215367680, "errorLabels_1" : 16384, "hasError_1" : 79753216, "duration_1" : 112549888, "customLabels_1" : 16384, "ix_touched_state" : 80027648, "ix_lifecycle_removal_state" : 80015360, "ix_lifecycle_cache_removal_state" : 79953920, "week_1" : 112578560, "allLabels_1" : 224468992, "ix_state_touched" : 90206208, "ix_full_text" : 450224128 }

,
"ok" : 1
}

Comment by Andreas Estermann [ 24/Jul/18 ]

Hi Bruce,

Thanks for the feedback. Please note that I am on a business trip until Friday. I will try to get the data as quickly as possible but my access to the system is limited those days.

Andreas

Comment by Bruce Lucas (Inactive) [ 23/Jul/18 ]

Hi Andreas,

Thanks for uploading the data. Unfortunately it by itself is inconclusive as to the exact cause.

Would you be able to do the following:

  • restart mongod with --setParameter heapProfilingEnabled=true
  • run the failing query
  • repeat the preceding on the other failing queries
  • then upload diagnostic.data and the complete mongo log files covering the test (the latter has crucial information about the allocating stacks recorded by the former).

Also, can you tell us how big the collection is (number of documents and number of bytes), and the rough size of the text-indexed field?

Thanks,
Bruce

Comment by Andreas Estermann [ 17/Jul/18 ]

See attachment. It should contain the relevant parts.

Comment by Ramon Fernandez Marina [ 16/Jul/18 ]

estermann@qumram.com, this may be the same issue reported in SERVER-26534. Will you please upload the contents of the diagnostic.data directory from a node where you've experienced this behavior so we can determine whether it's the same issue or a different one?

Thanks,
Ramón.

Generated at Thu Feb 08 04:41:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.