[SERVER-79244] Text search with relevance sort consumes all memory and crashes the machine Created: 24/Jul/23  Updated: 27/Jul/23

Status: Waiting For User Input
Project: Core Server
Component/s: Text Search
Affects Version/s: 5.0.18
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Josef Sábl Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-36087 Executing $text statements in conjunc... Closed
Related
related to SERVER-26534 Text search uses excessive memory Backlog
Operating System: ALL
Participants:

 Description   

I have a collection of around 3m rather small documents. Total size of collection is <1GB.

If I do a text search for a word that is present in many (~1.5m) documents (i.e. it finds almost all the documents) it takes some time but I get the results.

If I also add `{ $sort: { $meta: 'textScore' } }` to sort by relevance, it starts to consume memory like crazy, effectively crashing the machine almost instantly.
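
For reference, the two query shapes described above might look like the following sketch. It is only an illustration under assumptions: the collection name `docs`, the search term `'common'`, and the helper `buildTextSearchPipeline` are hypothetical and do not come from the report. The sort stage is written in the documented `{ score: { $meta: 'textScore' } }` form rather than the abbreviated form quoted above.

```javascript
// Build an aggregation pipeline for a $text search.
// The helper name and its parameters are illustrative, not taken from the report.
function buildTextSearchPipeline(term, sortByScore) {
  const pipeline = [{ $match: { $text: { $search: term } } }];
  if (sortByScore) {
    // Sorting on the textScore metadata is the step reported to exhaust memory.
    pipeline.push({ $sort: { score: { $meta: 'textScore' } } });
  }
  return pipeline;
}

const withoutSort = buildTextSearchPipeline('common', false);
const withSort = buildTextSearchPipeline('common', true);

// In mongosh, against a hypothetical collection `docs` that has a text index:
//   db.docs.aggregate(withoutSort)  // slow, but returns results per the report
//   db.docs.aggregate(withSort)     // the variant reported to consume all memory
```

Running both variants against the same search term would isolate the relevance sort as the only difference between the working and the failing query.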

What may be wrong? Where to look?

An interesting point: the whole database I am talking about would fit into our memory three times over.

We were monitoring the memory consumption when the problem occurred. The system had 58% free memory at one moment, dropped to 10% free memory within one second, and then crashed.

Our version is 5.0.18



 Comments   
Comment by Yuan Fang [ 27/Jul/23 ]

Hi josef.sabl@gmail.com,

Thank you for your report. I understand that the text search query with sort failed due to running out of memory. We suspect this issue may be related to SERVER-26534, but we need more data to confirm.

I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time.

Would you be able to do the following:

  • restart mongod with --setParameter heapProfilingEnabled=true
  • run the failing query
  • repeat the preceding on the other failing queries
  • then upload diagnostic.data and the complete mongo log files covering the test (the latter has crucial information about the allocating stacks recorded by the former). 

Also, can you tell us the rough size of the text-indexed field?

Regards,
Yuan
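
One possible way to answer the question above about the rough size of the text-indexed field is an aggregation that summarizes string lengths in bytes. This is a sketch under assumptions: the field name `body`, the collection name `docs`, and the helper `buildFieldSizePipeline` are hypothetical, and the field is assumed to hold plain strings (an array-valued or non-string field would make `$strLenBytes` fail).

```javascript
// Build a pipeline that reports byte-length statistics for one string field.
// `fieldName` is a placeholder for whichever field carries the text index.
function buildFieldSizePipeline(fieldName) {
  // Treat missing or null values as empty strings so $strLenBytes does not error.
  const lengthExpr = { $strLenBytes: { $ifNull: ['$' + fieldName, ''] } };
  return [
    {
      $group: {
        _id: null,
        docs: { $sum: 1 },               // number of documents scanned
        totalBytes: { $sum: lengthExpr }, // combined size of the field
        avgBytes: { $avg: lengthExpr },   // average size per document
        maxBytes: { $max: lengthExpr },   // largest single value
      },
    },
  ];
}

const sizePipeline = buildFieldSizePipeline('body');

// In mongosh, against the hypothetical collection `docs`:
//   db.docs.aggregate(sizePipeline)
```

The pipeline scans the whole collection, so on a collection of this size it may take a while to complete.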
 
