[SERVER-35892] performance regression with lookahead regex Created: 28/Jun/18 Updated: 04/Nov/18 Resolved: 01/Oct/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Roger Gonzalez | Assignee: | Nick Brewer |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Steps To Reproduce: |
|
| Participants: |
| Description |
|
We recently upgraded our production servers from 3.2 to 3.6 and started noticing huge CPU spikes and long transactions (60s+) in code that previously caused no issues. The backing collection has about 2000 documents. The (indexed) description field is a block of at most 512 characters. The normal query has some other filters that narrow the matching set down to about 800 documents, and then this clause is the critical feature (driven by incremental search from web clients):
We add more terms as the user types them. On our old 3.2 servers, this query took about 30ms at most. On 3.6, just two terms take over 30000ms, and three terms take over 70000ms. It's worst when the first term(s) actually match! Rewriting the query to
was slower on 3.2 (from 30ms for the combined regex to 45ms for the $and clauses) but much, much faster on 3.6 (from 30000ms to 70ms). |
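The query text itself did not survive in this ticket, so the following is only a sketch of the two query shapes the description contrasts: one combined regex with a lookahead per search term, and an $and of one plain regex per term. The exact pattern, the "i"/"s" options, the example terms, and all helper names are assumptions for illustration, not the reporter's actual query. Only the description field name and the use of $and come from the ticket. The dictionaries are plain MongoDB filter documents and would be passed to a driver's find call; Python's re.escape is used as a stand-in for escaping user input and is not guaranteed to match PCRE escaping in every case.

```python
import re


def lookahead_pattern(terms):
    """Build one regex with a lookahead per term (assumed shape of the slow query)."""
    return "^" + "".join(f"(?=.*{re.escape(t)})" for t in terms)


def combined_query(terms, field="description"):
    """Single $regex clause; every term must appear somewhere in the field."""
    # "s" lets "." cross newlines so the lookaheads span the whole block of text.
    return {field: {"$regex": lookahead_pattern(terms), "$options": "is"}}


def and_query(terms, field="description"):
    """Equivalent filter as one simple $regex per term, joined with $and."""
    return {
        "$and": [
            {field: {"$regex": re.escape(t), "$options": "i"}} for t in terms
        ]
    }


def matches_combined(text, terms):
    """Local check of the combined pattern, using Python's re instead of the server."""
    return re.search(lookahead_pattern(terms), text, re.I | re.S) is not None


def matches_and(text, terms):
    """Local check of the per-term form: every term must match independently."""
    return all(re.search(re.escape(t), text, re.I) for t in terms)


if __name__ == "__main__":
    terms = ["red", "bike"]  # hypothetical terms typed by a user
    print(combined_query(terms))
    print(and_query(terms))
```

Both forms select the same documents, so switching between them changes only how the server evaluates the match, which is what the timings above compare.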
| Comments |
| Comment by Nick Brewer [ 01/Oct/18 ] |
|
argh, since there hasn't been any activity on this ticket in some time, I'm going to close it. Feel free to comment here if you'd like us to reopen this issue. -Nick |
| Comment by Nick Brewer [ 14/Sep/18 ] |
|
argh, are you still seeing this issue? If so, we'll need the information previously requested to continue investigating. Thanks, |
| Comment by Nick Brewer [ 06/Aug/18 ] |
|
argh, sorry for the delay in getting back to you on this. Looking at the diagnostic data, I'm not seeing the sort of spikes in CPU utilization and wait times that you're describing. What are you using to determine the increase? It would be useful to see the .explain(true) output for both of these queries, and any log messages you're seeing when the queries are run. Thanks, |
| Comment by Roger Gonzalez [ 29/Jun/18 ] |
|
Done. This was recorded off the primary, let me know if you need anything off a secondary. |
| Comment by Nick Brewer [ 29/Jun/18 ] |
|
Hi argh, would you please archive (tar or zip) the $dbpath/diagnostic.data directory and attach it to this ticket? Thank you, |
| Comment by Roger Gonzalez [ 28/Jun/18 ] |
|
gah, apologies for the mangled formatting! |