[SERVER-19936] Performance pass on unicode-aware text processing logic (text index v3) Created: 13/Aug/15 Updated: 17/Oct/17 Resolved: 15/Mar/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance, Querying, Text Search |
| Affects Version/s: | None |
| Fix Version/s: | 3.2.5, 3.3.3 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | David Daly | Assignee: | Mathias Stearn |
| Resolution: | Done | Votes: | 1 |
| Labels: | code-and-test |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Backport Completed: | |
| Sprint: | Integration F (02/01/16), Integration 10 (02/22/16), Integration 11 (03/14/16), Integration 12 (04/04/16) |
| Participants: | |
| Description |
|
The introduction of text index version 3 caused a performance regression that is visible in the Mongo-perf Queries.Text tests. We should make a pass through the code to try to improve performance. Initial results showing the regression. |
| Comments |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'redbeard0531@gmail.com'}Message: (cherry picked from commit d89cf868a3987caa0ceeac576173f3fdd90f00ca) |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Fixes compilation errors introduced by (cherry picked from commit 4b6952e97e74d8c7bd16ebfc5fe6e412ccf0f48c) |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: (cherry picked from commit 657288e29880c0c8518452880715d57effdbeb89) |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: (cherry picked from commit 4b10e50494175df2b1ed8fc4f8e7f8c6ca6f06d5) |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: (cherry picked from commit 72aab77138463d96494389bc538c13395c34a2d3) |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: (cherry picked from commit 35f4f2f5a58e5dc90b583e8bc6089eaa2d83e065) |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Now handles up to 16 bytes of ASCII at a time if SSE2 is enabled. (cherry picked from commit 67eee08bb606537df7417670d423c0527dd6221f) |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Major changes:
(cherry picked from commit 6c3157f126bb44ab275325e85de7abee5ce9ad6d) |
| Comment by Githook User [ 15/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Needed for boost::boyer_moore_searcher. (cherry picked from commit 4a35c7184e188354793f16d27e2330b3b5ce7f8f) |
| Comment by Githook User [ 14/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Fixes compilation errors introduced by |
| Comment by Githook User [ 14/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Fixes compilation errors introduced by |
| Comment by Githook User [ 11/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'redbeard0531@gmail.com'}Message: |
| Comment by Githook User [ 11/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: |
| Comment by Githook User [ 11/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: |
| Comment by Githook User [ 11/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: |
| Comment by Githook User [ 11/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: |
| Comment by Githook User [ 11/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Now handles up to 16 bytes of ASCII at a time if SSE2 is enabled. |
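
As an illustration of what "16 bytes of ASCII at a time" can look like, here is a minimal sketch and not the code from the commit. It assumes the goal is to find how long the leading pure-ASCII run of a buffer is: with SSE2, 16 bytes are loaded at once and `_mm_movemask_epi8` gathers the high bit of every byte, so a zero mask means the whole chunk is ASCII. The function name `asciiPrefixLength` and the macro `SKETCH_HAVE_SSE2` are hypothetical names made up for this example.

```cpp
#include <cstddef>

// Detect SSE2 at compile time (GCC/Clang define __SSE2__; MSVC uses _M_X64 / _M_IX86_FP).
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

// Hypothetical helper: returns the number of leading bytes in [data, data + len)
// that are plain ASCII (high bit clear).
inline std::size_t asciiPrefixLength(const char* data, std::size_t len) {
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE2
    // Fast path: examine 16 bytes per iteration.
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
        // One mask bit per byte, taken from that byte's high bit.
        if (_mm_movemask_epi8(chunk) != 0) {
            break;  // at least one non-ASCII byte here; locate it bytewise below
        }
    }
#endif
    // Scalar path: handles the tail, the chunk that failed the fast check,
    // and builds where SSE2 is not available.
    while (i < len && (static_cast<unsigned char>(data[i]) & 0x80) == 0) {
        ++i;
    }
    return i;
}
```

A caller could process the first `asciiPrefixLength(p, n)` bytes with cheap ASCII-only logic and fall back to full UTF-8 decoding from that offset onward.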
| Comment by Githook User [ 11/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Major changes:
|
| Comment by Githook User [ 11/Mar/16 ] |
|
Author: {u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}Message: Needed for boost::boyer_moore_searcher. |
| Comment by J Rassi [ 03/Dec/15 ] |
|
Further investigation shows that the magnitude of the performance difference of text index version 2 versus 3 is as large as ~20x for certain workloads. With 3.2.0-rc6 configured with the WiredTiger storage engine, the phrase search operation count({$text: {$search: "\"gigantic hound\""}}) against the data set attached to I've attached to this ticket a dot graph generated with Linux perf of the phrase query workload with text index version 3. Interesting observations:
Moving ticket back into Needs Triage. |
| Comment by David Daly [ 01/Oct/15 ] |
|
Note: performance targets need to be updated after this goes in. See |
| Comment by David Daly [ 17/Aug/15 ] |
|
Re-opening as Here's the perf data for the commit after |
| Comment by Daniel Pasette (Inactive) [ 14/Aug/15 ] |
|
Duplicate of |
| Comment by J Rassi [ 13/Aug/15 ] |
|
Thanks. Triaged to Planning Bucket A and unassigned. |
| Comment by David Daly [ 13/Aug/15 ] |
|
rassi@10gen.com Moved and assigned to you for now. I put it in Needs Triage for now. |
| Comment by Adam Chelminski (Inactive) [ 13/Aug/15 ] |
|
I'm currently adding some simple optimizations that should improve the performance slightly, but as Rassi said, this regression was expected. |
| Comment by J Rassi [ 13/Aug/15 ] |
|
I briefly discussed with Dan. Yes, this slowdown is expected and of a reasonable magnitude. We should do a performance pass on the new code to make up for some of the regression, but we do not think this should be scheduled for 3.1.x and are happy to ship with this feature performing as-is. David, could you file a SERVER ticket (or convert this to a SERVER ticket) to track this work, and we'll consider it for 3.3.x or beyond? |
| Comment by David Daly [ 13/Aug/15 ] |
|
mpobrien Seems likely. I have that commit and its neighbors scheduled to run. adam.chelminski rassi@10gen.com mark.benvenuto |
| Comment by Michael O'Brien [ 13/Aug/15 ] |
|
Probably this commit, I'm guessing: https://evergreen.mongodb.com/task/performance_linux_wt_standalone_query_92eac3b57d8beaf063fced8839cd870f97826bb7_15_08_11_20_58_14 |