[SERVER-26752] $indexOfBytes (and OfCP) should use Boyer-Moore algorithm Created: 24/Oct/16  Updated: 21/Sep/23

Status: Backlog
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Asya Kamsky Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: neweng, pull-request
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Execution
Participants:

 Description   

$indexOfCP and $indexOfBytes can share code (they currently don't) and can use Boyer-Moore to improve performance.



 Comments   
Comment by Mathias Stearn [ 24/Oct/16 ]

We are already using boost::boyer_moore_search to do phrase matching in full-text search. It should be easy to use it here too. This implementation should check if the needle is constant and use the object interface to avoid building the lookup-table for every execution (it may only be worth using Boyer-Moore in this common case and not for the dynamic-needle case, I'm not sure). It should also special-case when the needle is 1 byte and use the CharT overload of std::string::find since that will be much faster.

Generated at Thu Feb 08 04:13:06 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.