[SERVER-29883] "|" in query string causes a full index scan, slow query Created: 28/Jun/17  Updated: 29/Jul/17  Resolved: 28/Jun/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: dancer Assignee: Kelsey Schubert
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File explain1, HTML File explain2
Issue Links:
Duplicate
duplicates SERVER-20432 $regex prefix search with escaped "|"... Closed
duplicates SERVER-16622 RegEx query predicates using the | (v... Backlog
Operating System: ALL
Participants:

 Description   

db.wd_data.find({"path":{"$in":[/^\/PersonalWx\/Wx3001\/A李鑫老师助理6965号刘赫楠\/WxTxl\/三119扫单D!nG先生Super7李先生\//]}})

vs

db.wd_data.find({"path":{"$in":[/^\/PersonalWx\/Wx3001\/A李鑫老师助理6965号刘赫楠\/WxTxl\/三119扫单D!nG|先生Super7李先生\//]}})

The only difference is that the first query has no "|" in the regular expression and the second one does.
I will attach the explain() output for both queries, which shows the difference.



 Comments   
Comment by dancer [ 05/Jul/17 ]

Since the issue is not fixed, I suggest that users who want to match a prefix use "$gte + $lt" instead of a regex.
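
A minimal sketch of that workaround, assuming the default binary string comparison (no collation on the collection or index). The helper name prefixRange is hypothetical and not part of MongoDB; it only builds the filter document. The upper bound is the prefix with its last code point incremented, so every string that starts with the prefix falls inside the half-open range.

```javascript
// Hypothetical helper (not a MongoDB API): builds a {$gte, $lt} range that
// covers exactly the strings starting with `prefix`, assuming strings are
// ordered by code point (UTF-8 byte order), i.e. no collation is in effect.
function prefixRange(prefix) {
  // Split into code points so astral characters are handled as one unit.
  const cps = Array.from(prefix, ch => ch.codePointAt(0));
  while (cps.length > 0) {
    let last = cps.pop();
    if (last < 0x10FFFF) {
      last += 1;
      // Skip the surrogate range, which has no UTF-8 encoding.
      if (last >= 0xD800 && last <= 0xDFFF) last = 0xE000;
      cps.push(last);
      return { $gte: prefix, $lt: String.fromCodePoint(...cps) };
    }
    // Last code point was the maximum: drop it and carry into the previous one.
  }
  // Empty prefix (or only maximum code points): no finite upper bound exists.
  return { $gte: prefix };
}
```

In the shell this would be used as db.wd_data.find({ path: prefixRange("/some/prefix/") }), where the prefix is the plain string with no regex escaping, so a "|" in it needs no special treatment.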

Comment by Kelsey Schubert [ 28/Jun/17 ]

Hi dancer,

Thanks for the clarification. After review, we are going to continue to track this work in SERVER-16622. For additional context, please see SERVER-20432, which describes the same issue as this ticket and points to SERVER-16622 as well.

David Storch's comment clarifies this decision:

After reviewing this ticket, the engineering team responsible for the "Querying" component has decided to consider this a duplicate of SERVER-16622. Fixing the backslash-escaped "|" character case makes sense to do as part of the larger ticket, as this will require parsing the regular expression and analyzing the parse tree. From an engineering perspective, we would much rather use proper regex parsing than introduce a hack that special cases the string "|".

Kind regards,
Thomas

Comment by dancer [ 28/Jun/17 ]

I have looked at the previous issue, but I don't think this is a duplicate.
In my case, the regex was used to match a prefix. The "|" was escaped (just a normal character), but MongoDB still scans the whole index.
So it might be a new bug.
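
The difference between an escaped and an unescaped "|" can be checked in plain JavaScript, whose regex syntax treats "|" the same way on this point. This is only an illustration of the regex semantics with made-up sample strings, not of how the server computes index bounds.

```javascript
// Escaped "|" is a literal character, so the pattern is a pure prefix match.
const escaped = /^a\|b/;
// Unescaped "|" is alternation: "starts with a" OR "contains b anywhere".
const unescaped = /^a|b/;

// Collect results for a few sample strings (sample values are arbitrary).
const results = {
  escapedMatchesPrefix: escaped.test("a|bc"),  // string starts with "a|b"
  escapedMatchesOther: escaped.test("xb"),     // does not start with "a|b"
  unescapedMatchesOther: unescaped.test("xb"), // matches via the "b" branch
};
console.log(results);
```

With the escaped form only strings beginning with the literal prefix can match, which is why tight index bounds are possible in principle. With the unescaped form a string that does not start with the prefix can still match, so a prefix range alone would be incorrect.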

Comment by Kelsey Schubert [ 28/Jun/17 ]

Hi dancer,

Thank you for reporting this issue. The work to improve this behavior is tracked in SERVER-16622. Please feel free to vote for SERVER-16622 and watch it for updates.

Kind regards,
Thomas

Generated at Thu Feb 08 04:22:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.