[SERVER-28282] Regex search on an indexed field, gives unexpected behavior after using collation
Created: 12/Mar/17  Updated: 27/Oct/23  Resolved: 13/Mar/17

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 3.4.2
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Jesper Erik Bendtsen
Assignee: Mark Agarunov
Resolution: Works as Designed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-9933 Explain that regex is unable to take ... Closed
Operating System: ALL
Participants:

 Description   

I expected the following result for each of the scenarios below, but that is not what I get:
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,

// Without collation
db.createCollection("test");
db.test.createIndex({name: 1});
db.test.insert({name: "Anders"});
db.test.insert({name: "Bo"});
db.test.insert({name: "Ole"});
db.test.insert({name: "Peter"});
db.test.insert({name: "Hans"});
db.test.find({"name": {$regex: /^and.*/i}}).explain("executionStats");

Result:
"totalKeysExamined" : 5,
"totalDocsExamined" : 1,

// Collection with collation
db.createCollection("testWithCollation", {collation: {locale: "en", strength: 2}});
db.testWithCollation.createIndex({name: 1});
db.testWithCollation.insert({name: "Anders"});
db.testWithCollation.insert({name: "Bo"});
db.testWithCollation.insert({name: "Ole"});
db.testWithCollation.insert({name: "Peter"});
db.testWithCollation.insert({name: "Hans"});
db.testWithCollation.find({"name": {$regex: /^and.*/i}}).explain("executionStats");

Result:
"totalKeysExamined" : 5,
"totalDocsExamined" : 5,

// Collection with a collation on the index
db.createCollection("testWithCollationIndex");
db.testWithCollationIndex.createIndex({name: 1}, {collation: {locale: 'en', strength: 2}});
db.testWithCollationIndex.insert({name: "Anders"});
db.testWithCollationIndex.insert({name: "Bo"});
db.testWithCollationIndex.insert({name: "Ole"});
db.testWithCollationIndex.insert({name: "Peter"});
db.testWithCollationIndex.insert({name: "Hans"});
db.testWithCollationIndex.find({"name": {$regex: /^and.*/i}}).collation({locale: 'en', strength: 2}).explain("executionStats");

Result:
"totalKeysExamined" : 5,
"totalDocsExamined" : 5,



 Comments   
Comment by Mark Agarunov [ 13/Mar/17 ]

Hello jesperbendtsen83@gmail.com,

Thank you for the report. Looking over the output you've provided, it appears that this is the expected behavior. Unfortunately, the regex implementation is not collation-aware and therefore cannot use indexes with a collation specified. We've opened DOCS-9933 to clarify this in the documentation.

Thanks,
Mark
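
As a minimal sketch of queries that can make tighter use of these indexes, assuming the collections from the description exist: a case-sensitive prefix regex (no i flag) lets the planner derive index bounds on the plain index, and an equality match under the index's collation can stand in for the case-insensitive regex on the collated index. The names prefixQuery, collatedQuery, caseInsensitive, and compareStats are hypothetical and do not appear in the report. The actual counters depend on the data and the server version, so none are claimed here.

```javascript
// Sketch only: every name below is illustrative and not part of the ticket.

// Case-sensitive prefix regex (no /i flag). On the plain index from the first
// scenario, the planner can derive index bounds from the literal prefix "And".
const prefixQuery = { name: /^And/ };

// Equality match evaluated under the index's collation. At strength 2,
// "anders" and "Anders" compare equal, so no regex is needed.
const collatedQuery = { name: "anders" };
const caseInsensitive = { locale: "en", strength: 2 };

// Runs both queries and returns only the two counters the report looks at.
// `database` is expected to be the shell's `db` handle.
function compareStats(database) {
  const pick = (explained) => ({
    totalKeysExamined: explained.executionStats.totalKeysExamined,
    totalDocsExamined: explained.executionStats.totalDocsExamined,
  });
  return {
    prefix: pick(database.test.find(prefixQuery).explain("executionStats")),
    collated: pick(
      database.testWithCollationIndex
        .find(collatedQuery)
        .collation(caseInsensitive)
        .explain("executionStats")
    ),
  };
}

// In the mongo shell: printjson(compareStats(db));
```

The equality form only covers whole-value matches. A case-insensitive prefix search would need a different approach, which this sketch does not attempt.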

Generated at Thu Feb 08 04:17:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.