Details
-
Improvement
-
Resolution: Duplicate
-
Major - P3
-
None
-
None
-
None
-
None
-
Query 2020-07-27
-
(copied to CRM)
Description
If we have an index with a certain collation defined, then a search performed using that index will use the collation not only for that field but for other fields as well.
Consider a test collection with its corresponding indexes like this:
> db.coll_insens_search_test.find()
|
{ "_id" : 1, "A" : "goat", "B" : "core" }
|
{ "_id" : 2, "A" : "goat", "B" : "CORE" }
|
{ "_id" : 3, "A" : "GoAt", "B" : "core" }
|
{ "_id" : 4, "A" : "GOAT", "B" : "CORE" }
|
{ "_id" : 5, "A" : "GOat", "B" : "core" }
|
> db.coll_insens_search_test.getIndexes()
|
[
|
{
|
"v" : 2,
|
"key" : {
|
"_id" : 1
|
},
|
"name" : "_id_",
|
"ns" : "test.coll_insens_search_test"
|
},
|
{
|
"v" : 2,
|
"key" : {
|
"B" : 1
|
},
|
"name" : "HD_CaseInsensitiveIndex",
|
"ns" : "test.coll_insens_search_test",
|
"collation" : {
|
"locale" : "en_US",
|
"caseLevel" : false,
|
"caseFirst" : "off",
|
"strength" : 2,
|
"numericOrdering" : false,
|
"alternate" : "non-ignorable",
|
"maxVariable" : "punct",
|
"normalization" : false,
|
"backwards" : false,
|
"version" : "57.1"
|
}
|
}
|
]
|
The case-insensitive index above is only on field B and not on field A. However, the following query yields all 5 documents:
> db.coll_insens_search_test.find({ A: "GOat", B:"core" }).collation({ locale: "en_US", strength: 2 })
|
{ "_id" : 1, "A" : "goat", "B" : "core" }
|
{ "_id" : 2, "A" : "goat", "B" : "CORE" }
|
{ "_id" : 3, "A" : "GoAt", "B" : "core" }
|
{ "_id" : 4, "A" : "GOAT", "B" : "CORE" }
|
{ "_id" : 5, "A" : "GOat", "B" : "core" }
|
In other words, we see that the case-insensitive search is being performed not only for B on which the index is defined, but also for A which is not covered in the index (it is debatable whether this is the correct behavior). Otherwise it would've only returned the last document with "_id" : 5.
The winningPlan for the above query looks like this:
"winningPlan" : {
|
"stage" : "FETCH",
|
"filter" : {
|
"A" : {
|
"$eq" : "GOat"
|
}
|
},
|
"inputStage" : {
|
"stage" : "IXSCAN",
|
"keyPattern" : {
|
"B" : 1
|
},
|
"indexName" : "HD_CaseInsensitiveIndex",
|
"collation" : {
|
"locale" : "en_US",
|
"caseLevel" : false,
|
"caseFirst" : "off",
|
"strength" : 2,
|
"numericOrdering" : false,
|
"alternate" : "non-ignorable",
|
"maxVariable" : "punct",
|
"normalization" : false,
|
"backwards" : false,
|
"version" : "57.1"
|
},
|
"isMultiKey" : false,
|
"multiKeyPaths" : {
|
"B" : [ ]
|
},
|
"isUnique" : false,
|
"isSparse" : false,
|
"isPartial" : false,
|
"indexVersion" : 2,
|
"direction" : "forward",
|
"indexBounds" : {
|
"B" : [
|
"[\"-EK1\u0001\b\", \"-EK1\u0001\b\"]"
|
]
|
}
|
}
|
}
|
Without using the collation, it yields only the 5th document as expected, as shown here:
> db.coll_insens_search_test.find({ A: "GOat", B:"core" })
|
{ "_id" : 5, "A" : "GOat", "B" : "core" }
|
This is a feature request to provide the ability to perform a collation aware search only on the fields on which the index is defined, and not expand it to other fields which are not covered in the index.
Attachments
Issue Links
- duplicates
-
SERVER-25954 Support more granular collation specification
-
- Backlog
-