Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-49046

Provide the ability to perform a collation aware search only on the fields on which the index is defined

    XMLWordPrintableJSON

Details

    • Icon: Improvement Improvement
    • Resolution: Duplicate
    • Icon: Major - P3 Major - P3
    • None
    • None
    • None
    • None
    • Query 2020-07-27

    Description

      If we have an index with a certain collation defined, then a search performed using that index will use the collation not only for that field but for other fields as well.

      Consider a test collection with its corresponding indexes like this:

      > db.coll_insens_search_test.find()
      { "_id" : 1, "A" : "goat", "B" : "core" }
      { "_id" : 2, "A" : "goat", "B" : "CORE" }
      { "_id" : 3, "A" : "GoAt", "B" : "core" }
      { "_id" : 4, "A" : "GOAT", "B" : "CORE" }
      { "_id" : 5, "A" : "GOat", "B" : "core" }
      

      > db.coll_insens_search_test.getIndexes()
      [
      	{
      		"v" : 2,
      		"key" : {
      			"_id" : 1
      		},
      		"name" : "_id_",
      		"ns" : "test.coll_insens_search_test"
      	},
      	{
      		"v" : 2,
      		"key" : {
      			"B" : 1
      		},
      		"name" : "HD_CaseInsensitiveIndex",
      		"ns" : "test.coll_insens_search_test",
      		"collation" : {
      			"locale" : "en_US",
      			"caseLevel" : false,
      			"caseFirst" : "off",
      			"strength" : 2,
      			"numericOrdering" : false,
      			"alternate" : "non-ignorable",
      			"maxVariable" : "punct",
      			"normalization" : false,
      			"backwards" : false,
      			"version" : "57.1"
      		}
      	}
      ]
      

      The case-insensitive index above is only on field B and not on field A. However, the following query yields all 5 documents:

      > db.coll_insens_search_test.find({ A: "GOat", B:"core" }).collation({ locale: "en_US", strength: 2 })
      { "_id" : 1, "A" : "goat", "B" : "core" }
      { "_id" : 2, "A" : "goat", "B" : "CORE" }
      { "_id" : 3, "A" : "GoAt", "B" : "core" }
      { "_id" : 4, "A" : "GOAT", "B" : "CORE" }
      { "_id" : 5, "A" : "GOat", "B" : "core" }
      

      In other words, we see that the case-insensitive search is being performed not only for B on which the index is defined, but also for A which is not covered in the index (it is debatable whether this is the correct behavior). Otherwise it would've only returned the last document with "_id" : 5.

       
      The winningPlan for the above query looks like this:

      "winningPlan" : {
      			"stage" : "FETCH",
      			"filter" : {
      				"A" : {
      					"$eq" : "GOat"
      				}
      			},
      			"inputStage" : {
      				"stage" : "IXSCAN",
      				"keyPattern" : {
      					"B" : 1
      				},
      				"indexName" : "HD_CaseInsensitiveIndex",
      				"collation" : {
      					"locale" : "en_US",
      					"caseLevel" : false,
      					"caseFirst" : "off",
      					"strength" : 2,
      					"numericOrdering" : false,
      					"alternate" : "non-ignorable",
      					"maxVariable" : "punct",
      					"normalization" : false,
      					"backwards" : false,
      					"version" : "57.1"
      				},
      				"isMultiKey" : false,
      				"multiKeyPaths" : {
      					"B" : [ ]
      				},
      				"isUnique" : false,
      				"isSparse" : false,
      				"isPartial" : false,
      				"indexVersion" : 2,
      				"direction" : "forward",
      				"indexBounds" : {
      					"B" : [
      						"[\"-EK1\u0001\b\", \"-EK1\u0001\b\"]"
      					]
      				}
      			}
      		}
      

       

      Without using the collation, it yields only the 5th document as expected, as shown here:

      > db.coll_insens_search_test.find({ A: "GOat", B:"core" })
      { "_id" : 5, "A" : "GOat", "B" : "core" }
      

       

      This is a feature request to provide the ability to perform a collation aware search only on the fields on which the index is defined, and not expand it to other fields which are not covered in the index.

      Attachments

        Activity

          People

            david.storch@mongodb.com David Storch
            harshad.dhavale@mongodb.com Harshad Dhavale
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: