[SERVER-31089] createIndex fails with "Index type 'text' does not support collation" but applyops command succeeds for creating the same index Created: 14/Sep/17  Updated: 27/Oct/23  Resolved: 19/Sep/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Ankur Srivastava (Inactive) Assignee: Tess Avitabile (Inactive)
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-31161 Index created through applyOps comman... Closed
is related to TOOLS-1801 text index collation creation may req... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Create a collection:

db.createCollection("customers", {collation: {locale: "en_US", caseLevel: false, caseFirst: "off", strength: 2, numericOrdering: false, alternate: "non-ignorable", maxVariable: "punct", normalization: false, backwards: false}})

Create the index with the collation {locale: "simple"}:

db.customers.createIndex({ "_fts" : "text" , "_ftsx" : 1}, {name: "CustomerText", collation: { locale: "simple"}, "weights" : { "Locations.Address1" : 1 , "Locations.City" : 1 , "Locations.Contacts.FirstName" : 1 , "Locations.Contacts.LastName" : 1 , "Locations.Name" : 1 , "Name" : 1}})

Here are the getIndexes result and the oplog entry. Notice that the collation specification is not present for the CustomerText index in either.

db.customers.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "bkp.customers",
		"collation" : {
			"locale" : "en_US",
			"caseLevel" : false,
			"caseFirst" : "off",
			"strength" : 2,
			"numericOrdering" : false,
			"alternate" : "non-ignorable",
			"maxVariable" : "punct",
			"normalization" : false,
			"backwards" : false,
			"version" : "57.1"
		}
	},
	{
		"v" : 2,
		"key" : {
			"_fts" : "text",
			"_ftsx" : 1
		},
		"name" : "CustomerText",
		"ns" : "bkp.customers",
		"weights" : {
			"Locations.Address1" : 1,
			"Locations.City" : 1,
			"Locations.Contacts.FirstName" : 1,
			"Locations.Contacts.LastName" : 1,
			"Locations.Name" : 1,
			"Name" : 1
		},
		"default_language" : "english",
		"language_override" : "language",
		"textIndexVersion" : 3
	}
]

{ "ts" : Timestamp(1505396756, 1), "t" : NumberLong(1), "h" : NumberLong("3117262685112057819"), "v" : 2, "op" : "i", "ns" : "bkp.system.indexes", "o" : { "v" : 2, "key" : { "_fts" : "text", "_ftsx" : 1 }, "name" : "CustomerText", "ns" : "bkp.customers", "weights" : { "Locations.Address1" : 1, "Locations.City" : 1, "Locations.Contacts.FirstName" : 1, "Locations.Contacts.LastName" : 1, "Locations.Name" : 1, "Name" : 1 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 3 } }

Try to create the index without a collation:

db.customers.dropIndex("CustomerText")
db.customers.createIndex({ "_fts" : "text" , "_ftsx" : 1}, {name: "CustomerText", "weights" : { "Locations.Address1" : 1 , "Locations.City" : 1 , "Locations.Contacts.FirstName" : 1 , "Locations.Contacts.LastName" : 1 , "Locations.Name" : 1 , "Name" : 1}})
{
	"ok" : 0,
	"errmsg" : "Index type 'text' does not support collation: { locale: \"en_US\", caseLevel: false, caseFirst: \"off\", strength: 2, numericOrdering: false, alternate: \"non-ignorable\", maxVariable: \"punct\", normalization: false, backwards: false, version: \"57.1\" }",
	"code" : 67,
	"codeName" : "CannotCreateIndex"
}

Try to create the index without a collation using applyOps:

 db.runCommand({applyOps: [{ "ts" : Timestamp(1505396756, 1), "t" : NumberLong(1), "h" : NumberLong("3117262685112057819"), "v" : 2, "op" : "i", "ns" : "bkp.system.indexes", "o" : { "v" : 2, "key" : { "_fts" : "text", "_ftsx" : 1 }, "name" : "CustomerText", "ns" : "bkp.customers", "weights" : { "Locations.Address1" : 1, "Locations.City" : 1, "Locations.Contacts.FirstName" : 1, "Locations.Contacts.LastName" : 1, "Locations.Name" : 1, "Name" : 1 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 3 } }]})
{ "applied" : 1, "results" : [ true ], "ok" : 1 }

 Description   

If a collection is created with a collation specification, creating a text index fails unless {collation: {locale: "simple"}} is provided during index creation. Creating the same index using the applyOps command succeeds even if {collation: {locale: "simple"}} is not provided in the applyOps command.
This behaviour is breaking backup initial sync, because during initial sync it creates indexes using the createIndex command. Backup does not know about the collation, because the output of 'db.collection.getIndexes' and the oplog entry for index creation do not contain collation info.



 Comments   
Comment by Shane Harvey [ 07/Nov/17 ]

We opted for #2 since it was the simplest and did not suffer from having multiple representations of the simple collation.

We still have multiple representations of the simple collation; {collation:{ locale: "simple"}} for createIndexes and non-existent for getIndexes. We've just punted the multiple representation problem onto the user. I think #4 would have been (and still would be) a better solution.

Comment by David Storch [ 29/Sep/17 ]

No, that is intentional. Indexes created on versions prior to 3.4, of course, know nothing about collation and therefore have no collation-related metadata in the catalog. While developing the collation project, this fact gave us a few choices:

  1. Represent the simple collation using two formats in the catalog and report these two formats in getIndexes().
  2. Always omit the collation in the case of the simple collation.
  3. Implement an upgrade process that backfills the index catalog for old indexes that have the simple collation.
  4. Don't store things differently in the catalog, but always modify the catalog entry to add {collation: "simple"} in the getIndexes() code.

We opted for #2 since it was the simplest and did not suffer from having multiple representations of the simple collation.

Comment by Shane Harvey [ 29/Sep/17 ]

Isn't it a bug that the simple collation is not reported by getIndexes?

Comment by Tess Avitabile (Inactive) [ 19/Sep/17 ]

When there is no collation in the getIndexes output, it means that the index has the simple collation. Then if the collection has a non-simple default collation, backup must specify {collation: {locale: "simple"}} in the createIndexes command. Otherwise, the index will inherit the collection default collation, which will be incorrect.
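
A minimal sketch of the rule above, written in plain JavaScript so it runs outside the mongo shell. The helper names withExplicitCollation and buildCreateIndexesCommand are hypothetical and not part of any MongoDB tool; the sketch only assumes that index specs have the shape shown in the getIndexes output in this ticket.

```javascript
// Hypothetical helper: an index spec from getIndexes() (or from an oplog
// entry) that has no "collation" field uses the simple collation, so make
// that explicit before passing the spec to createIndexes.
function withExplicitCollation(indexSpec) {
  if (indexSpec.collation !== undefined) {
    // A collation is already recorded; copy the spec unchanged.
    return Object.assign({}, indexSpec);
  }
  return Object.assign({}, indexSpec, { collation: { locale: "simple" } });
}

// Hypothetical helper: builds a createIndexes command document in which
// every index spec carries an explicit collation.
function buildCreateIndexesCommand(collectionName, indexSpecs) {
  return {
    createIndexes: collectionName,
    indexes: indexSpecs.map(withExplicitCollation),
  };
}

// Usage with the text index from this ticket (weights shortened).
const textIndex = {
  v: 2,
  key: { _fts: "text", _ftsx: 1 },
  name: "CustomerText",
  weights: { Name: 1 },
};
const cmd = buildCreateIndexesCommand("customers", [textIndex]);
console.log(JSON.stringify(cmd.indexes[0].collation)); // {"locale":"simple"}
```

In a shell session the resulting document could be passed to db.runCommand. Specs that already report a collation, such as the _id index above, are left as they are.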

Comment by Ankur Srivastava (Inactive) [ 19/Sep/17 ]

I am using 3.4.7. For me, the applyOps command succeeds. Note that I was using bkp as the database.

MongoDB shell version v3.4.7
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.7
Server has startup warnings: 
2017-09-19T13:34:54.445-0400 I CONTROL  [initandlisten] 
2017-09-19T13:34:54.445-0400 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-09-19T13:34:54.445-0400 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-09-19T13:34:54.446-0400 I CONTROL  [initandlisten] 
> use bkp
switched to db bkp
> db.createCollection("customers", {collation: {locale: "en_US", caseLevel: false, caseFirst: "off", strength: 2, numericOrdering: false, alternate: "non-ignorable", maxVariable: "punct", normalization: false, backwards: false}})
{ "ok" : 1 }
> 
>  db.runCommand({applyOps: [{ "ts" : Timestamp(1505396756, 1), "t" : NumberLong(1), "h" : NumberLong("3117262685112057819"), "v" : 2, "op" : "i", "ns" : "bkp.system.indexes", "o" : { "v" : 2, "key" : { "_fts" : "text", "_ftsx" : 1 }, "name" : "CustomerText", "ns" : "bkp.customers", "weights" : { "Locations.Address1" : 1, "Locations.City" : 1, "Locations.Contacts.FirstName" : 1, "Locations.Contacts.LastName" : 1, "Locations.Name" : 1, "Name" : 1 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 3 } }]})
{ "applied" : 1, "results" : [ true ], "ok" : 1 }
> db.customers.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "bkp.customers",
		"collation" : {
			"locale" : "en_US",
			"caseLevel" : false,
			"caseFirst" : "off",
			"strength" : 2,
			"numericOrdering" : false,
			"alternate" : "non-ignorable",
			"maxVariable" : "punct",
			"normalization" : false,
			"backwards" : false,
			"version" : "57.1"
		}
	},
	{
		"v" : 2,
		"key" : {
			"_fts" : "text",
			"_ftsx" : 1
		},
		"name" : "CustomerText",
		"ns" : "bkp.customers",
		"weights" : {
			"Locations.Address1" : 1,
			"Locations.City" : 1,
			"Locations.Contacts.FirstName" : 1,
			"Locations.Contacts.LastName" : 1,
			"Locations.Name" : 1,
			"Name" : 1
		},
		"default_language" : "english",
		"language_override" : "language",
		"textIndexVersion" : 3
	}
]

The problem is that the backup does not have collation information to pass to the createIndex command. If you look at the getIndexes output above, there is no way of knowing what collation was used to create the index. So backup tries to create the index without specifying the collation {locale: "simple"}, and it fails.

Comment by Tess Avitabile (Inactive) [ 19/Sep/17 ]

ankur.srivastava, note that if there is no collation on the index in the getIndexes response or the oplog entry for index creation, it is always safe to specify {collation: {locale: "simple"}} in the createIndexes command. Though it is only necessary if the collection has a non-simple collation.

Comment by Tess Avitabile (Inactive) [ 19/Sep/17 ]

Thanks, anonymous.user, I have confirmed the behavior on the 3.4 branch. I'm going to close this as Works as Designed and open a bug for the behavior on master.

Comment by Kelsey Schubert [ 19/Sep/17 ]

tess.avitabile, this was reported against 3.4.7.

Comment by Tess Avitabile (Inactive) [ 19/Sep/17 ]

What version of the server are you using? When I tried the repro, the applyOps command failed.

However, I believe it is a bug that the applyOps command failed. When creating an index, the applyOps command should not add the collection default collation to the index spec. It should behave identically to replicating an oplog entry that creates an index, which does not add the collection default collation to the index spec.

The createIndexes command should add the collection default collation to the index spec. In the case of a collection with a non-simple default collation, if backup uses the createIndexes command to create an index with the simple collation, it must explicitly specify the collation {locale: "simple"}.
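
The intended behavior described in this comment can be illustrated with a simplified model. This is not server code: every function name below is hypothetical, and the model only encodes what the comment and the error message in the repro state (createIndexes inherits the collection default, an oplog entry is taken as written, and a text index rejects a non-simple collation).

```javascript
// Simplified, hypothetical model of the behavior described above.
const SIMPLE = { locale: "simple" };

// createIndexes path: a spec without a collation inherits the collection
// default, or the simple collation if the collection has no default.
function collationViaCreateIndexes(indexSpec, collectionDefault) {
  return indexSpec.collation || collectionDefault || SIMPLE;
}

// Oplog replication (and the intended applyOps behavior): the spec is taken
// as written, so a missing collation means the simple collation.
function collationViaOplog(indexSpec) {
  return indexSpec.collation || SIMPLE;
}

// Per the error in the repro, a text index cannot be built with a
// non-simple collation.
function canBuildTextIndex(effectiveCollation) {
  return effectiveCollation.locale === "simple";
}

const collectionDefault = { locale: "en_US", strength: 2 };
const textSpec = { key: { _fts: "text", _ftsx: 1 }, name: "CustomerText" };
const explicitSpec = Object.assign({}, textSpec, { collation: SIMPLE });

// createIndexes without a collation inherits en_US and is rejected.
console.log(canBuildTextIndex(collationViaCreateIndexes(textSpec, collectionDefault))); // false
// The same spec taken from an oplog entry keeps the simple collation.
console.log(canBuildTextIndex(collationViaOplog(textSpec))); // true
// createIndexes with an explicit simple collation succeeds.
console.log(canBuildTextIndex(collationViaCreateIndexes(explicitSpec, collectionDefault))); // true
```

The model shows why the same index spec fails through one path and succeeds through the other: the only difference is whether the collection default is merged into the spec.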
