[CSHARP-4885] Compound indexes that have the same field twice should be allowed for text indexes Created: 16/Dec/23  Updated: 29/Jan/24  Resolved: 29/Jan/24

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Unknown
Reporter: Sebastian Stehle Assignee: Adelin Mbida Owona
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

I have a collection with a text index and an additional field called schemaId. For some of my queries this field is required, and for other queries it is preferred to return results from the same schemaId. Therefore I created my index like this:

It is a large collection, so I use short field names, which saved a few gigabytes.

await collection.Indexes.CreateOneAsync(
    new CreateIndexModel<MongoTextIndexEntity<List<MongoTextIndexEntityText>>>(
        Index // just a shortcut for Builder<>.Index
            .Ascending(x => x.SchemaId)
            .Text("t.t")
            .Text(x => x.SchemaId),
        new CreateIndexOptions
        {
            Weights = new BsonDocument
            {
                ["t.t"] = 2,
                ["s"] = 1
            }
        }),
    cancellationToken: ct); 

At least, that is what I would like to do, but I get the following error:

> The index keys definition contains multiple values for the field 's'. (SchemaId)

But this is a restriction of the driver, not of MongoDB.
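
The driver's actual implementation is not shown in this ticket. The sketch below, in plain JavaScript with made-up names (`combineIndexKeys` and the `field`/`kind` shape are assumptions for illustration), only shows the kind of client-side duplicate-field check that would produce such a message before anything is sent to the server.

```javascript
// Hypothetical client-side check, not the driver's actual code: combine
// several index key parts and reject a field that appears more than once.
function combineIndexKeys(parts) {
  const seen = new Set();
  const combined = [];
  for (const { field, kind } of parts) {
    if (seen.has(field)) {
      throw new Error(
        `The index keys definition contains multiple values for the field '${field}'.`
      );
    }
    seen.add(field);
    combined.push([field, kind]);
  }
  return combined;
}

// The same shape as the index above: ascending on "s", text on "t.t",
// and text on "s" again.
let message = null;
try {
  combineIndexKeys([
    { field: "s", kind: 1 },
    { field: "t.t", kind: "text" },
    { field: "s", kind: "text" },
  ]);
} catch (err) {
  message = err.message;
}
console.log(message);
```

A check like this runs entirely on the client, which is why the error appears regardless of what the server would accept.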

The point is that the text field names are not stored in the key of the index definition. Instead, the following index is created:

{
    "v" : 2.0,
    "key" : {
        "a" : 1.0,
        "s" : 1.0,
        "_fts" : "text",
        "_ftsx" : 1.0
    },
    "name" : "a_1_t.t_text_s_text",
    "weights" : {
        "s" : 1.0,
        "t.t" : 1.0
    },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3.0
}
 

Therefore, this limitation does not really exist, although I am not sure whether this depends on the MongoDB version.

 



 Comments   
Comment by PM Bot [ 29/Jan/24 ]

There hasn't been any recent activity on this ticket, so we're resolving it. Thanks for reaching out! Please feel free to reopen this ticket if you're still experiencing the issue, and add a comment if you're able to provide more information.

Comment by PM Bot [ 19/Jan/24 ]

Hi mail2stehle@gmail.com! CSHARP-4885 is awaiting your response.

If this is still an issue for you, please open Jira to review the latest status and provide your feedback. Thanks!

Comment by Adelin Mbida Owona [ 11/Jan/24 ]

Hi mail2stehle@gmail.com, although your workaround may currently produce the expected results, it may result in undefined behaviour now or in the future. This is not a C# driver issue, so I would suggest filing a SERVER ticket to request support for this new feature. Once the server supports this functionality in a defined manner, drivers can consider supporting it.

Comment by Sebastian Stehle [ 09/Jan/24 ]

I think you misunderstand me. It is correct that it is not supported, but you can work around that:

1. Create an index with the following configuration: { "t.t": "text", schemaId: "text" }

2. Download the index definition. There is no schemaId field anymore, only _fts and _ftsx

3. Add the schemaId field: { schemaId: 1 }

 

If I understand it properly, you have the field twice in the index after that.
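
Steps 1 and 2 can be mimicked with a small JavaScript sketch. This is a simplified model of the stored definition, not server code, and the function name `toStoredTextIndex` is made up for this example. It folds every text key into the _fts/_ftsx pair and moves the field names into the weights document, which is why schemaId no longer appears as a key.

```javascript
// Simplified model (an assumption, not server code) of how a text index
// specification ends up stored: every "text" key is folded into the
// _fts/_ftsx pair and its field name moves to the weights document.
function toStoredTextIndex(spec, weights = {}) {
  const key = {};
  const storedWeights = {};
  let textSeen = false;
  for (const [field, kind] of Object.entries(spec)) {
    if (kind === "text") {
      if (!textSeen) {
        key._fts = "text";
        key._ftsx = 1;
        textSeen = true;
      }
      // Fields without an explicit weight default to 1.
      storedWeights[field] = field in weights ? weights[field] : 1;
    } else {
      key[field] = kind; // non-text keys are kept as they are
    }
  }
  return { key, weights: storedWeights };
}

// Step 1: the specification from the workaround.
const stored = toStoredTextIndex({ "t.t": "text", schemaId: "text" });

// Step 2: the stored key has no schemaId entry, only _fts and _ftsx.
console.log(JSON.stringify(stored.key));
console.log(JSON.stringify(stored.weights));
```

In this model, adding `schemaId: 1` in step 3 no longer collides with any key name, because the text use of schemaId only survives in the weights.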

Comment by Adelin Mbida Owona [ 08/Jan/24 ]

Hi mail2stehle@gmail.com, building a compound index that contains the same field two or more times is not supported by MongoDB. The index definition is sent to the server as a BSON document. BSON is a collection of key-value pairs, not a dictionary, but most programming languages expose it as a dictionary. Thus, having duplicate keys in a BSON document results in undefined behavior.
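
To illustrate the dictionary point, here is a small sketch in plain JavaScript rather than C#. The helper name `pairsToObject` and the sample key list are made up for this example. It treats an index definition as a list of key-value pairs and shows that converting it to a dictionary-like object keeps only one value per field name.

```javascript
// Hypothetical example: an index key definition as an ordered list of
// [field, value] pairs, similar to how BSON stores its elements.
const keyPairs = [
  ["s", 1],        // ascending key on the short schemaId field
  ["t.t", "text"], // text key
  ["s", "text"],   // the same field again, this time as a text key
];

// Convert the pairs to a plain object, the way a dictionary-style API would.
function pairsToObject(pairs) {
  const result = {};
  for (const [field, value] of pairs) {
    result[field] = value; // a repeated field overwrites the earlier value
  }
  return result;
}

const asObject = pairsToObject(keyPairs);
console.log(keyPairs.length);              // number of elements in the pair list
console.log(Object.keys(asObject).length); // number of keys left in the object
console.log(asObject.s);                   // the value that survived for "s"
```

Which of the two values survives depends on how the consumer reads the document, which is one way to picture the undefined behavior mentioned above.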

It seems you are trying to create a single index to satisfy multiple, incompatible query shapes. I would suggest creating two separate indexes for your purposes:

{ schemaId: 1, "t.t": "text" }

and

{ "t.t": "text", schemaId: "text" }

Comment by Sebastian Stehle [ 24/Dec/23 ]

Hi adelin.mbidaowona@mongodb.com 

I use a local Docker instance and cannot tell you the exact version. Indeed, it does not work when you create the index in one step, because you cannot use the same field twice. But when you first create only the full-text index, you can later add the same field as a prefix, because the original full-text index is converted to the _fts and _ftsx fields.

Comment by Adelin Mbida Owona [ 22/Dec/23 ]

Hi mail2stehle@gmail.com, by having a text index key in a compound index, you are essentially creating a compound text index, which has some limitations that are described in the MongoDB docs for compound index or text index. You can also read about the different behavior of compound indexes with different index type combinations here. The _fts and _ftsx fields (fts: full-text search) are present to signal that an index is a text index, as a text index is powered by the full-text search feature.

Also, which MongoDB version are you using? I think that trying to have an ascending index and a text index on the same field should result in the ascending index being ignored if you try this on a local mongod server instance.

Comment by Sebastian Stehle [ 16/Dec/23 ]

I think it is also a limitation of MongoDB, because the way I created the index was to create it without the schema field first and then use the "Edit Index" feature in Studio 3T to add the additional field. Unfortunately, this _fts and _ftsx behavior is not really documented.

Comment by PM Bot [ 16/Dec/23 ]

Hi mail2stehle@gmail.com, thank you for reporting this issue! The team will look into it and get back to you soon.

Generated at Wed Feb 07 21:49:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.