[SERVER-16550] Text indexes should be able to provide sort on suffix fields Created: 15/Dec/14  Updated: 28/Dec/23

Status: Backlog
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.6.5
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Roy Reznik
Assignee: Backlog - Query Integration
Resolution: Unresolved
Votes: 4
Labels: qi-text-search
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to DOCS-4525 Clarify capabilities of compound inde... Closed
Assigned Teams:
Query Integration

 Description   

I have the following text index:

{ name : 'text', modifiedDate: -1 }

In the getIndexes() output it looks like this:

        {
                "v" : 1,
                "key" : {
                        "_fts" : "text",
                        "_ftsx" : 1,
                        "modifiedDate" : -1
                },
                "name" : "name_text_modifiedDate_-1",
                "ns" : "testdb.testcol",
                "background" : true,
                "weights" : {
                        "name" : 1
                },
                "default_language" : "english",
                "language_override" : "language",
                "textIndexVersion" : 2
        },

When performing the following query:

db.testcol.find({ $text : { $search : 'text' } }).sort({modifiedDate:-1}).limit(20).explain(true)

I get the following:

{
        "cursor" : "TextCursor",
        "n" : 20,
        "nscannedObjects" : 14056,
        "nscanned" : 14056,
        "nscannedObjectsAllPlans" : 14056,
        "nscannedAllPlans" : 14056,
        "scanAndOrder" : true,
        "nYields" : 287,
        "nChunkSkips" : 0,
        "millis" : 12697,
        "allPlans" : [
                {
                        "cursor" : "TextCursor",
                        "n" : 20,
                        "nscannedObjects" : 14056,
                        "nscanned" : 14056,
                        "scanAndOrder" : true,
                        "nChunkSkips" : 0
                }
        ],
        "server" : "MONGOTEST:27017",
        "filterSet" : false,
        "stats" : {
                "type" : "SORT",
                "works" : 28077,
                "yields" : 287,
                "unyields" : 287,
                "invalidates" : 2351,
                "advanced" : 20,
                "needTime" : 28056,
                "needFetch" : 0,
                "isEOF" : 1,
                "forcedFetches" : 1,
                "memUsage" : 59431,
                "memLimit" : 33554432,
                "children" : [
                        {
                                "type" : "TEXT",
                                "works" : 28056,
                                "yields" : 287,
                                "unyields" : 287,
                                "invalidates" : 2351,
                                "advanced" : 13997,
                                "needTime" : 14058,
                                "needFetch" : 0,
                                "isEOF" : 1,
                                "keysExamined" : 14056,
                                "fetches" : 14056,
                                "parsedTextQuery" : {
                                        "terms" : [
                                                "text"
                                        ],
                                        "negatedTerms" : [ ],
                                        "phrases" : [ ],
                                        "negatedPhrases" : [ ]
                                },
                                "children" : [ ]
                        }
                ]
        }
}

The query scans all 14056 matching index entries and then sorts them in memory ("scanAndOrder" : true) instead of using the modifiedDate: -1 suffix of the index to return the results already sorted.
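To make the symptom easier to spot, here is a minimal sketch of a helper that summarizes a 2.6-style explain document. The function name summarizeExplain and the shape of its return value are hypothetical and not part of the mongo shell. It only reads the n, nscanned and scanAndOrder fields shown in the output above.

```javascript
// Hypothetical helper (not a mongo shell API): summarizes a 2.6-style
// explain() document using only the n, nscanned and scanAndOrder fields.
function summarizeExplain(explain) {
  const returned = explain.n;
  const scanned = explain.nscanned;
  return {
    // true when the server sorted the results in memory
    inMemorySort: explain.scanAndOrder === true,
    returned,
    scanned,
    // index entries scanned per document returned
    scannedPerReturned: returned > 0 ? scanned / returned : Infinity,
  };
}

// Values copied from the explain output in this report.
const reported = { cursor: "TextCursor", n: 20, nscanned: 14056, scanAndOrder: true };
const summary = summarizeExplain(reported);
console.log(summary);
```

For the numbers reported here, the summary flags an in-memory sort with roughly 700 index entries scanned for each of the 20 documents returned.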



 Comments   
Comment by Roy Reznik [ 11/Jun/17 ]

Any news? Open for 2.5 years now.

Comment by Luiz Felipe Mendes [ 18/Oct/15 ]

This would be very useful for me too. I need to do a text search and sort the results by date, which is a very common use case.

Comment by Roy Reznik [ 15/Dec/14 ]

Jason,
Thanks so much for the quick response.
I think this should also be noted in the text index documentation as a limitation of compound indexes that include a text key, until it's resolved.

Comment by J Rassi [ 15/Dec/14 ]

The text search subsystem intentionally never supported this use case, though (off the top of my head) I can't think of a reason why this couldn't be implemented. Changing the type of this issue from "bug" to "improvement", and dropping in "planned but not scheduled".

This work would require a major refactor of the text search execution path (the TEXT stage would have to be split into a TEXT_SCAN and a TEXT_FILTER stage), and then this class of query would need query planner support (to support queries with multiple search terms, multiple TEXT_SCAN stages would need to be created under a SORT_MERGE stage).
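The merge idea described above can be illustrated with a small standalone sketch. This is not server code: the function sortMergeDescending, the numeric modifiedDate values, and de-duplication by _id are all assumptions made for illustration. Each input array stands in for the output of one hypothetical TEXT_SCAN stage (one per search term), already ordered by modifiedDate descending, as a scan over the index suffix would presumably produce.

```javascript
// Illustrative model only, not the server's implementation.
// Each stream stands in for one hypothetical TEXT_SCAN stage whose results
// already arrive ordered by modifiedDate descending.
function sortMergeDescending(streams, limit) {
  const positions = streams.map(() => 0); // next unread index per stream
  const seen = new Set();                 // _id values already emitted
  const out = [];

  while (out.length < limit) {
    // Pick the stream whose next document has the largest modifiedDate.
    let best = -1;
    for (let i = 0; i < streams.length; i++) {
      if (positions[i] >= streams[i].length) continue; // stream exhausted
      if (best === -1 ||
          streams[i][positions[i]].modifiedDate >
            streams[best][positions[best]].modifiedDate) {
        best = i;
      }
    }
    if (best === -1) break; // every stream is exhausted

    const doc = streams[best][positions[best]++];
    // A document matching several terms appears in several streams;
    // emit it only once (an assumption of this sketch).
    if (!seen.has(doc._id)) {
      seen.add(doc._id);
      out.push(doc);
    }
  }
  return out;
}

// Made-up sample data: two search terms, document 2 matches both.
const termA = [
  { _id: 1, modifiedDate: 50 },
  { _id: 2, modifiedDate: 30 },
  { _id: 3, modifiedDate: 10 },
];
const termB = [
  { _id: 4, modifiedDate: 40 },
  { _id: 2, modifiedDate: 30 },
  { _id: 5, modifiedDate: 20 },
];
const top3 = sortMergeDescending([termA, termB], 3);
console.log(top3.map((d) => d._id)); // [ 1, 4, 2 ]
```

The point of the sketch is that with a limit of 3 the merge stops after reading only three entries in total, whereas the current plan has to scan and sort every match before applying the limit.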
