[SERVER-38245] Blocking sort when index collation options don't match operation collation options Created: 26/Nov/18  Updated: 27/Oct/23  Resolved: 29/Nov/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: zhang mengzhi Assignee: Eric Sedor
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Comments   
Comment by Eric Sedor [ 29/Nov/18 ]

For an operation with a collation option to use a given index, the index and the operation must share the same collation option. A collation option will only be applied to the _id index if the collection is created with a default collation.

For queries on other indexes than the _id index, you can specify the collation at index creation time (thus overriding the default collation on the collection).

If you'd like to explore this in more detail, we recommend the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-user group.

Comment by zhang mengzhi [ 29/Nov/18 ]

why did not the sorting by ObjectId ignore the "collation.locale"

Comment by Eric Sedor [ 28/Nov/18 ]

Thank you. It looks like the datas collection does not have the {locale: 'zh'} collation.

Can you please try using the createCollection command with the {collation:{locale: 'zh'}} option and let us know if you till see any issue? (see collation at https://docs.mongodb.com/manual/reference/method/db.createCollection/#db.createCollection)

Comment by zhang mengzhi [ 28/Nov/18 ]

It takes 67s to explain without "locale", and the "docsExamined" value is 12498913 :

db.datas.explain(true).aggregate([
    { $sort: { _id: 1 } },
    { $limit: 20 }
]);

but it takes 0.041s to aggregate without "locale" : 

db.datas.aggregate([
    { $sort: { _id: 1 } },
    { $limit: 20 }
]);

 

 

Comment by zhang mengzhi [ 28/Nov/18 ]

{
  "name" : "datas",
  "type" : "collection",
  "options" : {
 
  },
  "info" : {
    "readOnly" : false,
    "uuid" : UUID("241ca32c-fd4f-47d6-b34c-5402f9b3ed8c")
  },
  "idIndex" : {
    "v" : 1,
    "key" : {
      "_id" : 1
    },
    "name" : "_id_",
    "ns" : "finex.datas"
  }
}

{
  "splitPipeline" : null,
  "shards" : {
    "shard0001" : {
      "host" : "localhost:27021",
      "stages" : [
        {
          "$cursor" : {
            "query" : {
 
            },
            "queryPlanner" : {
              "plannerVersion" : 1,
              "namespace" : "finex.datas",
              "indexFilterSet" : false,
              "parsedQuery" : {
 
              },
              "collation" : {
                "locale" : "zh",
                "caseLevel" : false,
                "caseFirst" : "off",
                "strength" : 3,
                "numericOrdering" : false,
                "alternate" : "non-ignorable",
                "maxVariable" : "punct",
                "normalization" : false,
                "backwards" : false,
                "version" : "57.1"
              },
              "winningPlan" : {
                "stage" : "SHARDING_FILTER",
                "inputStage" : {
                  "stage" : "COLLSCAN",
                  "direction" : "forward"
                }
              },
              "rejectedPlans" : [ ]
            },
            "executionStats" : {
              "executionSuccess" : true,
              "nReturned" : 12498913,
              "executionTimeMillis" : 52773,
              "totalKeysExamined" : 0,
              "totalDocsExamined" : 12498913,
              "executionStages" : {
                "stage" : "SHARDING_FILTER",
                "nReturned" : 12498913,
                "executionTimeMillisEstimate" : 49400,
                "works" : 12498915,
                "advanced" : 12498913,
                "needTime" : 1,
                "needYield" : 0,
                "saveState" : 97676,
                "restoreState" : 97676,
                "isEOF" : 1,
                "invalidates" : 0,
                "chunkSkips" : 0,
                "inputStage" : {
                  "stage" : "COLLSCAN",
                  "nReturned" : 12498913,
                  "executionTimeMillisEstimate" : 15854,
                  "works" : 12498915,
                  "advanced" : 12498913,
                  "needTime" : 1,
                  "needYield" : 0,
                  "saveState" : 97676,
                  "restoreState" : 97676,
                  "isEOF" : 1,
                  "invalidates" : 0,
                  "direction" : "forward",
                  "docsExamined" : 12498913
                }
              },
              "allPlansExecution" : [ ]
            }
          }
        },
        {
          "$sort" : {
            "sortKey" : {
              "_id" : 1
            },
            "limit" : 20
          }
        }
      ]
    }
  },
  "ok" : 1,
  "operationTime" : Timestamp(1543367242, 1),
  "$clusterTime" : {
    "clusterTime" : Timestamp(1543367242, 1),
    "signature" : {
      "hash" : BinData(0,"P4qAJKD0BXMdqq37uMp6GX9QnBQ="),
      "keyId" : 6617770942087036929
    }
  }
}

Comment by Eric Sedor [ 27/Nov/18 ]

Hello!

Can you please check and provide the results of the following two operations:

db.getCollectionInfos({name:"datas"})

and

db.datas.explain(true).aggregate([
    { $sort: { _id: 1 } },
    { $limit: 20 }
], {
    collation: {
        locale: 'zh'
    }
});

Thank you in advance!

Comment by zhang mengzhi [ 26/Nov/18 ]

version 3.6.8

db.datas.aggregate([
    { $sort: { _id: 1 } },
    { $limit: 20 }
], {
    collation: {
        locale: 'zh'
    }
});

it is very slow with option "collation.locale".

Generated at Thu Feb 08 04:48:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.