[SERVER-24534] Commands that accept user predicates should use collection default collation
Created: 13/Jun/16  Updated: 13/Aug/16  Resolved: 27/Jul/16

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 3.3.11

Type: Improvement
Priority: Major - P3
Reporter: J Rassi
Assignee: Max Hirschhorn
Resolution: Done
Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-25295 Add cloning commands to writeConcern-... Closed
Related
related to SERVER-23611 Query planner should set collation fr... Closed
Backwards Compatibility: Fully Compatible
Sprint: Query 18 (08/05/16)
Participants:

 Description   

The following commands do not accept a user collation, but the predicates they accept should respect the default collation (if the collection has one):

  • cloneCollection
  • currentOp (struck out)
  • listCollections (struck out)
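
To make "respect the default collation" concrete, here is a minimal Python sketch of a toy model. It is not MongoDB's implementation: `ToyCollection`, `match_eq`, `simple_key`, and `case_insensitive_key` are hypothetical names, and `str.casefold()` only roughly approximates a case-insensitive collation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def simple_key(s: str) -> str:
    # Stand-in for the simple collation: compare strings as-is.
    return s


def case_insensitive_key(s: str) -> str:
    # Rough stand-in for a case-insensitive collation (assumption).
    return s.casefold()


@dataclass
class ToyCollection:
    docs: List[Dict[str, str]] = field(default_factory=list)
    # The collection's default collation, modeled as a key function.
    default_collation: Callable[[str], str] = simple_key


def match_eq(coll: ToyCollection, field_name: str, value: str) -> List[Dict[str, str]]:
    # The caller supplies no collation, so the predicate is evaluated
    # with the collection's default.
    key = coll.default_collation
    return [d for d in coll.docs if key(d[field_name]) == key(value)]


docs = [{"name": "Foo"}, {"name": "foo"}, {"name": "bar"}]
with_default = ToyCollection(docs, case_insensitive_key)
without_default = ToyCollection(docs)
```

In this model, an equality predicate on `"FOO"` matches both `"Foo"` and `"foo"` in `with_default`, and nothing in `without_default`.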


 Comments   
Comment by Githook User [ 27/Jul/16 ]

Author: Max Hirschhorn (visemet) <max.hirschhorn@mongodb.com>

Message: SERVER-24534 Add tests for cloning collections w/ non-simple collations.

Tests that the "cloneCollection", "cloneCollectionAsCapped",
"convertToCapped", and "copydb" commands inherit the default collation
of the corresponding collection.
Branch: master
https://github.com/mongodb/mongo/commit/541d4ee893321e69a6b065908098c3b07f180de0

Comment by David Storch [ 20/Jul/16 ]

> I think the "stageDebug" command is the only one not mentioned already that we could consider having inherit the default collation of the collection. David Storch, do you think that would be useful for any testing we plan to do?

Since stageDebug is test only, there is no need to have it inherit the default collation now. If in the future we find this to be useful for writing test cases, we can do it at that time.

> The "geoSearch" command shouldn't use the default collation of the collection because the geoHaystack index must use the simple collation. We could change the command to accept a collation and require callers do {locale: "simple"} in order to use the command, but I don't see that as being necessary.

Agreed, we should leave as is.

> We could change the command to accept a collation and require callers do {locale: "simple"} in order to use the command, but I don't see that as being necessary either.

Agreed, I think chunk management operations should always use whatever collation the shard key has, so I don't see a future in which we will need a collation parameter on these commands.

Comment by Max Hirschhorn [ 20/Jul/16 ]

As part of this audit, I looked through the list of commands supported by MongoDB and searched for callers of MatchExpressionParser::parse() as well as users of the Matcher class. I think the "stageDebug" command is the only one not mentioned already that we could consider having inherit the default collation of the collection. david.storch, do you think that would be useful for any testing we plan to do?

  • The "cloneCollection" command already respects the default collation of the collection by virtue of sending a query either over a client connection or via DBDirectClient to fetch the desired set of documents.
  • The "geoSearch" command shouldn't use the default collation of the collection because the geoHaystack index must use the simple collation. We could change the command to accept a collation and require callers do {locale: "simple"} in order to use the command, but I don't see that as being necessary.
  • The "split" and "moveChunk" commands shouldn't use the default collation of the collection because the collection is allowed to have a non-simple collation, whereas the collection must be sharded using the simple collation. Equality matches on the shard key are extracted from the query predicate specified to the "split" and "moveChunk" commands in order to perform the shard targeting. We could change the command to accept a collation and require callers do {locale: "simple"} in order to use the command, but I don't see that as being necessary either. schwerin, do you agree?
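
The shard-targeting point in the last bullet can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not the server's routing code: `CHUNK_LOWER_BOUNDS` and `target_chunk` are hypothetical, Python's code-point string ordering stands in for the simple collation, and `str.casefold()` stands in for a non-simple, case-insensitive collation.

```python
import bisect

# Hypothetical chunk lower bounds on a string shard key, sorted in
# simple (code-point) order: "" < "M" < "a" < "m".
CHUNK_LOWER_BOUNDS = ["", "M", "a", "m"]


def target_chunk(shard_key_value: str) -> int:
    # Find the chunk whose range contains the value, comparing with the
    # simple collation because that is how the chunk ranges are ordered.
    return bisect.bisect_right(CHUNK_LOWER_BOUNDS, shard_key_value) - 1


upper = target_chunk("APPLE")
lower = target_chunk("apple")
equal_under_case_insensitive = "APPLE".casefold() == "apple".casefold()
```

Here the two values are equal under the case-insensitive stand-in but fall into different chunks, which is why an equality match extracted for targeting has to be interpreted with the collation the shard key uses rather than the collection default.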

Per my conversation with Dave, I'm inclined to convert this ticket into a task for testing that the "cloneCollection", "cloneCollectionAsCapped", "convertToCapped", and "copydb" commands all inherit the default collation of the collection when the associated collection is copied.
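
The property such tests would check can be sketched in Python as follows. This is only an illustrative model, not the actual test code from the commit: `ToyCollection` and `clone_as_capped` are hypothetical, and the options dictionary merely mimics the shape of collection options.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyCollection:
    name: str
    options: Dict[str, Any] = field(default_factory=dict)
    docs: List[Dict[str, Any]] = field(default_factory=list)


def clone_as_capped(source: ToyCollection, new_name: str, size: int) -> ToyCollection:
    # Copy the source options so that a default collation, if present,
    # carries over to the new collection, then add the capped settings.
    options = dict(source.options)
    options.update({"capped": True, "size": size})
    return ToyCollection(new_name, options, [dict(d) for d in source.docs])


source = ToyCollection(
    "people",
    {"collation": {"locale": "en", "strength": 2}},
    [{"name": "Foo"}],
)
clone = clone_as_capped(source, "people_capped", 4096)
```

The assertion of interest is that `clone.options["collation"]` equals `source.options["collation"]` after the copy.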

Comment by J Rassi [ 13/Jun/16 ]

Ah, of course. I've struck those above.

Comment by Andy Schwerin [ 13/Jun/16 ]

currentOp and listCollections aren't run over a collection, so I am not certain that there is a meaningful collection-default collation to use.

Comment by J Rassi [ 13/Jun/16 ]

I'm assuming that split, moveChunk, and geoSearch should not be included in the above list, but they should at least be given passing consideration when this ticket is triaged.
