[SERVER-31447] Ensure change stream update lookup uses the correct collation Created: 06/Oct/17  Updated: 30/Oct/23  Resolved: 15/Nov/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework, Replication
Affects Version/s: None
Fix Version/s: 3.6.0-rc5, 3.7.1

Type: Bug Priority: Major - P3
Reporter: Charlie Swanson Assignee: Charlie Swanson
Resolution: Fixed Votes: 0
Labels: bkp
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-31442 $changeStream pipelines should inheri... Closed
related to SERVER-31443 $changeStream pipelines should suppor... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.6
Sprint: Repl 2017-10-23, Repl 2017-11-13, Repl 2017-12-04
Participants:

 Description   

When performing the updateLookup as part of a change stream, two things matter. When the collection is sharded, the lookup must be targeted using the simple collation. When the collection is not sharded and the update lookup is executing on the mongod itself, the lookup must use the collection-default collation.

In order to target only one shard, we need to have the simple collation and an exact match on the shard key. Once the lookup reaches the shard, it is always correct to use the simple collation, but the lookup may not be able to use the _id index if the collection has a non-simple default collation.

If possible, the ideal behavior on mongos would be to target using the simple collation, then send the aggregate responsible for the lookup without specifying a collation, allowing it to inherit the collection's default collation. The desired behavior on mongod is to always use the collection's default collation for the lookup.
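
The desired behavior described above can be summarized as a small decision table. The following is only an illustrative sketch in Python, not the server's implementation: the names LookupPlan and plan_update_lookup are hypothetical, and a lookup collation of None is used here to mean "do not specify a collation, so the collection default is inherited".

```python
from dataclasses import dataclass
from typing import Optional

# The simple (binary comparison) collation, written as a collation document.
SIMPLE = {"locale": "simple"}


@dataclass
class LookupPlan:
    # Collation used to pick the shard; None when no targeting happens (mongod).
    targeting_collation: Optional[dict]
    # Collation sent with the lookup; None means "unspecified, inherit the
    # collection's default collation" (an assumption of this sketch).
    lookup_collation: Optional[dict]


def plan_update_lookup(on_mongos: bool, collection_default: dict) -> LookupPlan:
    """Hypothetical helper mirroring the behavior the description asks for."""
    if on_mongos:
        # Target a single shard with the simple collation, then let the shard
        # inherit the collection default so it can use the _id index.
        return LookupPlan(targeting_collation=SIMPLE, lookup_collation=None)
    # On mongod there is no targeting; always use the collection default.
    return LookupPlan(targeting_collation=None, lookup_collation=collection_default)
```

For example, with a case-insensitive default such as {"locale": "en_US", "strength": 2}, the mongos plan targets with the simple collation and leaves the lookup collation unspecified, while the mongod plan uses the en_US collation directly.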



 Comments   
Comment by Githook User [ 15/Nov/17 ]

Author:

{'name': 'Charlie Swanson', 'username': 'cswanson310', 'email': 'charlie.swanson@mongodb.com'}

Message: SERVER-31447 Use correct collation for update lookup

(cherry picked from commit e37db69674486dff9fdac2b5ee41961a8805804b)
Branch: v3.6
https://github.com/mongodb/mongo/commit/f3fa1f280c22e6a93b610deae892ce001b02d106

Comment by Githook User [ 15/Nov/17 ]

Author:

{'name': 'Charlie Swanson', 'username': 'cswanson310', 'email': 'charlie.swanson@mongodb.com'}

Message: SERVER-31447 Use correct collation for update lookup
Branch: master
https://github.com/mongodb/mongo/commit/e37db69674486dff9fdac2b5ee41961a8805804b

Comment by Charlie Swanson [ 15/Nov/17 ]

Looks like I just typo-ed in the blacklist and failed to double check that it was actually working. I did add a jstests/sharding/change_stream_update_lookup_collation.js, but not a jstests/sharding/change_stream*s*_update_lookup_collation.js.

Fixed locally. Before pushing again, I'm verifying that burn_in_tests now runs and that the test is excluded from the last stable mongos suite.

Comment by Max Hirschhorn [ 15/Nov/17 ]

I've reverted the changes from de0b160 on the master branch and the changes from 347535f on the v3.6 branch because it wasn't clear what the intention behind the jstests/sharding/change_streams_update_lookup_collation.js blacklist entry was. Per a Slack conversation, Charlie wasn't sure whether a test had failed to be added to the jstests/sharding/ directory, or whether the blacklist entry was added in error given that there's a test with the same basename in the jstests/noPassthrough/ directory.

Comment by Githook User [ 15/Nov/17 ]

Author:

{'name': 'Max Hirschhorn', 'username': 'visemet', 'email': 'max.hirschhorn@mongodb.com'}

Message: Revert "SERVER-31447 Use correct collation for update lookup"

This reverts commit 347535f06861412f52c81bbf260fe253f0bf2041.
Branch: v3.6
https://github.com/mongodb/mongo/commit/c0cb7856ff26ca60f197e984ae7b9f3ba5cc64fe

Comment by Githook User [ 15/Nov/17 ]

Author:

{'name': 'Max Hirschhorn', 'username': 'visemet', 'email': 'max.hirschhorn@mongodb.com'}

Message: Revert "SERVER-31447 Use correct collation for update lookup"

This reverts commit de0b16077945eb6b6ec161b99f41c3222aade3b8.
Branch: master
https://github.com/mongodb/mongo/commit/b683d39549b033a516c1c8bdbae7040eafe99266

Comment by Githook User [ 14/Nov/17 ]

Author:

{'name': 'Charlie Swanson', 'username': 'cswanson310', 'email': 'charlie.swanson@mongodb.com'}

Message: SERVER-31447 Use correct collation for update lookup
Branch: v3.6
https://github.com/mongodb/mongo/commit/347535f06861412f52c81bbf260fe253f0bf2041

Comment by Githook User [ 14/Nov/17 ]

Author:

{'name': 'Charlie Swanson', 'username': 'cswanson310', 'email': 'charlie.swanson@mongodb.com'}

Message: SERVER-31447 Use correct collation for update lookup
Branch: master
https://github.com/mongodb/mongo/commit/de0b16077945eb6b6ec161b99f41c3222aade3b8

Comment by David Storch [ 09/Oct/17 ]

That sounds right. It boils down to using the simple collation for targeting but inheriting the collection default collation for actually doing the lookup (so that you can do a point lookup on the _id index).

Comment by Tess Avitabile (Inactive) [ 06/Oct/17 ]

LGTM

Comment by Charlie Swanson [ 06/Oct/17 ]

tess.avitabile, david.storch, and spencer - does this description look correct to you?
