[SERVER-84198] Facilitate multiple collations within the same change stream. Created: 14/Dec/23 Updated: 26/Dec/23 Resolved: 26/Dec/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Felipe Gasper | Assignee: | Backlog - Query Execution |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Assigned Teams: | Query Execution |
||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
Mongosync applies document queries in two contexts. The initial-sync queries are per-collection, so each one uses its collection's default collation. The change stream, though, is multi-collection, so it uses the simple collation. Thus, for a collection whose default collation is case-insensitive, a search on "_id > aaa && _id < zzz" will match _id=BBB during initial sync but not in the change stream.
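To illustrate the mismatch, here is a minimal Python sketch that needs no MongoDB connection. It assumes the collection's default collation is case-insensitive. The helper names are hypothetical, byte-wise comparison stands in for the simple collation, and `str.casefold()` is only a rough stand-in for a real ICU collation.

```python
def matches_simple(value: str, low: str, high: str) -> bool:
    """Range check under binary ordering, approximating the simple collation."""
    v, lo, hi = (s.encode("utf-8") for s in (value, low, high))
    return lo < v < hi


def matches_case_insensitive(value: str, low: str, high: str) -> bool:
    """Range check that ignores case; a rough stand-in for a
    case-insensitive collection default collation (assumption)."""
    return low.casefold() < value.casefold() < high.casefold()


# The query from the description: _id > "aaa" && _id < "zzz", tested on _id = "BBB".
initial_sync_match = matches_case_insensitive("BBB", "aaa", "zzz")
change_stream_match = matches_simple("BBB", "aaa", "zzz")

print("initial sync (collection collation):", initial_sync_match)
print("change stream (simple collation):  ", change_stream_match)
```

Uppercase "B" sorts before lowercase "a" in binary order, so the same document passes the filter in one context and fails it in the other.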
This problem worsens in the context of [document filtering|REP-1954], where the query will come from the customer. Here we either have to limit the scope of support for strings in queries pretty dramatically or implement some sort of query-transform logic based on
We can soften the problem somewhat by having customers migrate like-collated collections in concurrent mongosync sessions. Given the limits on the number of concurrent change streams, though, this won't scale well to multi-tenant setups, where dozens or even hundreds of collations may coexist on a given source cluster.
It seems that, ultimately, we can't "gracefully" support collations without some ability to apply multiple collations in a given change stream. |
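As a rough sketch of why the per-collation workaround scales poorly, the Python below groups collections by their default collation and counts the groups. Each group would need its own session and change stream. The namespaces and collation documents are made-up sample data, and `group_by_collation` is a hypothetical helper, not part of mongosync.

```python
from collections import defaultdict


def group_by_collation(collections: dict) -> dict:
    """Group namespaces by default collation; None means the simple collation."""
    groups = defaultdict(list)
    for namespace, collation in collections.items():
        # Collation documents are dicts, so build a hashable key from their items.
        key = tuple(sorted(collation.items())) if collation else ()
        groups[key].append(namespace)
    return dict(groups)


# Hypothetical multi-tenant source cluster (illustrative data only).
collections = {
    "tenant1.users": {"locale": "en", "strength": 2},
    "tenant1.orders": {"locale": "en", "strength": 2},
    "tenant2.users": {"locale": "fr"},
    "tenant3.events": None,
}

groups = group_by_collation(collections)
print("sessions (and change streams) needed:", len(groups))
```

The number of groups grows with the number of distinct collations on the source cluster, not with the number of collections, which is what runs into the limit on concurrent change streams.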