  1. Core Server
  2. SERVER-82815

Expose server’s index key creation via aggregation

    • Query Execution
    • Fully Compatible
    • v7.2, v7.1, v7.0, v6.0, v5.0, v4.4
    • QE 2023-12-11, QE 2024-01-08, QE 2024-01-22

      Overview

      Currently, an aggregation can set only a single collation for the entire pipeline. That makes some sense for aggregations that originate from a collection, but it is more problematic for change streams that span multiple collections (such as those mongosync uses). A data consistency problem arises easily if the client overlooks that string comparisons in such a change stream use the simple collation, regardless of the respective collections’ default collations.

      REP-3312 was such a problem. It prompted a Critical Advisory for mongosync, which led (in part) to the present Migration & Backup Correctness initiative (INIT-532), which includes mongosync’s current collation-fixes epic (REP-3672).

      This task proposes to facilitate a fix by exposing the server’s internal index key creation via an aggregation operator, which I’ll tentatively call $_internalIndexKey. The operator would look like this:

      { $_internalIndexKey: {
          input: "abc", // … but can be any arbitrary BSON value
          collation: { locale: "en", strength: 1 },
      } }
      

      … and would output, as a binary blob, the index key that the server would create for that string & collation.
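To illustrate how a client might use the proposed operator, here is a minimal sketch that wraps it in an $addFields stage, for example in a change stream pipeline. It only builds the pipeline document; it does not contact a server. The helper name indexKeyStage, the output field _indexKey, and the path $fullDocument.name are hypothetical names chosen for illustration, and $_internalIndexKey itself is only the operator proposed in this ticket, not a released API.

```javascript
// Builds an $addFields stage that attaches the proposed index key for one field.
// Assumption: $_internalIndexKey takes { input, collation } as described in this ticket.
// indexKeyStage and the default output field name are hypothetical.
function indexKeyStage(fieldPath, collation, outputField = "_indexKey") {
  return {
    $addFields: {
      [outputField]: {
        $_internalIndexKey: { input: fieldPath, collation },
      },
    },
  };
}

// Hypothetical pipeline: key the "name" field of each changed document
// with the collation of the collection it belongs to.
const pipeline = [
  indexKeyStage("$fullDocument.name", { locale: "en", strength: 1 }),
];

// Print the pipeline document for inspection.
console.log(JSON.stringify(pipeline, null, 2));
```

Under this sketch, the client would compare the resulting binary keys rather than the raw strings, so the comparison would follow the chosen collation instead of the change stream's simple collation.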

      This will facilitate REP-3312’s fix.

      Numeric Types

      As a convenience, this proposal also envisions that the $_internalIndexKey operator will normalize numeric types. mongosync would thus have an easy way to tell via aggregation that { $numberLong: 42 } and { $numberDouble: 42 } are, in fact, the same number. See the comments and linked tickets for context on how this helps us.

      Rejected Alternatives

      See REP-3672’s (in-progress) technical design for a list of considered alternative solutions.

      See SERVER-84198 for an additional request to facilitate full collation support with document filtering in mongosync.

            Assignee:
            rui.liu@mongodb.com Rui Liu
            Reporter:
            felipe.gasper@mongodb.com Felipe Gasper
