$merge is not consistent in enforcing uniqueness of `on` fields on a sharded collection


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      The $merge stage attempts to verify uniqueness of the `on` fields by looking for a unique index. For a sharded collection, however, it should be sufficient to supply the shard key + _id fields as the `on` fields. This would be consistent with the behavior when no `on` field is given.
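      To make the two cases concrete, here is a minimal sketch in Python that only builds the stage documents as plain dicts and does not talk to a server. The collection name `target`, the shard key field `region`, and the helper `merge_stage` are hypothetical; `into` and `on` are the `$merge` options discussed in this ticket.

```python
from typing import Optional, Sequence


def merge_stage(into: str, on: Optional[Sequence[str]] = None) -> dict:
    """Build a $merge stage document.

    When `on` is omitted, the stage carries no `on` option and the
    server falls back to the implied fields.
    """
    spec: dict = {"into": into}
    if on is not None:
        spec["on"] = list(on)
    return {"$merge": spec}


# Hypothetical target collection sharded on {region: 1}.
implicit_stage = merge_stage("target")                        # implied: shard key + _id
explicit_stage = merge_stage("target", on=["region", "_id"])  # same fields, spelled out
```

      Both stages name the same identifying fields, yet per the description only the explicit one goes through the unique-index lookup.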

      Previous description:

      Currently $merge calls {{ensureFieldsUniqueOrResolveDocumentKey}} at creation time only if an explicit {{on}} field is passed. If an explicit {{on}} field is not passed, the implied {{on}} field is {{_id}} if the collection is unsharded or {{shardKey + _id}} if it is sharded. The code assumes that the {{(shardKey +) _id}} index exists, which does not hold if {{autoIndexId = false}} is specified at collection creation time.
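      The behavior described above can be summarized with a toy model. This is a sketch of the logic as this ticket describes it, not the server implementation: the function names, the `accept_shard_key_plus_id` flag (standing in for the proposed change), and the set-equality comparison against unique index keys are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def implied_on_fields(shard_key: Optional[Sequence[str]]) -> List[str]:
    """Implied `on` fields: _id if unsharded, shard key + _id if sharded."""
    fields = list(shard_key or [])
    if "_id" not in fields:  # assumption: _id is not repeated if the shard key has it
        fields.append("_id")
    return fields


def passes_creation_check(
    explicit_on: Optional[Sequence[str]],
    shard_key: Optional[Sequence[str]],
    unique_indexes: Sequence[Sequence[str]],
    accept_shard_key_plus_id: bool = False,
) -> bool:
    """Model of the creation-time uniqueness check on the `on` fields."""
    if explicit_on is None:
        # No lookup happens: the (shardKey +) _id index is assumed to exist.
        return True
    on = frozenset(explicit_on)
    if accept_shard_key_plus_id and shard_key:
        # Hypothetical proposed behavior: shard key + _id is enough.
        if on == frozenset(implied_on_fields(shard_key)):
            return True
    # Modeled current behavior: some unique index must cover exactly these fields.
    return any(on == frozenset(keys) for keys in unique_indexes)
```

      In this model, an explicit `on` of `["region", "_id"]` against a collection sharded on `region` with no matching unique index fails today and passes only with the flag set, while omitting `on` passes without any index lookup.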
      
      We discovered this because viewless timeseries collections were making it all the way to the insert step even though the unique index does not exist, which indicates that no other existence check is performed for the unique index. Searching the codebase for references to {{ensureFieldsUniqueOrResolveDocumentKey}} using a language server also reveals no references outside of creation/parse time. We plan to resolve the unsharded case as part of SERVER-107433. This ticket is intended only to track the investigation of the sharded case.
      
      

            Assignee:
            Unassigned
            Reporter:
            Sam Mercier
            Votes:
            0
            Watchers:
            6

              Created:
              Updated: