[DRIVERS-775] Ability to specify union Created: 22/Nov/19  Updated: 27/May/22  Resolved: 02/Jan/20

Status: Closed
Project: Drivers
Component/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Backlog - Core Eng Program Management Team Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on CSHARP-2863 Ability to specify union Closed
depends on JAVA-3520 Ability to specify union Closed
Server Compat: 4.4
Driver Compliance:
Key Status/Resolution FixVersion
JAVA-3520 Duplicate
CSHARP-2863 Duplicate

Description
Downstream Change Summary

TBD

Description of Linked Ticket

Epic Summary

Summary

We will implement a new aggregation stage, $union, that merges the results of n pipelines while preserving duplicates. To enable merging data from multiple collections, we will also introduce an explicit stage that references a collection, $collection.
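The intended semantics can be sketched in plain Python, without a server. This is only an illustrative model under stated assumptions: the helper names `collection` and `union` are hypothetical stand-ins for the proposed $collection and $union stages, not driver or server APIs, and the output order (results of each pipeline in argument order) is an assumption that the summary does not specify.

```python
from itertools import chain
from typing import Callable, Iterable, List

Document = dict
# A "pipeline" is modeled here as a zero-argument callable that yields documents.
Pipeline = Callable[[], Iterable[Document]]


def collection(docs: List[Document]) -> Pipeline:
    """Hypothetical stand-in for $collection: a source yielding every document."""
    return lambda: iter(docs)


def union(*pipelines: Pipeline) -> List[Document]:
    """Hypothetical stand-in for $union: concatenate the results of n pipelines.

    No deduplication is performed, so a document produced by two pipelines
    appears twice in the result (SQL UNION ALL behavior, not UNION).
    """
    return list(chain.from_iterable(p() for p in pipelines))


# Assumed sample data: an archival collection and a recent one.
archive = [{"region": "EU", "total": 10}, {"region": "US", "total": 7}]
recent = [{"region": "EU", "total": 10}, {"region": "APAC", "total": 3}]

merged = union(collection(archive), collection(recent))
```

In this model `merged` holds all four documents, including both copies of the EU document, which is the "preserving duplicates" behavior described in the summary.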

Motivation

Union is a fundamental operation in relational algebra. We have several specific scenarios:

  • BI Connector (BIC): completeness with SQL.
  • Time series: combining data stored in per-period collections into one logical collection.
  • Data Lake: combining collections, e.g. archival and recent data, or data from different regions.

For analytical scenarios, customers expect a complete set of fundamental operations. For Tableau, for example, union and unpivot were the most requested features after joins. In the future we will improve $lookup, but delivering general and performant joins is a hard task. Union-like logic, by contrast, is already supported in the backend for operations that require merging results across shards.

Documentation

Scope Document
Design Document


Generated at Thu Feb 08 08:22:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.