[DRIVERS-757] Ability to specify union Created: 21/Oct/19  Updated: 28/Oct/23  Resolved: 30/Apr/20

Status: Closed
Project: Drivers
Component/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Backlog - Core Eng Program Management Team
Assignee: Unassigned
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on JAVA-3472 Ability to specify union Closed
depends on CSHARP-2805 Ability to specify union Closed
Server Compat: 4.4
Driver Compliance:
Key          Status/Resolution  FixVersion
JAVA-3472    Fixed              4.1.0
CSHARP-2805  Fixed              2.11.0

Description
Downstream Change Summary

null

Description of Linked Ticket

Epic Summary

Summary

We will implement a new aggregation stage, $union, that merges the results of n pipelines while preserving duplicates. To enable merging data from multiple collections, we will also introduce an explicit stage that references a collection, $collection.
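
The intended semantics (concatenate the outputs of n pipelines, keep duplicates) can be illustrated with a small in-memory sketch. This is not driver or server code: the function name union_all, the sample collections, and the filter are hypothetical and exist only for illustration. The ticket also does not state any ordering guarantee, so the order seen here is just a property of the sketch.

```python
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator

Document = Dict[str, object]
# Hypothetical stand-in for a pipeline: a callable that yields documents.
Pipeline = Callable[[], Iterable[Document]]


def union_all(*pipelines: Pipeline) -> Iterator[Document]:
    """Concatenate the output of every pipeline without de-duplicating."""
    return chain.from_iterable(p() for p in pipelines)


# Hypothetical per-period collections (compare the time-series scenario).
sales_2018 = [{"item": "a", "qty": 1}, {"item": "b", "qty": 2}]
sales_2019 = [{"item": "a", "qty": 1}, {"item": "c", "qty": 5}]

merged = list(
    union_all(
        lambda: sales_2018,                                 # whole collection
        lambda: (d for d in sales_2019 if d["qty"] >= 1),   # filtered pipeline
    )
)

# The document {"item": "a", "qty": 1} exists in both inputs and is kept twice.
duplicates = merged.count({"item": "a", "qty": 1})
```

Keeping duplicates corresponds to SQL's UNION ALL rather than UNION; a caller who wants distinct results would need an additional grouping step after the merge.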

Motivation

Union is a fundamental operation in relational algebra. We have several specific scenarios:

  • BIC connector for completeness with SQL.
  • TimeSeries scenario to combine data stored in per-period collections into one logical collection.
  • Combining collections in Data Lake, e.g. archival and recent data, data from different regions.

For analytical scenarios, customers expect a complete set of fundamental operations. For example, union and unpivot were the top requested features for Tableau after joins. We will improve $lookup in the future, but delivering general and performant joins is a hard task. By contrast, union-like logic is already supported in the backend for operations that require merging results across shards.

Documentation

Scope Document
Design Document


Generated at Thu Feb 08 08:22:17 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.