[SERVER-24714] $graphLookup should be able to apply a filter to the documents it traverses Created: 22/Jun/16  Updated: 03/May/17  Resolved: 05/Jul/16

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 3.3.10

Type: Improvement Priority: Major - P3
Reporter: Benjamin Murphy Assignee: Benjamin Murphy
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-7795 Document $graphLookup Closed
is documented by DOCS-8913 $graphLookup should be able to apply ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Query 16 (06/24/16)
Participants:

 Description   

As designed, $graphLookup is only capable of traversing a graph recursively, that is, by performing repeated queries on the "connectToField" value. However, for a densely connected graph, this will result in an immense result set, often exceeding the maximum memory usage. In many cases, the user would only like to consider nodes that meet some filter during their search. $graphLookup should support an option that accepts a filter in MatchExpression syntax, and is applied at each stage of the breadth-first search.

For example, consider a set of flight data, where each document represents a flight between two cities:

{ from: 'JFK', to: 'MIA' , airline: 'Delta' }
{ from: 'MIA', to: 'SFO', airline: 'United' }
{ from: 'JFK', to: 'BOS', airline: 'JetBlue' }
...

If a user wishes to figure out all cities they can fly to from New York within two stops, but only on United Airlines flights, they are forced to perform a $graphLookup on the flight origin and destination:

{$graphLookup: {
    from: 'flights',
    startWith: {$literal: 'JFK'},
    connectToField: 'from',
    connectFromFIeld: 'to',
    as: 'destinations'
}}

And afterwards, $unwind the result set, and then perform a $match with:

{$match: {'airline': 'United'}}

However, this will traverse the entire set of flights, when really, the user wanted to specify that $graphLookup should only consider nodes that match the aforementioned filter. In this case, the user should be able to specify:

{$graphLookup: {
    from: 'flights',
    startWith: {$literal: 'JFK'},
    connectToField: 'from',
    connectFromFIeld: 'to',
    restrictSearchWithMatch: {'airline': 'United'},
    as: 'destinations'
}}

With this syntax, $graphLookup will only consider flights on United, making both the query faster, and obviating the need for a following $unwind and $match.

It is worth noting that this filter is not precisely equivalent to unwinding the result set and applying a filter to that. Consider a graph composed of two subgraphs, each consisting entirely of United Airlines flights, where the subgraphs are connected only by Delta flights. A $graphLookup of the first form starting anywhere would find all the flights, and the $unwind/$match would reduce it to the United flights. However, with 'restrictSearchWithMatch', the $graphLookup will explore whichever subgraph it begins in, but not the other, since it would need to traverse a Delta edge to get there.



 Comments   
Comment by Githook User [ 05/Jul/16 ]

Author:

{u'username': u'benjaminmurphy', u'name': u'Benjamin Murphy', u'email': u'benjamin_murphy@me.com'}

Message: SERVER-24714 graphLookup now accepts the restrictSearchWithMatch option.

Closes #1098

Signed-off-by: David Storch <david.storch@10gen.com>
Branch: master
https://github.com/mongodb/mongo/commit/45977b4209fd2c65a8df660d09bcef8206780ec2

Generated at Thu Feb 08 04:07:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.