[DOCS-8913] $graphLookup should be able to apply a filter to the documents it traverses Created: 28/Sep/16  Updated: 30/Oct/23

Status: Closed
Project: Documentation
Component/s: Server
Affects Version/s: None
Fix Version/s: Server_Docs_20231030

Type: Task Priority: Major - P3
Reporter: Emily Hall Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-24714 $graphLookup should be able to apply ... Closed
Participants:
Days since reply: 1 year, 14 weeks, 2 days ago
Epic Link: DOCSP-1769

 Description   

As designed, $graphLookup is only capable of traversing a graph recursively, that is, by performing repeated queries on the "connectToField" value. However, for a densely connected graph, this will result in an immense result set, often exceeding the maximum memory usage. In many cases, the user would only like to consider nodes that meet some filter during their search. $graphLookup should support an option that accepts a filter in MatchExpression syntax, and is applied at each stage of the breadth-first search.

For example, consider a set of flight data, where each document represents a flight between two cities:

{ from: 'JFK', to: 'MIA' , airline: 'Delta' }
{ from: 'MIA', to: 'SFO', airline: 'United' }
{ from: 'JFK', to: 'BOS', airline: 'JetBlue' }
...

If a user wishes to figure out all cities they can fly to from New York within two stops, but only on United Airlines flights, they are forced to perform a $graphLookup on the flight origin and destination:

{$graphLookup: {
    from: 'flights',
    startWith: {$literal: 'JFK'},
    connectToField: 'from',
    connectFromField: 'to',
    as: 'destinations'
}}

Afterwards, they must $unwind the result set and then perform a $match on the unwound documents with:

{$match: {'destinations.airline': 'United'}}
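
Put together, the traverse-then-filter approach might look like the sketch below. It only assembles the fragments above into one pipeline array. The variable name postFilterPipeline is hypothetical, and the collection the pipeline runs against is not stated in this ticket. Because $unwind leaves each traversed flight nested under 'destinations', the sketch matches on 'destinations.airline'.

```javascript
// Sketch only: the stages described above, in order.
// `postFilterPipeline` is a hypothetical name, not part of any API.
const postFilterPipeline = [
  // 1. Traverse every reachable flight, regardless of airline.
  {
    $graphLookup: {
      from: 'flights',
      startWith: { $literal: 'JFK' },
      connectToField: 'from',
      connectFromField: 'to',
      as: 'destinations',
    },
  },
  // 2. Emit one document per traversed flight.
  { $unwind: '$destinations' },
  // 3. Only now discard the non-United flights.
  { $match: { 'destinations.airline': 'United' } },
];

// Show the stage order.
console.log(postFilterPipeline.map((stage) => Object.keys(stage)[0]).join(' -> '));
```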

However, this will traverse the entire set of flights, when really, the user wanted to specify that $graphLookup should only consider nodes that match the aforementioned filter. In this case, the user should be able to specify:

{$graphLookup: {
    from: 'flights',
    startWith: {$literal: 'JFK'},
    connectToField: 'from',
    connectFromField: 'to',
    restrictSearchWithMatch: {'airline': 'United'},
    as: 'destinations'
}}

With this syntax, $graphLookup will consider only flights on United, both making the query faster and obviating the need for a subsequent $unwind and $match.

It is worth noting that this filter is not precisely equivalent to unwinding the result set and applying a filter to that. Consider a graph composed of two subgraphs, each consisting entirely of United Airlines flights, where the subgraphs are connected only by Delta flights. A $graphLookup of the first form starting anywhere would find all the flights, and the $unwind/$match would reduce it to the United flights. However, with 'restrictSearchWithMatch', the $graphLookup will explore whichever subgraph it begins in, but not the other, since it would need to traverse a Delta edge to get there.
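
The difference can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not the server's implementation: the flights array, the airport codes in it, and the graphLookupSketch function are all hypothetical. They stand in for a breadth-first search that follows 'to' values into 'from' values and optionally applies a filter at every step.

```javascript
// Hypothetical data: two all-United subgraphs (JFK/BOS and SFO/LAX)
// connected only by a single Delta flight (BOS -> SFO).
const flights = [
  { from: 'JFK', to: 'BOS', airline: 'United' },
  { from: 'BOS', to: 'JFK', airline: 'United' },
  { from: 'BOS', to: 'SFO', airline: 'Delta' },
  { from: 'SFO', to: 'LAX', airline: 'United' },
  { from: 'LAX', to: 'SFO', airline: 'United' },
];

// Breadth-first search in the spirit of $graphLookup with
// connectToField: 'from' and connectFromField: 'to'.
// `filter` plays the role of restrictSearchWithMatch: it is checked
// at every step, so non-matching flights are never followed.
function graphLookupSketch(docs, startWith, filter = () => true) {
  const found = [];                      // flights returned by the search
  const visited = new Set([startWith]);  // 'from' values already queried
  let frontier = [startWith];
  while (frontier.length > 0) {
    const next = [];
    for (const doc of docs) {
      if (!frontier.includes(doc.from) || !filter(doc)) continue;
      found.push(doc);
      if (!visited.has(doc.to)) {
        visited.add(doc.to);
        next.push(doc.to);
      }
    }
    frontier = next;
  }
  return found;
}

const isUnited = (flight) => flight.airline === 'United';

// First form: traverse everything, then filter the result.
const postFiltered = graphLookupSketch(flights, 'JFK').filter(isUnited);
// Second form: filter during the traversal.
const restricted = graphLookupSketch(flights, 'JFK', isUnited);

console.log(postFiltered.length); // 4: United flights from both subgraphs
console.log(restricted.length);   // 2: only the subgraph containing JFK
```

Starting from JFK, the post-filtered search crosses the Delta flight and so reports the United flights in both subgraphs, while the restricted search never leaves the JFK/BOS subgraph.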



 Comments   
Comment by Education Bot [ 31/Oct/22 ]

Hello! This ticket has been closed due to inactivity. If you believe this ticket is still important, please reopen it and leave a comment to explain why. Thank you!

Comment by Asya Kamsky [ 07/Feb/18 ]

https://docs.mongodb.com/manual/reference/operator/aggregation/graphLookup/index.html#with-a-query-filter has an example for "restrictSearchWithMatch" which is somewhat misleading.

Rather than opening a new ticket, I'd like to use this one, since it already shows an example with a more straightforward use of the match clause.

Comment by Kay Kim (Inactive) [ 14/Feb/17 ]

I believe this was done as DOCS-7795

Comment by Asya Kamsky [ 11/Nov/16 ]

This option already appears to be documented in the release notes. The rest of it should be tracked in the ticket to document the $graphLookup stage, which is DOCS-7795.
