[SERVER-38198] Remove the requirement that $geoNear needs to be in the first stage of the pipeline Created: 18/Nov/18  Updated: 30/Jun/21  Resolved: 19/Dec/18

Status: Closed
Project: Core Server
Component/s: Aggregation Framework, Geo
Affects Version/s: 3.6.0
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Vinicius Gualberto Assignee: Asya Kamsky
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-9606 $cond operator should allow $match as... Backlog
related to SERVER-34766 Allow $expr or $$field in the $geoNea... Closed

 Description   

I am trying to run an aggregation with several stages, including a $lookup and a $geoNear. The requirement that $geoNear be the first stage prevents me from getting null or 0 results in a left-outer-join style of query. Is there a way to lift this restriction, or to achieve what I need another way?



 Comments   
Comment by Asya Kamsky [ 19/Dec/18 ]

Since the example given needs either SERVER-9606 or SERVER-34766 I'm closing this ticket.

If there is a use case that can't be done without removing this restriction feel free to reopen and include the full use case description.

Comment by Asya Kamsky [ 17/Dec/18 ]

I think if you want to count all statues and then also count just the ones that are within some geographical area, you need either SERVER-9606 or SERVER-34766.   I believe the latter is what you are going to run into if you try to write a pipeline in expressive $lookup that takes the coordinates from the parent pipeline.  If the coordinates are set by the application during the construction of the entire aggregation then you should be able to use expressive $lookup.
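As a rough illustration of the "expressive $lookup" route when the application supplies the coordinates, the sketch below starts from place_tags so that every tag survives, and counts nearby places in a sub-pipeline. It is an untested sketch under stated assumptions: it swaps $geoNear for a $match with $geoWithin / $centerSphere (which may appear at any stage but does not compute a distance field), it assumes every place has a place_tags_ids array, and the names buildTagCountPipeline, nearby, and EARTH_RADIUS_METERS are hypothetical.

```javascript
// Equatorial radius of the Earth, used to convert meters to radians
// for $centerSphere.
const EARTH_RADIUS_METERS = 6378100;

// Hypothetical helper: builds a pipeline to run on place_tags (not places),
// so a tag is kept even when no place is nearby.
function buildTagCountPipeline(coordinates, maxDistanceMeters) {
  return [
    {
      $lookup: {
        from: 'places',
        let: { tagId: '$_id' },
        pipeline: [
          {
            $match: {
              // Geo filter with constant coordinates set by the application.
              location: {
                $geoWithin: {
                  $centerSphere: [coordinates, maxDistanceMeters / EARTH_RADIUS_METERS]
                }
              },
              // Join condition: the place carries this tag.
              // Assumes place_tags_ids is always an array.
              $expr: { $in: ['$$tagId', '$place_tags_ids'] }
            }
          },
          // Only the number of matches is needed, so keep the documents small.
          { $project: { _id: 1 } }
        ],
        as: 'nearby'
      }
    },
    // An empty array gives places_count: 0 instead of dropping the tag.
    { $project: { name: 1, places_count: { $size: '$nearby' } } }
  ];
}

const pipeline = buildTagCountPipeline([-74.0445, 40.689469], 50000);
// In the mongo shell: db.place_tags.aggregate(pipeline)
```

Because the sub-pipeline runs once per tag, this shape probably suits a small place_tags collection better than a large one.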

 

Comment by Vinicius Gualberto [ 08/Dec/18 ]

Hello, 

I just learned about the new pipeline option in $lookup. I will try to rearrange my pipeline to use it to look up only the nearby places, and I will let you know whether it works.

Comment by Vinicius Gualberto [ 06/Dec/18 ]

>I think as soon as you filter by nearest you will once again eliminate the documents that aren't nearest, no? $geoNear is like $match in that it either passes the document through to the next stage or it doesn't, and you want to know in the document if it's near or not, and not to eliminate it, no?

Yes, but that is my problem. If I eliminate them, the counting would be done. That's why I said that if I could start the aggregation from the place_tags collection and use $geoNear after the first stage, I would be able to do this, because tags with no matching places would be shown as 0 in the count instead of stopping the pipeline and ending the query.

Comment by Asya Kamsky [ 29/Nov/18 ]

SERVER-34766 may also be related to your use case.

Comment by Asya Kamsky [ 29/Nov/18 ]

vinguan you say
> But if no places are near enough to satisfy the $geoNear, the query returns no results, or the tags with 0 places under them are not shown, and I need them even with 0 in "places_count".

True, but this doesn't have anything to do with $geoNear having to be first. The issue is that you want to determine the proximity of a record during aggregation processing; you do not want to eliminate its document if it's not near something. This is similar to the request in SERVER-9606: if you could use a geo expression to get a boolean in something like $cond, then you could do what you describe.

> If $geoNear could be used at any stage in the pipeline, I could $lookup the places, filter by the nearest, and count after that.

I think as soon as you filter by nearest you will once again eliminate the documents that aren't nearest, no? $geoNear is like $match in that it either passes the document through to the next stage or it doesn't, and you want to know in the document if it's near or not, and not to eliminate it, no?

Comment by Vinicius Gualberto [ 19/Nov/18 ]

Hello Daniel,

Let me show you a structure similar to my schema.

Collection "place_tags" with the following schema:

{
  "_id": "5bf32be87151e88548b9050e",
  "name": "Statues"
}

 

Collection "places" with the following schema:

{
  "_id": "5bf32bfa7151e88548b90513",
  "name": "Statue of Liberty",
  "place_tags_ids": [
    "5bf32be87151e88548b9050e"
  ],
  "location": { "type": "Point", "coordinates": [ -74.0445, 40.689469 ] }
}

 

What I need is a result something like:

{
  "_id": "5bf32be87151e88548b9050e",
  "name": "Statues",
  "places_count": 151054
}
...
{
  "_id": "5bf32be87151e88548b9050e",
  "name": "Airports",
  "places_count": 45893
}

 

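I am currently running this aggregation pipeline:

db.places.aggregate([
  {
    // Must be the first stage; keeps only places within 50 km of the point.
    $geoNear: {
      near: { type: 'Point', coordinates: [-74.0445, 40.689469] },
      distanceField: 'dist.calculated',
      spherical: true,
      maxDistance: 50000,
      num: 100
    }
  },
  {
    // Attach the matching tag documents to each place.
    $lookup: {
      from: 'place_tags',
      localField: 'place_tags_ids',
      foreignField: '_id',
      as: 'tags'
    }
  },
  // One document per (place, tag) pair, so each tag is grouped on its own.
  { $unwind: '$tags' },
  {
    // Group by the tag document; a field path needs the '$' prefix.
    $group: {
      _id: '$tags',
      places_count: { $sum: 1 }
    }
  },
  {
    $project: {
      _id: 0,
      id: '$_id._id',
      name: '$_id.name', // field names are case-sensitive
      places_count: '$places_count'
    }
  }
]);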

This gives me a result that is close; I can adjust the rest on the application side (I am using C#):

{
  "Id": "5bf32be87151e88548b9050e",
  "name": "Statues",
  "places_count": 151054
}

But if no places are near enough to satisfy the $geoNear, the query returns no results, or the tags with 0 places under them are not shown, and I need them even with 0 in "places_count".
If $geoNear could be used at any stage in the pipeline, I could $lookup the places, filter by the nearest, and count after that.
I hope I have made it clearer now.
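The difference between the two orderings can be modelled in plain JavaScript, without MongoDB. This is only an illustrative sketch: the sample data is hypothetical, the near flag stands in for the geo test, and countFromPlaces and countFromTags are made-up names.

```javascript
// Hypothetical sample data, shaped like the two collections above.
// The `near` flag is a stand-in for the result of the geo filter.
const tags = [
  { _id: 't1', name: 'Statues' },
  { _id: 't2', name: 'Airports' }
];
const places = [
  { name: 'Statue of Liberty', place_tags_ids: ['t1'], near: true },
  { name: 'Far-away airport', place_tags_ids: ['t2'], near: false }
];

// Places first: the filter runs before the join, so a tag whose places
// are all filtered out never reaches the grouping step.
function countFromPlaces(places, tags) {
  const counts = new Map();
  for (const place of places.filter(p => p.near)) {
    for (const tagId of place.place_tags_ids) {
      counts.set(tagId, (counts.get(tagId) || 0) + 1);
    }
  }
  return tags
    .filter(t => counts.has(t._id))
    .map(t => ({ name: t.name, places_count: counts.get(t._id) }));
}

// Tags first: every tag is kept, and the filter only affects its count.
function countFromTags(places, tags) {
  return tags.map(t => ({
    name: t.name,
    places_count: places.filter(
      p => p.near && p.place_tags_ids.includes(t._id)
    ).length
  }));
}

console.log(countFromPlaces(places, tags));
// [ { name: 'Statues', places_count: 1 } ]
console.log(countFromTags(places, tags));
// [ { name: 'Statues', places_count: 1 }, { name: 'Airports', places_count: 0 } ]
```

The tags-first version is the left-outer-join behaviour described above: "Airports" is reported with a count of 0 rather than being dropped.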

Comment by Danny Hatcher (Inactive) [ 19/Nov/18 ]

Hello Vinicius,

If you provide examples of your document structure and what you are trying to achieve with your aggregations, I can try to help.

Thank you,

Danny
