[SERVER-64021] Lookup stage poor performance Created: 26/Feb/22 Updated: 17/Mar/22 Resolved: 17/Mar/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework, Performance |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Labros Papadopoulos | Assignee: | Edwin Zhou |
| Resolution: | Duplicate | Votes: | 5 |
| Labels: | Aggregation, Lookup, Performance |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Windows 10 |
| Issue Links: | |
| Operating System: | ALL |
| Steps To Reproduce: | Create a collection with 100K documents. Create a second collection with any number of documents. Join the second collection on the first collection. |
| Participants: | |
| Description |
|
We switched from a SQL database to MongoDB for our latest cloud application, and the performance benefits we received were spectacular. We also followed the MongoDB guidelines and used document embedding when possible. Currently none of our standard CRUD operations require any form of join, and all of them execute incredibly fast (our data sets range from 100K to 2M documents).
However, to generate certain reports we need to perform joins between collections. The MongoDB C# driver that we are using translates the LINQ join query to a lookup stage. The lookup stage searches the joined collection once for every entry of the initial collection. We knew that "join" performance was lacking on MongoDB and should be avoided, but requiring 20 seconds to join a collection of 200K documents with a collection of 100K documents was totally unexpected.
This is a known issue that the MongoDB team acknowledged 7 years ago, when two solutions were proposed: SERVER-21312 ($lookup should batch query requests) and SERVER-21284 ($lookup should cache query results).
We would like to know whether you are planning to resolve this issue or whether it cannot be resolved (so that we can try to resolve it using external methods). Given that this issue has such a devastating performance impact and has not been resolved for almost 7 years, we are led to believe that it's the latter case.
Thanks in advance! |
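The cost pattern described in this report (one search of the joined collection per document of the initial collection) can be illustrated with a minimal in-memory sketch. This is a simplified model in plain Python, not MongoDB's actual implementation and not the design proposed in SERVER-21312 or SERVER-21284. The function names and the field names in the usage note are hypothetical, and the sketch assumes the join values are hashable.

```python
from collections import defaultdict


def lookup_per_document(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Naive join: scan the whole foreign collection once per local document.

    Cost grows with len(local_docs) * len(foreign_docs).
    """
    joined = []
    for doc in local_docs:
        key = doc.get(local_field)
        matches = [f for f in foreign_docs if f.get(foreign_field) == key]
        joined.append({**doc, as_field: matches})
    return joined


def lookup_batched(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Batched join: read the foreign collection once, then probe a dict.

    Cost grows with len(local_docs) + len(foreign_docs).
    Assumes the join values are hashable (an assumption of this sketch).
    """
    index = defaultdict(list)
    for f in foreign_docs:
        index[f.get(foreign_field)].append(f)
    return [
        {**doc, as_field: list(index.get(doc.get(local_field), []))}
        for doc in local_docs
    ]
```

Both functions return each local document with an extra array field holding its matches, which mirrors the output shape of a lookup stage. For example, calling either one with a list of order documents carrying a hypothetical `customer_id` field and a list of customer documents keyed by `_id` gives the same result; only the amount of work differs.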
| Comments |
| Comment by Edwin Zhou [ 17/Mar/22 ] |
|
Thank you for your report. I understand that the performance impact of $lookup can be devastating. As you mentioned, we currently are tracking improvements to $lookup through SERVER-21312 and SERVER-21284. The best place to track these improvements will be on those tickets and I encourage you to watch and vote on those tickets. I will close this ticket as a duplicate of SERVER-21312 and SERVER-21284. If you need further assistance troubleshooting your performance issues, I encourage you to ask our community for help by posting on the MongoDB Developer Community Forums. Best, |
| Comment by Dimitris Kalantzis [ 01/Mar/22 ] |
|
We had similar performance issues on a cloud project as well. We had to use a different database for those kinds of procedures. We would love to go full Mongo. Watching this. |