[SERVER-34427] Aggregation Query With $in for large Datasets Index not considered via spark Created: 12/Apr/18  Updated: 23/Jul/18  Resolved: 21/Jun/18

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Badrinarayanan P Assignee: Kelsey Schubert
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

When a large number of arguments (100K to 200K) is passed to $in in the $match stage of an aggregation pipeline, the index is not used. When the same query is run with the number of arguments reduced to 1000, the index is used.
We are running Spark 2.2 against MongoDB 3.4.9. The code is written in Scala and executed with the spark-submit command.



 Comments   
Comment by Kelsey Schubert [ 21/Jun/18 ]

Hi badrinarayanan_p@infosys.com,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Kelsey

Comment by Ramon Fernandez Marina [ 06/May/18 ]

Apologies for the long delay in getting back to you, badrinarayanan_p@infosys.com. We're not aware of the query planner choosing a different query plan depending on the number of elements in a $in clause, so can you please provide:

  • Output of the queries run with explain, so we can see what the query planner is doing with a small and a large number of clauses
  • The log files (preferably at log level 1) covering the time you ran those queries

Thanks,
Ramón.
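
The diagnostics requested in the comment above could be gathered roughly as follows. This is a minimal sketch, not the reporter's code: the collection name "events" and field name "userId" are hypothetical, the helper functions are illustrative, and the exact shape of the explain output varies between server versions. The sketch only builds the command documents and inspects an explain result; it does not connect to a server.

```python
from typing import Any, Dict, Iterable, List


def build_explain_command(collection: str, field: str,
                          values: Iterable[Any]) -> Dict[str, Any]:
    """Build an aggregate command that explains a $match/$in pipeline."""
    pipeline = [{"$match": {field: {"$in": list(values)}}}]
    # The command name must be the first key; dicts keep insertion order.
    return {"aggregate": collection, "pipeline": pipeline, "explain": True}


def build_log_level_command(level: int = 1) -> Dict[str, Any]:
    """Build the setParameter command that raises the server log level."""
    return {"setParameter": 1, "logLevel": level}


def plan_stages(node: Any) -> List[str]:
    """Collect "stage" names from an explain document, skipping rejected plans."""
    found: List[str] = []
    if isinstance(node, dict):
        if isinstance(node.get("stage"), str):
            found.append(node["stage"])
        for key, value in node.items():
            if key == "rejectedPlans":
                continue  # only the chosen plan matters here
            found.extend(plan_stages(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(plan_stages(item))
    return found


def uses_index(explain_output: Any) -> bool:
    """Return True if the chosen plan contains an index scan (IXSCAN)."""
    return "IXSCAN" in plan_stages(explain_output)


if __name__ == "__main__":
    # Hypothetical sizes mirroring the report: a small and a large $in list.
    small = build_explain_command("events", "userId", range(1000))
    large = build_explain_command("events", "userId", range(100000))
    print(len(small["pipeline"][0]["$match"]["userId"]["$in"]))
    print(len(large["pipeline"][0]["$match"]["userId"]["$in"]))
```

With a driver such as PyMongo, each command document could be sent with db.command(...) (the setParameter command against the admin database), and uses_index(...) applied to the two explain results to compare the small and large cases.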

Generated at Thu Feb 08 04:36:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.