[SERVER-5715] Combinatorial limit of $in affects advanced querying Created: 26/Apr/12  Updated: 06/Dec/22  Resolved: 16/Aug/19

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.0.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Peter Lundberg Assignee: Backlog - Query Team (Inactive)
Resolution: Done Votes: 3
Labels: query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query
Operating System: ALL
Participants:

 Description   

We have a case of searching for changed content as simplified in the example below. If however a 'power user' chooses many groups/contexts to participate in and has advanced filtering on prefered tags we get "Combinatorial limit of $in partitioning of result set exceeded". Could this not be optimized differently or at least have a fallback stratagy (filtering the result of the index/shard lookup for the least selective of the $in conditions? Or do you have some other suggestions?

Perhaps related to SERVER-4778 as in this case we get just over a 1M and perhaps a higher limit is more suitable.

Example query that fails with

{code : 13385}

. This is simplified and does not show limit and ordering. refered fields are ensured index in the correct order to normally work fast. We do not shard. The application has CMS type data and the search applies the user's graned roles, subscribed contexts and tags:

db.content.find( { grant_read_ids:

{ '$in': [ "ALL", "1", "118189", "7", "161990", "162060", "162068", "166651", "166917", "161988", "167482", "167493", "167734", "166946", "167253", "167979", "167696", "167981", "167255", "168647", "168648", "166841", "168835", "168837", "170147", "170434", "169597", "170805", "171277", "170485", "171631", "171556", "166942", "176623", "171320", "180642", "181904", "181943", "182273", "182777", "182937", "184197", "184196", "184547", "159112", "155276", "161246", "157318", "160939", "153486", "157272", "166267", "158810", "155087", "154227", "160254", "156399", "168140", "155647", "158005", "161454", "156569", "159250", "153454", "157032", "153295", "154226", "157098", "155570", "154933", "160553", "158639", "154222", "159752", "159162", "155667", "161232", "157209", "158637", "159766", "158531", "154378", "154574", "167851" ] }

,
context_ids:

{ '$in': [ 2909, 155854, 809, 2901, 155860, 2249, 830, 11, 110, 993, 156565, 156791, 156736, 157150, 156819, 155861, 2833, 156289, 156937, 155830, 157094, 158201, 2247, 158374, 158375, 159148, 156715, 159469, 159659, 155829, 159849, 159901, 94, 66, 159939, 157092, 159629, 159116, 157179, 160258, 64, 160696, 160697, 159988, 160773, 161046, 160619, 160969, 156815, 162921, 177, 159156, 2484, 159824, 465, 156814, 165927, 157004, 160739, 156071, 169918, 160340, 160385, 26, 1158, 171155, 2356, 171193, 171195, 2430, 2770, 170027, 171702, 171516, 172018, 159376, 172177, 155828, 753, 169683, 166, 1621, 173426, 173201, 173722, 652, 2139, 2501, 2216, 2252, 2513, 1116, 2807, 1557, 1249, 2147, 1798, 1460, 2467, 615, 1723, 2318, 2473 ] }

,
last_activity_time:

{ '$exists': true }

,
tag_ids:

{ '$in': [ "ALL", "27", "8", "5", "7", "26", "5201", "5201", "4240", "4241", "4234", "4235", "4236", "4238", "4239", "4774", "4775", "4776", "4777", "4778", "4779", "4780", "295", "293", "294", "709", "5338", "301", "1179", "541", "2640", "1897", "5391", "3595", "5421", "5422", "5450", "5451", "5623", "5624", "543", "8673", "5763", "5765", "5869", "6252", "6870", "8869", "6922", "8532", "8533", "7124", "2703", "6320", "5768", "8799", "8872", "5965", "6750", "5506", "6750", "8918", "8919", "8920", "8439", "1488", "6957", "1", "5582", "6519", "6519", "8789", "8080", "6412", "7873", "4726", "2309", "1346", "5698", "5827", "6005", "7805", "7923", "8896", "9261", "9293", "1302", "9261", "9293", "7152", "7788", "7536", "7139", "8861", "9269", "9207", "9269", "7624", "9282", "7633", "9186", "5761", "8803", "6549", "9195", "7793", "9290", "9291", "100", "9233", "9230", "7789", "9195", "449", "7391", "6472", "6813", "8677", "8796", "8988", "9123", "4420", "5452", "6344", "9462", "6485", "8105", "6546", "8805", "6545", "8805", "8901", "6670", "5984", "8784", "7997", "8988", "8043", "8938", "7532", "8938", "7993", "8073", "7697", "7369", "1967", "7726", "8961", "8961", "9050", "10068", "10355", "11177", "11203", "11204", "11251", "11248", "11249", "11250", "11253", "11254" ] }

} ).count()



 Comments   
Comment by David Storch [ 16/Aug/19 ]

In recent versions (at least versions since 2.6, to my knowledge), $in queries will not fail due to reaching a combinatorial limit. I believe this issue has been addressed by refactors to the query planning code. Closing as "Gone Away".

Comment by siddharth.singh@10gen.com [ 14/Dec/12 ]

Hi George,

This ticket is not scheduled with a release yet. Unfortunately, I cannot predict when this might get tied to a release either. We take a number of factors including ,but not limited to, the number of up votes and the community feedback to prioritize on server tickets. I suggest that you up vote on the ticket and add yourself to the watch list to receive notifications when this ticket gets scheduled.

Thanks.

Comment by George P. [ 14/Dec/12 ]

Hi Siddharth, is there any update on this?

Comment by Peter Lundberg [ 28/Apr/12 ]

Ok, a workaround if others get into this problem is to use a regex. Note this affects performance quite a bit. Eg

 
if len(tag_ids) > 20:
    spec['tag_ids'] = {'$regex': '(%s)' % '|'.join(tag_ids) }
else:
    spec['tag_ids'] = {'$in': tag_ids }

Comment by siddharth.singh@10gen.com [ 26/Apr/12 ]

We are looking into options around suitably increasing these limits and other optimizations/fallback strategy but its not scheduled yet.

Generated at Thu Feb 08 03:09:41 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.