[SERVER-4745] Figuring out which shard to send a query to takes a long time when doing large $in queries on the shard key Created: 23/Jan/12  Updated: 11/Jul/16  Resolved: 10/Feb/12

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 1.8.4, 2.0.2
Fix Version/s: 2.0.3, 2.1.1

Type: Bug Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Spencer Brody (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-4555 check and clean sharding utilization ... Closed
Related
is related to SERVER-4960 Optimize shard selection on $in queries Backlog
Operating System: ALL
Participants:

 Description   

The problem seems to be in ChunkManager::getShardsForQuery in chunk.cpp. Seems to do n^2 comparisons to figure out where to send the queries.



 Comments   
Comment by auto [ 10/Feb/12 ]

Author:

{u'login': u'stbrody', u'email': u'spencer@10gen.com', u'name': u'Spencer T Brody'}

Message: Add unit test for SERVER-4745.
Branch: master
https://github.com/mongodb/mongo/commit/550c8441917b58b5f09367ee28765c5bf6c0ec0a

Comment by auto [ 10/Feb/12 ]

Author:

{u'login': u'stbrody', u'name': u'Spencer T Brody', u'email': u'spencer@10gen.com'}

Message: Calculating which shard(s) to send $in queries to was taking a long time. This fix changes mongos to stop limiting the shards to send to after the first $in clause - possibly sending the query to more shards than necessary, but saving time. SERVER-4745.
Branch: v2.0
https://github.com/mongodb/mongo/commit/dd046106ba1c8de1c1ae8f5830981e18ce3dd597

Comment by auto [ 09/Feb/12 ]

Author:

{u'login': u'stbrody', u'name': u'Spencer T Brody', u'email': u'spencer@10gen.com'}

Message: Calculating which shard(s) to send $in queries to was taking a long time. This fix changes mongos to stop limiting the shards to send to after the first $in clause - possibly sending the query to more shards than necessary, but saving time. SERVER-4745.
Branch: master
https://github.com/mongodb/mongo/commit/010488f6351acd460dcdcaad1919151e6bb50fa2

Comment by Remon van Vliet [ 30/Jan/12 ]

Eliot pointed me to those issue; we're getting "combinatorial limit of $in partitioning of result set exceeded" errors with queries on large sharded collections with a simple (one field) shard key but multiple $in clauses. I do not see a reason for the error I'm getting given my configuration but I'll keep an eye on this regardless. Full reference here : http://groups.google.com/group/mongodb-user/browse_thread/thread/d86f1246c5ed4816/73c522eee341ef98#73c522eee341ef98

Comment by Ben [ 24/Jan/12 ]

Makes sense, the level of effort should probably be reasonable given the number of shards as well? In our case, with only four shards, we expect these large $in queries to hit all of them.

Comment by Spencer Brody (Inactive) [ 24/Jan/12 ]

Proposed solution: if there are multiple $in clauses in components of the shard key, only check all shard key combinations up to (and including) the first field with a $in. So in the above example, you would check all values of "a" in "aList", but you wouldn't take the cartesian product of values in "aList" with the values in "bList". This may result in sending the query to more shards than is necessary, but that will be worth it to avoid calculating the cartesian product of all values in the two (or more) $in lists.

Comment by Spencer Brody (Inactive) [ 24/Jan/12 ]

Seems to be a problem specifically when the shard key is compound and you're doing a query with a $in on multiple parts of the shard key. For example, if the shard key is

{a:1,b:1}

, and you do a query like

db.foo.find({a : {$in : aList}, b : {$in : bList}})

, the mongos takes a very long time to figure out which shard(s) to send that query to. It has to check every combination of the elements of aList and bList, so if aList is of length n and bList is of length m, then it must generate all n*m possible shard keys, then check each of them to find what shard(s) they hit.

Generated at Thu Feb 08 03:06:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.