[SERVER-35693] Parsing of $in takes quadratic time due to O(n^2) boost::flat_set constructor Created: 19/Jun/18  Updated: 29/Oct/23  Resolved: 21/Jun/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.2.19, 3.4.9, 3.6.5, 4.0.0-rc6
Fix Version/s: 3.2.21, 3.4.16, 3.6.6, 4.0.1, 4.1.1

Type: Bug Priority: Critical - P2
Reporter: David Storch Assignee: David Storch
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
Related
related to SERVER-35699 Avoid holding locks during query parsing Backlog
related to SERVER-34012 Planner's logic for taking union of i... Closed
related to SERVER-35712 boost::flat_set constructor should be... Closed
related to SERVER-37176 IndexBoundsBuilder::unionize() can be... Closed
is related to SERVER-30189 Reduce calls to allocator for large $... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6, v3.4, v3.2
Sprint: Query 2018-07-02
Participants:
Case:
Linked BF Score: 0

 Description   

As of the fix for SERVER-30189, the $in parsing code makes use of a boost::flat_set in order to avoid the many small allocations required for a tree-based sorted set. This, however, results in O(n^2) worst-case time complexity for parsing an $in, since the boost::flat_set constructor is known to be quadratic:

https://svn.boost.org/trac10/ticket/13140

For $in predicates with many elements, this causes a substantial performance regression. It is observable in that a large $in where the elements are pre-sorted outperforms a large $in where the elements are random by orders of magnitude. (When the $in elements are pre-sorted, the boost::flat_set constructor always inserts at the end of its vector, resulting in O(n log n) runtime.)

We should either patch boost to improve the boost::flat_set constructor's performance, or work around the issue by pre-sorting the $in elements prior to invoking the constructor.



 Comments   
Comment by Githook User [ 29/Jun/18 ]

Author:

{'username': 'dstorch', 'name': 'David Storch', 'email': 'david.storch@10gen.com'}

Message: SERVER-35693 Pre-sort $in elements to avoid O(n^2) boost::flat_set() complexity.

(cherry picked from commit a6eae4759b792f86284006d475b00559fe96953c)
Branch: v4.0
https://github.com/mongodb/mongo/commit/ad4bc1393facc1988ad86d021e47f9b2f0928a15

Comment by Githook User [ 28/Jun/18 ]

Author:

{'username': 'dstorch', 'name': 'David Storch', 'email': 'david.storch@10gen.com'}

Message: SERVER-35693 Pre-sort $in elements to avoid O(n^2) boost::flat_set() complexity.
Branch: v3.2
https://github.com/mongodb/mongo/commit/ed5dccdc3ffe74a869c04ee31c61e61da0b0d0c3

Comment by Githook User [ 22/Jun/18 ]

Author:

{'username': 'dstorch', 'name': 'David Storch', 'email': 'david.storch@10gen.com'}

Message: SERVER-35693 Pre-sort $in elements to avoid O(n^2) boost::flat_set() complexity.

(cherry picked from commit 50c6a174e67cdef97b1877b5ce8916dc55725fd5)
Branch: v3.4
https://github.com/mongodb/mongo/commit/c50dd604f81a7464389ffa1ac5c7ba8f0b471d24

Comment by Githook User [ 22/Jun/18 ]

Author:

{'username': 'dstorch', 'name': 'David Storch', 'email': 'david.storch@10gen.com'}

Message: SERVER-35693 Pre-sort $in elements to avoid O(n^2) boost::flat_set() complexity.

(cherry picked from commit a6eae4759b792f86284006d475b00559fe96953c)
Branch: v3.6
https://github.com/mongodb/mongo/commit/50c6a174e67cdef97b1877b5ce8916dc55725fd5

Comment by Githook User [ 21/Jun/18 ]

Author:

{'username': 'dstorch', 'name': 'David Storch', 'email': 'david.storch@10gen.com'}

Message: SERVER-35693 Pre-sort $in elements to avoid O(n^2) boost::flat_set() complexity.
Branch: master
https://github.com/mongodb/mongo/commit/a6eae4759b792f86284006d475b00559fe96953c

Comment by David Storch [ 20/Jun/18 ]

The proposed fix for this ticket works around the issue in the query processing layer of the MongoDB server. In addition, we may wish to pursue a fix in the boost library itself. I have filed SERVER-35712 to track this additional work.

Generated at Thu Feb 08 04:40:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.