[SERVER-25865] $group operation is slow since MongoDB 3.2 on Windows Created: 30/Aug/16 Updated: 17/Jan/17 Resolved: 09/Sep/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 3.2.9, 3.3.12 |
| Fix Version/s: | 3.2.12, 3.3.14 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Linda Qin | Assignee: | David Storch |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v3.2 |
| Sprint: | Query 2016-09-19 |
| Description |
|
The $group operation is much slower on MongoDB 3.2/3.3 than on MongoDB 3.0 on Windows. I don't see the issue on OSX or Linux.
From the diagnostic data, there is "cursor open pinned" while the aggregation command is running, but I don't see the same on OSX. Is this the cause of the slowness on Windows? Diagnostic data is attached. We've also tested the same aggregation on MongoDB 3.2 with the MMAP storage engine, and it is also slow, so this issue doesn't seem to be related to the storage engine. Also, if I change the data set from:
To:
The aggregation is faster on the second data set (both on MongoDB 3.2 on Windows):
It seems $group operations are slow when the result set is large, and this is more pronounced on MongoDB 3.2 on Windows. |
| Comments |
| Comment by Githook User [ 17/Jan/17 ] |
|
Author: David Storch (dstorch, david.storch@10gen.com) Message: |
| Comment by Githook User [ 09/Sep/16 ] |
|
Author: David Storch (dstorch, david.storch@10gen.com) Message: |
| Comment by Githook User [ 09/Sep/16 ] |
|
Author: David Storch (dstorch, david.storch@10gen.com) Message: On Windows, these are aliases for boost containers. On |
| Comment by Githook User [ 09/Sep/16 ] |
|
Author: David Storch (dstorch, david.storch@10gen.com) Message: |
| Comment by Bruce Lucas (Inactive) [ 01/Sep/16 ] |
|
Could this be affecting other uses of unordered_map on Windows as well? |
| Comment by David Storch [ 31/Aug/16 ] |
|
I confirmed definitively that the switch to std::unordered_map caused the issue. The issue also continues to affect 3.3.x development versions, despite our upgrade to VS2015. Like the 3.2.x versions reported in this ticket, 3.3.x versions compiled with VS2015 take more than 20 seconds to execute the $group. I applied the patch below to f41c549e10, a recent version of the master branch somewhere between 3.3.12 and 3.3.13:
All this patch does is change std::unordered_map back to boost::unordered_map in DocumentSourceGroup. With this change, performance improves drastically, back to the sub-100ms times we were seeing on 3.0. We suspect the root cause is that the VC implementation of std::unordered_map uses a naive policy for deriving the hash table bucket ID from the hash itself, which leads to a large number of collisions. |
| Comment by John Murphy [ 31/Aug/16 ] |
|
I have been able to confirm that the $group operation does indeed slow down between 3.1.5 and 3.1.6. Specifically, the build with git version 7e6df189868 ran the aforementioned $group aggregation test in 60 ms, while the build with git version 932e768dc26 took 23 seconds. |
| Comment by David Storch [ 30/Aug/16 ] |
|
The query team reviewed this together and came up with a hypothesis: in 3.1.6, commit 932e768dc26 changed the type of the in-memory structure used to construct the groups from boost::unordered_map<> to std::unordered_map<>. Perhaps boost::unordered_map is faster on Windows than std::unordered_map? To confirm, we could test this commit and its parent commit to determine whether this is where the regression was introduced. |
| Comment by Bruce Lucas (Inactive) [ 30/Aug/16 ] |
|
EDIT: the issue on |
| Comment by Bruce Lucas (Inactive) [ 30/Aug/16 ] |
|
Periodic stack trace samples show that during the time the aggregate command is executing, marked by the interval between the blue lines below, it is spending all its time in various places in unordered_map::_Try_emplace.
|