[SERVER-20336] O(N^2) perf regression in listCollections and similar code paths [BLOCKING Mongo 3.0 Adoption] Created: 09/Sep/15 Updated: 23/Oct/15 Resolved: 09/Sep/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 3.0.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker - P1 |
| Reporter: | Michael Lehenbauer | Assignee: | Ramon Fernandez Marina |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Issue Links: |
|
| Operating System: | ALL | |||||||||||||
| Steps To Reproduce: | Run the following script using a mongo3 shell against a mongo3 mongod. I've reproduced against 3.0.4, 3.0.5, and 3.0.6.
You should see a time of ~15-30 seconds for the getCollectionNames() call, and it gets much worse as N increases, since it is O(N^2). If you run the same script on mongo 2.6, it completes in under a second, even for large values of N. You'll also see a hang if you restart your mongod server or run a mongodump, and probably during many other operations. While db.getCollectionNames() is in progress, all writes are blocked. |
| Participants: | ||||||||||||||
| Description |
|
Mongo 3 has an O(N^2) performance issue, where N is the number of collections in a database. This is a regression from 2.x. For our dataset it causes a ~15-minute hang, making Mongo 3 completely unusable. The hang can be hit in many ways, including calling db.getCollectionNames(), restarting the mongod server, and running mongodump.
The O(N^2) behavior is clearly visible in the chart of measured time to perform a db.getCollectionNames() for a given number of collections. There is also an attached graph showing a quadratic best fit.
Context:
Can you please acknowledge this bug and provide an estimate for when it can be fixed and released? |
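The scaling claim in the description can be checked with a small timing harness. The sketch below is not the reporter's script. It is a minimal illustration only, and the helper names `timeGetCollectionNames` and `growthExponent`, as well as the `coll_` name prefix, are hypothetical. The only mongo shell calls it relies on are `db.createCollection()` and `db.getCollectionNames()`.

```javascript
// Hypothetical helper (an assumption, not taken from the ticket).
// `db` is any object exposing createCollection(name) and
// getCollectionNames(), as the mongo shell's `db` handle does.
function timeGetCollectionNames(db, n) {
  // Create n empty collections so the catalog has n entries to list.
  for (var i = 0; i < n; i++) {
    db.createCollection("coll_" + i);
  }
  // Time a single listing of all collection names.
  var start = Date.now();
  var names = db.getCollectionNames();
  var elapsedMs = Date.now() - start;
  return { collections: names.length, elapsedMs: elapsedMs };
}

// Estimate the exponent k in t ~ N^k from two (N, t) measurements.
// A value near 2 is consistent with quadratic growth, near 1 with linear.
function growthExponent(n1, t1, n2, t2) {
  return Math.log(t2 / t1) / Math.log(n2 / n1);
}
```

In a mongo shell this could be run against a scratch database, for example `timeGetCollectionNames(db.getSiblingDB("perf_test"), 5000)`, where both the database name and the collection count are placeholders. Repeating the measurement at two different values of N on fresh databases and passing the results to `growthExponent` gives a rough indication of whether the listing time grows quadratically.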
| Comments |
| Comment by Ramon Fernandez Marina [ 23/Oct/15 ] |
|
Thanks for reporting back, katfang. Glad to hear you're no longer seeing listCollections performance issues on MMAPv1 after
Regards, |
| Comment by Katherine Fang [ 21/Oct/15 ] |
|
Hi Ramon, Just following up. We've tested out list collections with 3.0.7 and it seems much faster. Thanks again for the fix. |
| Comment by Michael Lehenbauer [ 08/Oct/15 ] |
|
Thanks Ramon! I haven't gotten a chance to take a look (bit swamped at the moment), but will let you know once I do. Thanks for getting the fix through! -Michael |
| Comment by Ramon Fernandez Marina [ 06/Oct/15 ] |
|
mikelehen@google.com, this is to let you know that we've released release candidate 3.0.7-rc0 today, which includes a fix for this issue (see
Thanks, |
| Comment by Michael Lehenbauer [ 11/Sep/15 ] |
|
Thanks Ramon! That will do nicely for us. Thanks for following up, |
| Comment by Ramon Fernandez Marina [ 10/Sep/15 ] |
|
mikelehen, after internal discussion we've scheduled
If I understand correctly, this issue only affects the MMAPv1 storage engine, so one option you may consider is switching to the WiredTiger storage engine, which offers data compression among other features; this may be of interest for multi-tenant users.
Regards, |
| Comment by Michael Lehenbauer [ 09/Sep/15 ] |
|
Thanks ramon.fernandez. That bug seems to be 4 months old and currently unassigned, yet this is a blocking issue for us. Can you clarify the timeline for which we could expect to see this fixed in the 3.0 branch? |
| Comment by Ramon Fernandez Marina [ 09/Sep/15 ] |
|
Thanks for the additional information, mikelehen. We're aware of the behavior you describe and I'm going to mark this ticket as a duplicate of
Regards, |
| Comment by Michael Lehenbauer [ 09/Sep/15 ] |
|
We're using mmap. Sorry for the omission. |
| Comment by Ramon Fernandez Marina [ 09/Sep/15 ] |
|
mikelehen, what storage engine are you using in 3.0? |