[SERVER-44032] Mongodb Count is slow Created: 15/Oct/19 Updated: 06/Dec/19 Resolved: 06/Dec/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.2.0 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | virinchi samineni | Assignee: | Eric Sedor |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Participants: | |||||||||
| Description |
|
Hi, Is this behavior expected or there is something I can do to improve this. Please find details below- There is collection with 6 million records consider each document to be something like this { "_id": "523453252", "type": "test" }i have a single field index on "type" when I execute following command the number of keys examined are ~5.9 million |
| Comments |
| Comment by Eric Sedor [ 06/Dec/19 ] | |
|
Hi, We haven’t heard back from you for some time, so I’m going to close this issue as a duplicate of SERVER-17266. If this is still an issue for you, please provide additional information and we will reopen the ticket. Sincerely, | |
| Comment by Eric Sedor [ 07/Nov/19 ] | |
|
Hi virinchisamineni@gmail.com, Are you are able to provide an exact command and explain() results via the MongoDB shell? We would still need that information to diagnose this issue as something other than SERVER-17266 Gratefully, | |
| Comment by Eric Sedor [ 25/Oct/19 ] | |
|
Just to keep in mind, we are still continuing with the assumption that this is a duplicate of SERVER-17266, and our workaround suggestion continues to be to issue separate queries for each $in clause. If you would like to continue trying to confirm this fully, we would need to see the explain plan generated in the mongodb shell. Please C&P both the command and the output. Screenshots make it very difficult for us to communicate about specific details. Gratefully, | |
| Comment by virinchi samineni [ 24/Oct/19 ] | |
|
hi @eric.sedor I don't see COUNT_STAGE now either
| |
| Comment by Eric Sedor [ 24/Oct/19 ] | |
|
Thanks virinchisamineni@gmail.com and sorry for the back and forth here. I do see .count() is included in what you ran but it really seems like the count portion of this operation is not reaching explain(). Can you try this form please, and paste the explain results?
| |
| Comment by virinchi samineni [ 24/Oct/19 ] | |
|
Hi @eric.sedor Yes "applicationType" is equivalent to "type" provided in description. I just wanted to convey the problem i was facing. here is screen shot for command I ran.
| |
| Comment by Eric Sedor [ 23/Oct/19 ] | |
|
virinchisamineni@gmail.com: this query looks to be on applicationType instead of type as initially provided in the description, and I'm a bit surprised there is no COUNT stage at all. Can you please provide the exact command you run to produce that explain output? It will help us dispel confusion while interpreting the explain plan. Thanks! | |
| Comment by virinchi samineni [ 23/Oct/19 ] | |
|
Hi @eric.sedor }, , , }, , , }, , , }, , , } , , , }, | |
| Comment by Eric Sedor [ 22/Oct/19 ] | |
|
Hi virinchisamineni@gmail.com, To be clear: Yes, the delay is due to the current implementation of count. And yes, the current workaround for this is to issue separate queries. This does impose the network cost you as you say, but depending on the size of the collection, that could still be faster than issuing one query of $ins, and less disruptive of the working set. That means other operations won't be as impacted by the count operations either, so there could be indirect benefits for other ops. I believe Asya is correct that this is a duplicate of SERVER-17266. But for completeness, can you please provide the output of:
Gratefully, | |
| Comment by Asya Kamsky [ 21/Oct/19 ] | |
|
This appears to be duplicate of SERVER-17266 (and a few related tickets). The issue is that faster COUNT_SCAN can only be used if the index is not multiKey. I've reopened the other ticket and scheduled it for triage, this one should be marked as a duplicate of it. | |
| Comment by virinchi samineni [ 18/Oct/19 ] | |
|
Hi Eric, Thanks for the update. Are you suggesting that rather than using $in operator If that's the case won't I be facing network latency as i am issuing 3 separate queries as opposed to one. I want to understand if there is a possibility of reducing 5 sec to 1~2 sec | |
| Comment by Eric Sedor [ 17/Oct/19 ] | |
|
Hi virinchisamineni@gmail.com, It looks like you are experiencing known behavior with how count is currently able to use indexes. For this specific operation, you should be able to issue queries for each type separately (rather than using $in. Each individual count request should be able to use only the index because each operation would access a "contiguous" range of index keys. That said, I am looking into whether or not a jira ticket exists to change this behavior, and will update you. Thanks in advance for your patience. Sincerely, | |
| Comment by virinchi samineni [ 17/Oct/19 ] | |
|
Hi , |