[SERVER-65949] Sharded time series leads to high query rates on the primary shard with CommandOnShardedViewNotSupportedOnMongod errors Created: 26/Apr/22 Updated: 26/Dec/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 5.0.7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Marnu vd Merwe | Assignee: | Backlog - Query Integration |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | qi-timeseries, read-only-views, sharding, time-series |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Issue Links: |
|
| Assigned Teams: | Query Integration |
| Operating System: | ALL |
| Steps To Reproduce: | My setup:
Reproduction steps:
The attached image is a mongostat output captured by NoSqlBooster 7.1. The left side is a direct connection to the primary of shard1, and the right side is a direct connection to the primary of shard2. |
| Participants: |
| Description |
|
Sharding a time series collection leads to higher write throughput, but reads affect the whole cluster, because the primary shard shows a spike in CPU usage. When reviewing the mongos logs, several entries state that "Resolved views on sharded collections must be executed by mongos". When I stop the read load, these messages are no longer logged. From my research, it seems this may be related to the linked tickets. This is a problem for us because adding a read load affects the whole cluster's performance; our workload runs roughly 25 reads for every 100 writes. I found the problem while load testing my sharded time series prototype on Atlas. |
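For orientation, here is a minimal mongosh sketch of the kind of setup being described. The database, collection, and field names ("poc", "readings", "EntryTime", "Meta", "SensorId") are assumptions drawn from later comments in this ticket, not from the attached test app.

// Illustrative only: a sharded time series collection keyed on a sensor GUID.
sh.enableSharding("poc");
db.createCollection("readings", {
  timeseries: { timeField: "EntryTime", metaField: "Meta" }
});
// The user-facing "readings" namespace is a view over poc.system.buckets.readings,
// which is the collection that is actually sharded. That is why a read that
// reaches a shard directly can fail with CommandOnShardedViewNotSupportedOnMongod
// and has to be re-run through mongos.
sh.shardCollection("poc.readings", { "Meta.SensorId": 1 });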
| Comments |
| Comment by Matt Panton [ 10/Jan/23 ] | |||
|
To resolve this issue and increase performance for views/time series, the sharding catalog must track views, which is an objective of the linked ticket PM-1966 (Track Views in Sharding Catalog). | |||
| Comment by Marnu vd Merwe [ 13/Jun/22 ] | |||
|
Hi Chris, Many thanks! I appreciate the feedback! I originally spotted the issue in Atlas while streaming data from a production workload into my PoC Atlas environment. I was experimenting with time series collections to determine how we should provision a cluster for our current workload (10 M60 shards on MongoDB 4.4 with NVMe storage, using batched, unordered inserts). I stopped scaling my cluster when I noticed that one shard had very high CPU usage, while the others sat at about 50%-75% of the main shard's CPU usage. The attached screenshots show the CPU utilization of one of my setups (I think this was a 4-shard M50 setup on MongoDB 5.0.7), with a workload quite close to the attached test app. During this test, I had an index on SensorId_1_EntryTime_1 (the meta field only had the SensorId inside). I hope this gives the team a bit more to work with. Kind regards, Marnu | |||
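A minimal mongosh sketch of the index mentioned above, assuming the collection is called "readings" and the meta field is a document named "Meta" that only holds the SensorId:

// Assumed names: collection "readings", metaField "Meta" (holding only SensorId),
// timeField "EntryTime". The compound secondary index is declared on the
// user-facing time series namespace.
// (If the meta field is the SensorId value itself, the key would instead be
// { SensorId: 1, EntryTime: 1 }, matching the index name quoted above.)
db.readings.createIndex({ "Meta.SensorId": 1, "EntryTime": 1 });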
| Comment by Chris Kelly [ 13/Jun/22 ] | |||
|
Hi Marnu, Thanks for your patience, and for providing extra context around your report. It seems this issue may be replicating known behaviour: it is possible in this case that these views trigger the known extra round-trip issue in SERVER-42282, which is currently in the backlog. As for there being more queries (in general) on the primary shard of the collection, we may need to look a little more. I see the same number of documents scanned on each shard, even though the total number of queries sent to the primary shard is much higher. CPU usage seems pretty balanced across your two shards and the mongos, so I'm not 100% sure what specific behavior you're experiencing when it comes to load; there is a minor increase on shard 2, but it's hard to say whether this is significant since overall CPU usage is low. I'll forward this to the sharding team and see if they have any input on this. Regards, | |||
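As a hedged illustration of the view resolution being discussed (collection name assumed), listCollections shows that a time series namespace is a view over a system.buckets collection, and the buckets collection is what is actually sharded:

// Assumed collection name "readings".
db.getCollectionInfos({ name: "readings" });                 // type: "timeseries" (a view)
db.getCollectionInfos({ name: "system.buckets.readings" });  // the sharded bucket collection
// A find sent straight to a shard resolves the view locally, notices that the
// underlying buckets collection is sharded, and fails with
// CommandOnShardedViewNotSupportedOnMongod, after which the query is retried
// through mongos, which is the extra round trip referenced above.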
| Comment by Marnu vd Merwe [ 03/Jun/22 ] | |||
|
Hi Chris, Thanks for the help thus far. Before we close this ticket, what were your findings regarding reads that stack up on the primary shard? Also, I want to ask that you please consider analyzing my latest diagnostic and log report. I do hear what you are saying about shard keys, but I'm confident in my shard key selection: a non-changing GUID that does not monotonically increase and is very selective, ensuring that one GUID only ever targets one shard. My first report that you analyzed had a problem in its queries, but I fixed this in the second one. Kind regards, Marnu | |||
| Comment by Chris Kelly [ 01/Jun/22 ] | |||
|
Hey Marnu, Thanks for the extra info. Given what we know currently, I'd like to recommend asking our community for help first by posting on the MongoDB Developer Community Forums. That would be a more appropriate area to check out your query and shard key specifically, which should get more eyes on it. If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it further as a possible bug here in the SERVER project. I'll close this ticket for now, but if you get more information on this after checking out the forums, feel free to open another and reference this one so we can get back to it. Regards, | |||
| Comment by Marnu vd Merwe [ 01/Jun/22 ] | |||
|
Hi Chris, Can you help review our code, please? I see you have a keen eye for finding the errors in my queries. After your mention of monotonically increasing shard keys, and after you pointed out that my queries were not equally distributed, I reviewed my code and found that my queries targeted a specific shard with every batch. For attempt 2, I changed the following:
My latest attachment, data - attempt2, includes the output of:
With the latest application, it is easier to see the distributed load being generated and then to observe the high combined read load. Something I saw again this time around is that the primary shard for the collection shows a high number of "query" ops, and that both shards show "command" ops which, when combined, equal the number of "query" ops. My assumption is that every find query hits the primary shard as a "get" op, which is then redirected to the correct shard, where the shard performs a "command" op to return the data. Regarding a different shard key: my key is a random GUID, and I have no other key to shard on that would not lead to monotonically increasing keys. My document contains:
Do you have a recommendation on a new shard key? This document mimics our production setup where we link sensors to GUIDs. Let me know if you need anything else. Marnu | |||
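A minimal sketch, assuming the field names mentioned in the comments above (a GUID SensorId inside the meta field and an EntryTime time field); the actual document from the attachment is not reproduced here:

// Hypothetical document shape and the kind of targeted read described above.
db.readings.insertOne({
  EntryTime: new Date(),
  Meta: { SensorId: UUID() },   // random, non-monotonic GUID
  Value: 21.5                   // placeholder measurement field
});

// Filtering on the full GUID lets mongos target a single shard, but the
// time series view still has to be resolved through mongos first.
db.readings.find({
  "Meta.SensorId": UUID("11111111-2222-3333-4444-555555555555"), // placeholder GUID
  EntryTime: { $gte: ISODate("2022-06-01T00:00:00Z") }
});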
| Comment by Chris Kelly [ 01/Jun/22 ] | |||
|
Hi Marnu, I'm going to look into this further, but I just wanted to share a resource that might help with optimizing GUID-based sharding by pre-splitting data. Comparing the two shards in my reproduction, shard 2 shows lots of query load for me, but it looks like data is being inserted into both shards; despite this, most of the queries still seem to land on shard 2. I also saw related entries in the logs preceding the "Resolved views on sharded collections must be executed by mongos" error.
I'm not sure what to make of this just yet, but I wanted to keep you in the loop. Could you share the output of running "sh.status()" on your mongos after your test suite completes? Regards, | |||
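A hedged sketch of the pre-splitting idea mentioned above; the namespace and split point are illustrative, and for a sharded time series collection the chunks live on the system.buckets namespace, where the meta field is stored as "meta":

// Illustrative only: split the GUID key space at a mid-range UUID so that each
// shard owns a chunk before the load test starts, then check chunk placement.
sh.splitAt("poc.system.buckets.readings",
           { "meta.SensorId": UUID("80000000-0000-0000-0000-000000000000") });
sh.status();   // the output requested above: chunk ranges and their owning shards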
| Comment by Chris Kelly [ 31/May/22 ] | |||
|
Just to add some more information, I think this is potentially normal behavior if you're using a monotonically increasing shard key, since a shard key on a value that increases or decreases monotonically is more likely to distribute inserts to a single chunk within the cluster. If you are reading that data quickly after it's inserted, before the chunk is split and migrated to the other shard, it would make sense that you're reading from this one large chunk on only one shard before it has time to be split. In your case, I think it would be worth consulting some information in the docs, specifically with:
I'll also take a look at your submitted logs. Christopher | |||
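A generic illustration of the point about monotonically increasing keys, using plain (non-time series) collections with assumed names:

// With an ever-increasing key, every new document sorts above the existing
// chunk boundaries, so all inserts (and reads of recent data) land on the
// single shard that owns the chunk ending at MaxKey.
sh.enableSharding("demo");
sh.shardCollection("demo.events_by_time", { createdAt: 1 });

// A key with no natural ordering (for example a random GUID) spreads new
// documents across chunks, and therefore across shards, once the collection
// has been split into multiple chunks.
sh.shardCollection("demo.events_by_device", { deviceId: 1 });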
| Comment by Marnu vd Merwe [ 31/May/22 ] | |||
|
Hi Chris, Thanks for the help! Please see attached the logs and diagnostic data, as requested, where sh1 and sh2 are the primaries of each shard. The test was run on MongoDB 5.0.7. Also included is the .NET 6.0 C# project I used to generate read/write load. I hope this helps! Marnu | |||
| Comment by Chris Kelly [ 26/May/22 ] | |||
|
Hello Marnu, Thanks for your patience on this. I just wanted to let you know that we're taking a look at it. Currently, I can see the following when I try to reproduce the behavior:
It'd be helpful if you could provide a little more data to look at. Specifically, please archive (tar or zip) and upload the following to the ticket: the complete mongod log files and the $dbpath/diagnostic.data directory from each shard's primary.
Regards, |