[SERVER-47222] Mongos high cpu usage on getShardIdsForRange while dealing shard key range query Created: 01/Apr/20  Updated: 14/Apr/20  Resolved: 10/Apr/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.2.5, 4.0.17
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Cen Zheng Assignee: Blake Oler
Resolution: Duplicate Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-46487 The mongos routing for scatter/gather... Closed
is duplicated by SERVER-47221 Mongos high cpu Closed
Operating System: ALL
Sprint: Sharding 2020-04-20
Participants:
Case:

 Description   

Hi,

After upgrading mongo from 3.4 to 4.0 (through 3.6), we found that the mongos CPU usage has risen several-fold. After some investigation, we found that most of the CPU cost comes from ShardId string comparisons inside the getShardIdsForRange() call, which happen while each ShardId is inserted into the result set.

In our case, the user's query range on the shard key was [MinKey, MaxKey] (e.g., a range query on a hashed shard key), and the collection's routing table (chunk map) was very large (about 100k chunks). Each such query therefore performs tens of thousands of insertions into the ShardId result set, which costs a lot of CPU. In 3.4 there was a ChunkRangeMap that maintained each shard's chunk ranges, which could optimize this procedure. I noticed that SERVER-33929 removed this ChunkRangeMap for some reason, which introduced this performance regression.

To resolve this issue, I think we can add a fast path to getShardIdsForRange(): when the range (across all shard key fields) is [MinKey, MaxKey], we only need to return all ShardIds through getAllShardIds(). Looking forward to your feedback. Thanks!
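The proposed fast path can be illustrated with a minimal sketch. This is a simplified Python model, not the server's C++ code: the RoutingTable class, its list-based chunk map, and the float infinities standing in for MinKey and MaxKey are all assumptions made for illustration. Only the names getShardIdsForRange() and getAllShardIds() come from the report above.

```python
from bisect import bisect_right

# Hypothetical stand-ins for the MinKey / MaxKey sentinels.
MIN_KEY = float("-inf")
MAX_KEY = float("inf")


class RoutingTable:
    """Toy chunk map: contiguous chunks, each owning [previous bound, upper bound)."""

    def __init__(self, chunks):
        # chunks: list of (exclusive_upper_bound, shard_id) sorted by bound.
        # The last bound is assumed to be MAX_KEY so the key space is covered.
        self._upper_bounds = [bound for bound, _ in chunks]
        self._shards = [shard for _, shard in chunks]
        # Computed once per routing table, not once per query.
        self._all_shard_ids = frozenset(self._shards)

    def get_all_shard_ids(self):
        return set(self._all_shard_ids)

    def get_shard_ids_for_range(self, lo, hi):
        # Fast path: a [MinKey, MaxKey] range touches every chunk, so the
        # answer is simply every shard that owns at least one chunk.
        if lo == MIN_KEY and hi == MAX_KEY:
            return self.get_all_shard_ids()

        # Slow path: one set insertion per chunk overlapping [lo, hi].
        shard_ids = set()
        i = bisect_right(self._upper_bounds, lo)  # first chunk containing lo
        while i < len(self._shards):
            shard_ids.add(self._shards[i])
            if self._upper_bounds[i] > hi:  # this chunk already extends past hi
                break
            i += 1
        return shard_ids
```

With about 100k chunks, the slow path performs about 100k set insertions for a full-range query, while the fast path only copies the precomputed set of shard ids. Both paths return the same result for [MinKey, MaxKey], so the shortcut changes cost but not routing.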



 Comments   
Comment by Cen Zheng [ 11/Apr/20 ]

Hi, Blake,

Got it, thanks!

Comment by Blake Oler [ 10/Apr/20 ]

Hi mingyan.zc@gmail.com, this issue has already been observed in a separate HELP ticket, and we have a fix scheduled to complete in the upcoming quarter. Closing this as a duplicate of the scheduled fix.

Comment by Cen Zheng [ 07/Apr/20 ]

Hi Carl,

Thanks, this mongos is not the only process running on the host. But I think this is irrelevant.

Comment by Carl Champain (Inactive) [ 06/Apr/20 ]

mingyan.zc@gmail.com,

Can you please confirm whether this mongos is the only process running on the host?

We're passing this ticket along to the appropriate team for additional investigation. Updates will be posted on this ticket as they happen.

Thank you,
Carl
 

Comment by Cen Zheng [ 04/Apr/20 ]

Hi Carl,

I have uploaded the files needed, please check!

Thanks.

Comment by Carl Champain (Inactive) [ 03/Apr/20 ]

Hi mingyan.zc@gmail.com,

Thank you for the report.
To help us investigate this issue, can you please:

  • Archive (tar or zip) the $dbpath/diagnostic.data directory (the contents are described here).
  • Provide the logs covering this behavior.

I've created a secure upload portal for you. Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Kind regards,
Carl

Generated at Thu Feb 08 05:13:36 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.