[SERVER-73456] Evaluate feasibility of rate limiting mutex Created: 30/Jan/23  Updated: 14/Sep/23  Resolved: 30/Aug/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Joshua Lapacik (Inactive) Assignee: Backlog - Query Integration
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
duplicates SERVER-80006 Investigate best lock for holding the... Closed
Related
related to SERVER-77001 Improve implementation of RateLimiting Closed
Assigned Teams:
Query Integration
Sprint: QO 2023-06-26
Participants:

 Description   

Currently, all queries contend for the same rate-limiting mutex. We need to run performance evaluations to determine whether this is feasible or whether we need to take an alternative approach.
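To make the contention point concrete, here is a minimal sketch of a rate limiter in which every caller takes one shared mutex. It is not the server's actual implementation: the class `MutexRateLimiter`, the helper `countAdmitted`, and the fixed-window policy are all assumptions chosen for illustration.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical fixed-window limiter: admits at most `limit` calls per window.
// Illustration only; not the server's rate limiter.
class MutexRateLimiter {
public:
    using Clock = std::chrono::steady_clock;

    MutexRateLimiter(std::uint32_t limit, Clock::duration window)
        : _limit(limit), _window(window), _windowStart(Clock::now()) {}

    // Every caller serializes on the same mutex, admitted or not.
    // This is the contention the ticket asks about.
    bool tryAcquire() {
        std::lock_guard<std::mutex> lk(_mutex);
        const auto now = Clock::now();
        if (now - _windowStart >= _window) {
            // A new window has begun: reset the admission count.
            _windowStart = now;
            _admitted = 0;
        }
        if (_admitted < _limit) {
            ++_admitted;
            return true;
        }
        return false;
    }

private:
    std::mutex _mutex;
    const std::uint32_t _limit;
    const Clock::duration _window;
    Clock::time_point _windowStart;
    std::uint32_t _admitted = 0;
};

// Hypothetical driver: hammers the limiter from several threads and
// returns how many calls were admitted in total.
std::uint32_t countAdmitted(MutexRateLimiter& limiter,
                            unsigned threads,
                            unsigned callsPerThread) {
    std::atomic<std::uint32_t> admitted{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (unsigned i = 0; i < callsPerThread; ++i) {
                if (limiter.tryAcquire()) {
                    admitted.fetch_add(1, std::memory_order_relaxed);
                }
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return admitted.load();
}
```

For example, `countAdmitted(limiter, 8, 1000)` on a limiter built with a limit of 100 and a one-hour window admits exactly 100 of the 8000 calls, yet all 8000 calls pay for the lock acquisition.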



 Comments   
Comment by Charlie Swanson [ 30/Aug/23 ]

After investigating this and merging SERVER-77001, the only remaining task here is already tracked by SERVER-80006. Closing this ticket as a duplicate of that one.

Comment by Davis Haupt (Inactive) [ 26/Jun/23 ]

Some conclusions from initial investigations:

  1. The highest ops/sec without query stats enabled at all occurs between 4 and 8 threads.
  2. Running with no query stats at all is about 500 ops/sec faster than any rate-limited option.
  3. There’s certainly noise, but rate limits of 1 and 100 appeared faster than higher rate limits, which is a good sign: the cost of recording telemetry is greater than the cost of contention on the rate limit mutex.
  4. There’s some evidence that higher rate limit values are more costly than a rate limit of -1.

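
For readers who want to reproduce this kind of ops/sec comparison locally, the following is a rough sketch of a multi-threaded throughput measurement. It is not the perf workload behind the numbers above; the helper name `measureOpsPerSec` and its parameters are hypothetical, and a micro-measurement like this is only a coarse stand-in for the real dashboards.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs `op` on `threads` workers for roughly `duration`
// and returns the aggregate number of completed operations per second.
double measureOpsPerSec(unsigned threads,
                        std::chrono::milliseconds duration,
                        const std::function<void()>& op) {
    std::atomic<bool> stop{false};
    std::atomic<std::uint64_t> total{0};
    std::vector<std::thread> workers;

    const auto start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < threads; ++i) {
        workers.emplace_back([&] {
            // Count locally to avoid adding contention of our own.
            std::uint64_t local = 0;
            while (!stop.load(std::memory_order_relaxed)) {
                op();
                ++local;
            }
            total.fetch_add(local, std::memory_order_relaxed);
        });
    }

    std::this_thread::sleep_for(duration);
    stop.store(true);
    for (auto& w : workers) {
        w.join();
    }

    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return static_cast<double>(total.load()) / elapsed.count();
}
```

Calling this with thread counts such as 1, 2, 4, 8, and 16, and with `op` set to the code path under test, gives one throughput figure per configuration. Results from a harness like this are noisy, so repeated runs are needed before drawing conclusions.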
Comment by Charlie Swanson [ 16/Jun/23 ]

Got some perf dashboards up related to this topic. Left to explore: Can we improve it at all, and how close to the full "capacity" of ops/sec can we get before we have a notable impact?

davis.haupt@mongodb.com I'm going to assign this to you for now as the person who might look at this more while I'm out next week. I'll also assign you SERVER-77001 since I think the two kinda go hand-in-hand, but there's no need to go tackle SERVER-77001 anytime soon if the perf results look pretty decent (as they kinda are thus far? with rate limiting on) and there are other areas to experiment.

Comment by Charlie Swanson [ 26/Apr/23 ]

Now that PERF-3974 has landed, we should be able to see this in the dashboards. The answer might not be obvious until we've seen it run a couple of times, though.

Generated at Thu Feb 08 06:24:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.