[SERVER-57232] Increase in slow queries in ConnectionsBuildup case in Mongos on v4.4 Created: 26/May/21 Updated: 29/Mar/22 Resolved: 29/Mar/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Tyler Seip (Inactive) | Assignee: | Daniel Morilha (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | servicearch-q4-2021, servicearch-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Sprint: | Service Arch 2022-1-24, Service Arch 2022-2-07, Service Arch 2022-2-21, Service Arch 2022-03-07, Service Arch 2022-03-21, Service Arch 2022-04-04 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
The comment below, transcribed from the linked ticket, describes a worrying increase in the number of slow queries in the ConnectionsBuildup workload. We would expect performance in this workload to be bottlenecked by networking (ConnectionPools in particular), but the connection stats indicate that the bottleneck is actually elsewhere. This ticket is to investigate what caused this apparent regression and to file a ticket to fix it. |
| Comments |
| Comment by Daniel Morilha (Inactive) [ 29/Mar/22 ] | ||||||||||
|
Removing story points and closing as duplicate of | ||||||||||
| Comment by Daniel Morilha (Inactive) [ 21/Mar/22 ] | ||||||||||
|
The proposed fixes for | ||||||||||
| Comment by Daniel Morilha (Inactive) [ 27/Jan/22 ] | ||||||||||
|
Not actively working on this as of now. | ||||||||||
| Comment by Daniel Morilha (Inactive) [ 25/Jan/22 ] | ||||||||||
|
Another interesting fact emerges from the size of the logs generated by each mongod. On master, load is unevenly distributed toward one (assumed) secondary when compared to v4.2:
So perhaps ticket | ||||||||||
| Comment by Daniel Morilha (Inactive) [ 24/Jan/22 ] | ||||||||||
|
I tried to reproduce this by running the evergreen sys-perf projects against master (currently pointing to 5.3) and against 4.2, where the initial baseline came from. Interestingly, the mongos logs for v4.2 were a fraction of the size. I then scanned the code base to see where the specific "slow query" log line is generated and found it here. From following the logic alone, I am not certain what causes a query or operation to be considered slow. Across the 215,000 log lines, durationMillis ranged from 100 to 363. | ||||||||||
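For reference, a minimal sketch of how the durationMillis range above could be recomputed from a log file. It assumes the structured JSON log format used from v4.4 onward, where each line is a JSON object with a "msg" field and an "attr" sub-document carrying durationMillis; the function name and the sample lines are hypothetical, and v4.2 plain-text logs would need a different parser. The observed minimum of 100 is likely explained by the slow-operation threshold (slowms), which defaults to 100 ms.

```python
import json
from typing import Iterable, Optional


def summarize_slow_queries(lines: Iterable[str]) -> Optional[dict]:
    """Count "Slow query" entries and report their durationMillis range.

    Hypothetical helper: assumes one JSON log object per line.
    Returns None when no slow-query entries are found.
    """
    durations = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a structured log line
        if not isinstance(entry, dict) or entry.get("msg") != "Slow query":
            continue
        attr = entry.get("attr")
        if isinstance(attr, dict) and isinstance(attr.get("durationMillis"), (int, float)):
            durations.append(attr["durationMillis"])
    if not durations:
        return None
    return {"count": len(durations), "min_ms": min(durations), "max_ms": max(durations)}


if __name__ == "__main__":
    # Illustrative lines only, not taken from the actual sys-perf logs.
    sample = [
        '{"c":"COMMAND","msg":"Slow query","attr":{"durationMillis":100}}',
        '{"c":"NETWORK","msg":"Connection accepted","attr":{}}',
        '{"c":"COMMAND","msg":"Slow query","attr":{"durationMillis":363}}',
    ]
    print(summarize_slow_queries(sample))
```

Running the same summary over the mongos logs from both branches would give a like-for-like comparison of slow-query counts, independent of raw log size.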
| Comment by Daniel Morilha (Inactive) [ 20/Jan/22 ] | ||||||||||
|
In my previous update I forgot to mention there is also | ||||||||||
| Comment by Daniel Morilha (Inactive) [ 20/Jan/22 ] | ||||||||||
|
Met with tyler.seip and george.wangensteen yesterday to discuss everything behind this ticket. In summary, there was a dramatic increase in "slow queries" when running the ConnectionsBuildup performance test between version 4.2 and version 4.4. The cause is inconclusive at this point but might be related to the introduction of Preferred Initial Sync Source or Hedged Reads. The first steps of the investigation are to diff the two branches and check whether anything obvious stands out, and to try to operationally reproduce the issue. | ||||||||||
| Comment by Daniel Morilha (Inactive) [ 18/Jan/22 ] | ||||||||||
|
Bringing it to the active sprint per blake.oler's priority suggestion. | ||||||||||
| Comment by Tyler Seip (Inactive) [ 01/Jun/21 ] | ||||||||||
| Comment by Tyler Seip (Inactive) [ 27/May/21 ] | ||||||||||
from Benjamin Caimano |