[SERVER-53197] Awaitable hello/isMaster appears in slow query log, part 2 Created: 02/Dec/20 Updated: 29/Oct/23 Resolved: 08/Dec/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.4.0 |
| Fix Version/s: | 4.9.0, 4.4.3 |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||
| Backport Requested: |
v4.4
|
||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||
| Description |
|
I didn't really fix However, users can and do override this default by setting their drivers' heartbeatFrequencyMS to a longer duration; if this is 20 seconds or more then ismasters make noise in the slow query log again. |
| Comments |
| Comment by Jack Weir [ 14/Jan/21 ] |
|
The "Slowest Operations" panel in the Real Time Performance Panel is populated by the output of running currentOp, not from the slow query logs. The noisy isMaster's in the currentOp output is a known issue being addressed on the server side by SERVER-51358 |
| Comment by Jussi Kauppila [ 13/Jan/21 ] |
|
We are still seeing this in Mongo Atlas which runs version 4.4.3 now.
Here one from cut&pasted from our Atlas realtime view:
|
| Comment by Githook User [ 12/Dec/20 ] |
|
Author: {'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}Message: |
| Comment by Bruce Lucas (Inactive) [ 10/Dec/20 ] |
|
Yes, I think that's fine, thanks. |
| Comment by A. Jesse Jiryu Davis [ 09/Dec/20 ] |
|
bruce.lucas another question - we intend to backport this fix to the next 4.4 release. The main point is to silence noisy logs, but we will also make the latency stats change. Does that sound okay to you? |
| Comment by Githook User [ 08/Dec/20 ] |
|
Author: {'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}Message: |
| Comment by Bruce Lucas (Inactive) [ 08/Dec/20 ] |
|
jesse the consensus in our meeting is that your proposed change looks good. Thanks for asking. |
| Comment by A. Jesse Jiryu Davis [ 07/Dec/20 ] |
|
Bruce, that's all correct. I'm curious what the perf team thinks. Depending on the answer, I may make the proposed change (include awaitable isMaster minus sleep time in latency metrics) or just ban awaitable isMasters from latency metrics entirely. |
| Comment by Bruce Lucas (Inactive) [ 07/Dec/20 ] |
|
The frequency of the awaitable isMaster commands will generally be fairly low, once every 10 seconds per connection, correct? So I think the effect on the latency metrics between 4.4 and 5.0 will be small, maybe unless there are a very large number of connections. On the other hand, I'm not sure how much value there is in including the awaitable isMaster (non-sleep) time in the latency metrics, and any change between versions, even if small, will make it more difficult to compare performance between versions. However maybe the latency of commands (which are reported separately from reads and writes) is not so important in this regard. To summarize:
Assuming that's correct, I'll run this by our perf team. |
| Comment by A. Jesse Jiryu Davis [ 04/Dec/20 ] |
|
bruce.lucas, as part of the fix for this ticket, I'm considering adding awaitable isMaster back to latency metrics, changing our approach from
I expect we'd see latency metrics fall between 4.4 and 5.0 if we make this change, since awaitable isMaster minus its sleep time is very fast, and it will now be included in the overall metrics. Do you think it's better to make this change (include awaitable isMaster minus sleep time in latency metrics), or better to preserve the status quo (awaitable isMaster is excluded from latency metrics)? |
| Comment by Max Hirschhorn [ 02/Dec/20 ] |
|
jesse, have we considered using CurOp::pauseTimer() like we do for getMore on tailable+awaitData cursors to not track the time spent waiting rather than setting a slowMsOverride? |