[SERVER-48975] Increase isSelf logging verbosity Created: 18/Jun/20  Updated: 29/Oct/23  Resolved: 25/Jun/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.4.1, 4.7.0

Type: Improvement Priority: Major - P3
Reporter: Pavithra Vetriselvan Assignee: Pavithra Vetriselvan
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-41031 After an unreachable node is added an... Open
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.4
Sprint: Repl 2020-06-29
Participants:
Linked BF Score: 14

 Description   

As a part of SERVER-48347, we added debug logging to the isSelf codepath to determine why isSelf could fail.

However, after doing so, we noticed that other suites would also benefit from this logging. Since they are logged under the NETWORK component, we would have to increase the verbosity per suite (sometimes in reaction to seeing a failure we could not diagnose without extra isSelf logging).

It could be useful to change these logs to warnings or log them at the default log level. Based on the conversation in BACKPORT-7368, we should investigate the risk of flooding the logs when a node is removed and it retries isSelf on every heartbeat. It could end up being too verbose.



 Comments   
Comment by Githook User [ 20/Aug/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-48975 increase additional isSelf logging to default log level

(cherry picked from commit e87c4aea91ba223d9a3f59dd776c8dd45adb341e)
Branch: v4.4
https://github.com/mongodb/mongo/commit/85f08829bae59d88a5b88e7aa9babcf16758b450

Comment by Githook User [ 25/Jun/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-48975 increase additional isSelf logging to default log level
Branch: master
https://github.com/mongodb/mongo/commit/e87c4aea91ba223d9a3f59dd776c8dd45adb341e

Comment by Siyuan Zhou [ 22/Jun/20 ]

Agreed with Judah on the log level. Pavi also made a good point on flooding the log. I think we should fix the bugs when they come up.

Comment by Pavithra Vetriselvan [ 18/Jun/20 ]

I think logging at the default level was something we tried originally. We thought this solution would flood the logs and become too verbose if we retry isSelf on every heartbeat in a removed state. I think this can happen since we continue to send heartbeats to removed nodes (covered in SERVER-41031 and related tickets). Another option is to fix that bug.

In terms of your second question, it's a little unclear to me when we decide to log a warning vs a debug log line in the isSelf code path. isSelf already logs a warning here when we catch an exception, so I thought the logs we added (to print errors in connectSocketOnly() and authenticateInternalUser()) could also be useful information.

That being said, a warning would have the same issue of flooding the logs as logging at the default level.

Comment by Judah Schvimer [ 18/Jun/20 ]

Why warning instead of just log level 0? Do we want customers to be alerted to this event?

Comment by Pavithra Vetriselvan [ 18/Jun/20 ]

siyuan.zhou Wondering if you had thoughts here!

Generated at Thu Feb 08 05:18:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.