[SERVER-48975] Increase isSelf logging verbosity Created: 18/Jun/20 Updated: 29/Oct/23 Resolved: 25/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.1, 4.7.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Pavithra Vetriselvan | Assignee: | Pavithra Vetriselvan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Backport Requested: | v4.4 |
| Sprint: | Repl 2020-06-29 | ||||||||||||
| Participants: | |||||||||||||
| Linked BF Score: | 14 | ||||||||||||
| Description |
|
As part of earlier work, we increased the isSelf logging verbosity for a particular suite. However, after doing so, we noticed that other suites would also benefit from this logging. Since these logs are emitted under the NETWORK component, we would have to increase the verbosity per suite (sometimes only in reaction to a failure we could not diagnose without the extra isSelf logging). It could be useful to change these logs to warnings or to log them at the default log level.

Based on the conversation in BACKPORT-7368, we should first investigate the risk of flooding the logs when a node is removed and retries isSelf on every heartbeat. That could end up being too verbose. |
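The flooding concern can be illustrated with a small sketch. This is a generic, hypothetical rate-limiter (`SuppressingLogger` is not MongoDB code, and the names are made up for illustration): it emits a repeated isSelf-failure message at most once per window, so a removed node retrying on every heartbeat cannot flood the log.

```cpp
#include <chrono>
#include <iostream>
#include <string>

// Hypothetical helper (not the server's logging API): suppress repeats of
// the same message so a removed node retrying isSelf on every heartbeat
// cannot flood the log. Emits at most once per `window`, and reports how
// many messages were swallowed in between.
class SuppressingLogger {
public:
    explicit SuppressingLogger(std::chrono::seconds window) : _window(window) {}

    // Returns true if the message was emitted, false if it was suppressed.
    bool log(const std::string& msg,
             std::chrono::steady_clock::time_point now =
                 std::chrono::steady_clock::now()) {
        if (_everLogged && now - _last < _window) {
            ++_suppressed;  // within the window: swallow the repeat
            return false;
        }
        if (_suppressed > 0) {
            std::cerr << "(suppressed " << _suppressed << " similar messages)\n";
            _suppressed = 0;
        }
        std::cerr << msg << '\n';
        _everLogged = true;
        _last = now;
        return true;
    }

private:
    std::chrono::seconds _window;
    std::chrono::steady_clock::time_point _last{};
    bool _everLogged = false;
    long _suppressed = 0;
};
```

With a 60-second window, a heartbeat retrying every two seconds would produce one log line per minute instead of thirty.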
| Comments |
| Comment by Githook User [ 20/Aug/20 ] |
|
Author: Pavi Vetriselvan (pvselvan@umich.edu, username: pvselvan)
Message: (cherry picked from commit e87c4aea91ba223d9a3f59dd776c8dd45adb341e) |
| Comment by Githook User [ 25/Jun/20 ] |
|
Author: Pavi Vetriselvan (pvselvan@umich.edu, username: pvselvan)
Message: |
| Comment by Siyuan Zhou [ 22/Jun/20 ] |
|
Agreed with Judah on the log level. Pavi also made a good point on flooding the log. I think we should fix the bugs when they come up. |
| Comment by Pavithra Vetriselvan [ 18/Jun/20 ] |
|
I think logging at the default level was something we tried originally. We thought this solution would flood the logs and become too verbose if we retry isSelf on every heartbeat in a removed state. I think this can happen since we continue to send heartbeats to removed nodes (covered in SERVER-41031 and related tickets). Another option is to fix that bug.

In terms of your second question, it's a little unclear to me when we decide to log a warning vs. a debug log line in the isSelf code path. isSelf already logs a warning here when we catch an exception, so I thought the logs we added (to print errors in connectSocketOnly() and authenticateInternalUser()) could also be useful information. That being said, a warning would have the same issue of flooding the logs as logging at the default level. |
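The warning-vs-debug tradeoff discussed in this thread can be sketched generically (this is not the server's LOGV2 API; `Logger` and its fields are hypothetical): warnings always reach the log at the default level, while debug lines stay quiet until a suite raises the component's verbosity.

```cpp
// Generic sketch, not MongoDB code: debug lines are gated by a
// configurable per-component verbosity, while warnings always pass.
// This is the tradeoff in the thread: a warning surfaces by default but
// can flood the log on every retried heartbeat; a debug line is quiet
// until a test suite raises the verbosity to diagnose a failure.
enum class Severity { Warning, Debug };

struct Logger {
    int verbosity = 0;  // 0 = default level; suites may raise this per component

    // A message is emitted if it is a warning, or if the configured
    // verbosity is at least the message's debug level.
    bool shouldLog(Severity s, int debugLevel = 1) const {
        return s == Severity::Warning || verbosity >= debugLevel;
    }
};
```

Under this model, moving the isSelf lines to warnings makes them visible everywhere (including on every retried heartbeat), while keeping them at a debug level keeps the default log quiet but requires per-suite verbosity bumps.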
| Comment by Judah Schvimer [ 18/Jun/20 ] |
|
Why warning instead of just log level 0? Do we want customers to be alerted to this event? |
| Comment by Pavithra Vetriselvan [ 18/Jun/20 ] |
|
siyuan.zhou: wondering if you had thoughts here! |