[SERVER-56183] Prevent LDAP connection pool from stalling the serverStatus output Created: 19/Apr/21  Updated: 29/Oct/23  Resolved: 04/Oct/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.4
Fix Version/s: 5.1.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Andrey Brindeyev Assignee: Mark Benvenuto
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-55316 Disconnect LDAP connections out of line Closed
depends on SERVER-59734 Enforce connection pool timeouts duri... Closed
Related
is related to SERVER-55316 Disconnect LDAP connections out of line Closed
Backwards Compatibility: Fully Compatible
Sprint: Security 2021-05-31, Security 2021-06-14, Security 2021-06-28, Security 2021-07-12, Security 2021-07-26, Security 2021-08-09, Security 2021-08-23, Security 2021-09-06, Security 2021-09-20, Security 2021-10-04, Security 2021-10-18
Participants:
Case:

 Description   

A network-related issue affecting the LDAP connection pool can delay the serverStatus output, which in turn stalls the diagnostic data (FTDC) collection in the server. For example:

{"t":{"$date":"2021-04-18T12:42:57.028+00:00"},"s":"I",  "c":"CONNPOOL", "id":22566,   "ctx":"AuthorizationManager-12245","msg":"Ending connection due to bad connection status","attr":{"hostAndPort":"redacted.ldap.host.tld:636","error":"OperationFailed: Operation timed out","numOpenConns":0}}
...
{"t":{"$date":"2021-04-18T12:58:17.255+00:00"},"s":"I",  "c":"CONNPOOL", "id":22576,   "ctx":"LDAPConnPool-74125","msg":"Connecting","attr":{"hostAndPort":"redacted.ldap.host.tld:636"}}
{"t":{"$date":"2021-04-18T12:58:17.255+00:00"},"s":"E",  "c":"ACCESS",   "id":24217,   "ctx":"AuthorizationManager-12245","msg":"LDAP authorization failed: {swRoles_getStatus}","attr":{"swRoles_getStatus":{"code":96,"codeName":"OperationFailed","errmsg":"Failed to transform bind user name to LDAP DN :: caused by :: Username mapping operation failed, aborting transformation. { rule: { match: \"(.+)\" ldapQuery: \"OU=Users,OU=REDACTED,DC=redacted,DC=tld??sub?(userPrincipalName={0}@redacted.tld)\" } error: \"OperationFailed: Operation timed out\" }, "}}}
{"t":{"$date":"2021-04-18T12:58:17.257+00:00"},"s":"I",  "c":"COMMAND",  "id":20499,   "ctx":"ftdc","msg":"serverStatus was very slow","attr":{"timeStats":{"after basic":0,"after asserts":0,"after connections":0,"after electionMetrics":0,"after encryptionAtRest":0,"after extra_info":0,"after flowControl":0,"after globalLock":0,"after ldapConnPool":919255,"after locks":919255,"after logicalSessionRecordCache":919255,"after mirroredReads":919255,"after network":919255,"after opLatencies":919255,"after opReadConcernCounters":919255,"after opWriteConcernCounters":919255,"after opcounters":919255,"after opcountersRepl":919255,"after oplogTruncation":919255,"after repl":919255,"after security":919255,"after storageEngine":919255,"after tcmalloc":919255,"after trafficRecording":919255,"after transactions":919255,"after transportSecurity":919255,"after twoPhaseCommitCoordinator":919255,"after watchdog":919255,"after wiredTiger":919255,"at end":919257}}}

The resulting FTDC file has a 15-minute gap in metrics.
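
To see which serverStatus section accounts for the delay, the "serverStatus was very slow" log line can be broken down into per-section durations. The following is a minimal sketch, assuming the timeStats values are cumulative milliseconds listed in the order the sections ran (which is how they appear in the log above). The function name slowest_section and the SAMPLE constant are hypothetical, and SAMPLE is an abbreviated copy of the log line keeping only a few of the original keys.

```python
import json

# Abbreviated copy of the "serverStatus was very slow" log line from this
# ticket; only a handful of the original timeStats keys are kept.
SAMPLE = (
    '{"c":"COMMAND","id":20499,"ctx":"ftdc",'
    '"msg":"serverStatus was very slow",'
    '"attr":{"timeStats":{"after basic":0,"after globalLock":0,'
    '"after ldapConnPool":919255,"after locks":919255,"at end":919257}}}'
)


def slowest_section(log_line: str) -> tuple[str, int]:
    """Return (key, milliseconds) for the section that took the longest.

    Assumes each timeStats value is the cumulative elapsed time in
    milliseconds after that section finished, in execution order.
    """
    stats = json.loads(log_line)["attr"]["timeStats"]
    previous = 0
    deltas = {}
    for key, elapsed in stats.items():  # dicts keep JSON key order
        deltas[key] = elapsed - previous
        previous = elapsed
    worst = max(deltas, key=deltas.get)
    return worst, deltas[worst]


if __name__ == "__main__":
    key, millis = slowest_section(SAMPLE)
    print(f"{key}: {millis} ms (~{millis / 60000:.1f} min)")
```

Under that assumption, the ldapConnPool section alone accounts for 919255 ms, roughly 15.3 minutes, which matches the size of the gap in the FTDC file.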



 Comments   
Comment by Mark Benvenuto [ 04/Oct/21 ]

This issue was resolved by the fixes for SERVER-59734 and SERVER-55316.

Generated at Thu Feb 08 05:38:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.