-
Type: Task
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Labels:None
Description
This ticket allows the server to track, log, and profile accesses to the user cache per operation. When at least one access to the user cache has been made for an operation, the following subdocument will appear in the log for that operation and in the `system.profile` document for the operation:
"authorization":
Documentation should indicate that these statistics will appear as above in the slow operation logs and in the `system.profile` collection. The slow operation logs page here should have its example updated. The database profiler page here should have a section for `system.profile.authorization`. The description for the `authorization` field should indicate that it provides information about user cache accesses during the operation. `authorization.startedUserCacheAcquisitionAttempts` indicates the number of user cache accesses started while `authorization.completedUserCacheAcquisitionAttempts` indicates the number of those accesses that were completed at the time of profiling. `authorization.userCacheWaitTimeMicros` indicates the amount of time spent waiting on the cache, in microseconds.
This information only appears during slow operation logs and database profiling. Currently, it is NOT part of the output of `$currentOp`, so that documentation should not be updated.
Description of Linked Ticket
When the server tries to (re-)authorize sessions using the slow LDAP server, this will stall all new authorizations, including those using the SCRAM mechanism.
To simplify RCA process I propose to include the new section named Authorization, for example:
Before:
2020-02-25T08:07:44.108-0800 I COMMAND [conn141] command admin.system.users appName: "MongoDB Shell" command: isMaster { isMaster: 1.0, saslSupportedMechs: "test.local852", lsid: { id: UUID("f7a3f71c-18c2-400a-8e93-4dceb8a26ccb") }, $db: "test" } numYields:0 reslen:268 locks:{ Global: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } storage:{} protocol:op_msg 11722ms
After:
2020-02-25T08:07:44.108-0800 I COMMAND [conn141] command admin.system.users appName: "MongoDB Shell" command: isMaster { isMaster: 1.0, saslSupportedMechs: "test.local852", lsid: { id: UUID("f7a3f71c-18c2-400a-8e93-4dceb8a26ccb") }, $db: "test" } numYields:0 reslen:268 locks:{ Global: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } }, Authorization: { acquireWaitCount: { r: 1 }, timeAcquiringMicros: { w: 11710012 } } } storage:{} protocol:op_msg 11722ms
Right now it's not obvious that a SCRAM authorization stalled due to the exclusive lock which was held by another session's authorization cache refresh (reaching out to the LDAP server and waiting for a response) - unless server has debug logging enabled.
Scope of changes
Impact to Other Docs
MVP (Work and Date)
Resources (Scope or Design Docs, Invision, etc.)
- documents
-
SERVER-46412 Log time spent waiting for an authorization lock in the locks section
- Closed