DOCS-14398

[SERVER] Log time spent waiting for an authorization lock in the locks section

      Description

      Downstream Change Summary

      This ticket allows the server to track, log, and profile accesses to the user cache per operation. When at least one access to the user cache has been made for an operation, the following subdocument will appear in the log for that operation and in the `system.profile` document for the operation:
      "authorization": {
        "startedUserCacheAcquisitionAttempts": <number>,
        "completedUserCacheAcquisitionAttempts": <number>,
        "userCacheWaitTimeMicros": <number>
      }

      Documentation should indicate that these statistics appear, in the form shown above, in the slow operation logs and in the `system.profile` collection. The example on the slow operation logs page should be updated, and the database profiler page should gain a section for `system.profile.authorization`.

      The description of the `authorization` field should state that it provides information about user cache accesses during the operation. `authorization.startedUserCacheAcquisitionAttempts` is the number of user cache accesses started, and `authorization.completedUserCacheAcquisitionAttempts` is the number of those accesses that had completed at the time of profiling. `authorization.userCacheWaitTimeMicros` is the time spent waiting on the cache, in microseconds.

      This information appears only in slow operation logs and in database profiler output. Currently, it is NOT part of the output of `$currentOp`, so that documentation should not be updated.
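
To illustrate how the three fields relate, here is a minimal Python sketch that works on a profiler entry represented as a plain dictionary. The field names come from this ticket; the helper names (`slow_auth_filter`, `summarize_authorization`) and the sample values are hypothetical and are not part of any MongoDB driver.

```python
from typing import Any, Dict, Optional


def slow_auth_filter(min_wait_micros: int) -> Dict[str, Any]:
    """Build a query filter matching profiler entries whose user cache
    wait time is at least min_wait_micros (field name from the ticket)."""
    return {"authorization.userCacheWaitTimeMicros": {"$gte": min_wait_micros}}


def summarize_authorization(profile_doc: Dict[str, Any]) -> Optional[Dict[str, float]]:
    """Summarize the authorization subdocument of one profiler entry.

    Returns None when the subdocument is absent, which per the ticket
    means the operation made no user cache access.
    """
    auth = profile_doc.get("authorization")
    if auth is None:
        return None
    started = auth.get("startedUserCacheAcquisitionAttempts", 0)
    completed = auth.get("completedUserCacheAcquisitionAttempts", 0)
    wait_micros = auth.get("userCacheWaitTimeMicros", 0)
    return {
        "started": started,
        "completed": completed,
        # Accesses that had started but not completed when profiled.
        "in_flight": started - completed,
        "wait_millis": wait_micros / 1000.0,
    }


# Hypothetical profiler entry; the values are made up for illustration.
sample = {
    "op": "command",
    "millis": 11722,
    "authorization": {
        "startedUserCacheAcquisitionAttempts": 2,
        "completedUserCacheAcquisitionAttempts": 1,
        "userCacheWaitTimeMicros": 11710012,
    },
}
summary = summarize_authorization(sample)
```

The dictionary returned by `slow_auth_filter` is an ordinary MongoDB query document, so it could presumably be passed to a driver's `find` call on the `system.profile` collection; that step is left out here to keep the sketch runnable without a server.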

      Description of Linked Ticket

      When the server (re-)authorizes sessions against a slow LDAP server, all new authorizations stall, including those that use the SCRAM mechanism.

      To simplify root-cause analysis (RCA), I propose adding a new section named Authorization to the locks output, for example:

      Before:

      2020-02-25T08:07:44.108-0800 I COMMAND  [conn141] command admin.system.users appName: "MongoDB Shell" command: isMaster { isMaster: 1.0, saslSupportedMechs: "test.local852", lsid: { id: UUID("f7a3f71c-18c2-400a-8e93-4dceb8a26ccb") }, $db: "test" } numYields:0 reslen:268 locks:{ Global: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } storage:{} protocol:op_msg 11722ms
      

      After:

      2020-02-25T08:07:44.108-0800 I COMMAND  [conn141] command admin.system.users appName: "MongoDB Shell" command: isMaster { isMaster: 1.0, saslSupportedMechs: "test.local852", lsid: { id: UUID("f7a3f71c-18c2-400a-8e93-4dceb8a26ccb") }, $db: "test" } numYields:0 reslen:268 locks:{ Global: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } }, Authorization: { acquireWaitCount: { r: 1 }, timeAcquiringMicros: { w: 11710012 } } } storage:{} protocol:op_msg 11722ms
      

      Currently, unless the server has debug logging enabled, it is not obvious that a SCRAM authorization stalled because of the exclusive lock held by another session's authorization cache refresh (which was contacting the LDAP server and waiting for a response).
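
As a rough illustration of how the proposed section would help with RCA, the following Python sketch extracts the Authorization wait time from a log line in the "After" format shown above. It assumes the proposed format exactly as written in this ticket, which may differ from what the server ultimately emits; the function name `authorization_wait_micros` is hypothetical, and only the first lock mode listed under `timeAcquiringMicros` is read.

```python
import re
from typing import Optional

# Matches the Authorization entry of the locks section as proposed in this
# ticket, capturing the microsecond value of the first listed lock mode.
_AUTH_WAIT = re.compile(
    r"Authorization:\s*\{.*?timeAcquiringMicros:\s*\{\s*[rwRW]:\s*(\d+)\s*\}"
)


def authorization_wait_micros(log_line: str) -> Optional[int]:
    """Return the Authorization lock wait in microseconds, or None when
    the line has no Authorization entry."""
    match = _AUTH_WAIT.search(log_line)
    return int(match.group(1)) if match else None


# Excerpts of the locks section from the "Before" and "After" examples.
before = (
    "locks:{ Global: { acquireCount: { r: 1 } }, "
    "Database: { acquireCount: { r: 1 } }, "
    "Collection: { acquireCount: { r: 1 } } } storage:{} protocol:op_msg 11722ms"
)
after = (
    "locks:{ Global: { acquireCount: { r: 1 } }, "
    "Database: { acquireCount: { r: 1 } }, "
    "Collection: { acquireCount: { r: 1 } }, "
    "Authorization: { acquireWaitCount: { r: 1 }, "
    "timeAcquiringMicros: { w: 11710012 } } } storage:{} protocol:op_msg 11722ms"
)

wait_before = authorization_wait_micros(before)  # None
wait_after = authorization_wait_micros(after)    # 11710012
```

In the "After" example, the extracted wait of 11710012 microseconds accounts for nearly all of the 11722 ms the operation took, which is the signal the ticket wants to make visible without debug logging.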

      Scope of changes

      Impact to Other Docs

      MVP (Work and Date)

      Resources (Scope or Design Docs, Invision, etc.)

            Assignee:
            dave.cuthbert@mongodb.com Dave Cuthbert (Inactive)
            Reporter:
            backlog-server-pm Backlog - Core Eng Program Management Team

              Created:
              Updated:
              Resolved:
              2 years, 9 weeks, 1 day ago