DOCS-15652

[SERVER] Investigate changes in SERVER-64967: Measure how long it takes operations using egress connections to write to network

      Original Downstream Change Summary

      This adds a new serverStatus metric that is guarded under the featureFlagConnHealthMetrics feature flag:

      'metrics.network.totalTimeForEgressConnectionAcquiredToWireMicros':

      • The time taken for an operation to acquire an egress connection and complete writing to the wire. This metric is cumulative across all such connections since the last server restart.
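Because the metric is cumulative since the last restart, a single reading says little on its own; the difference between two readings is usually more useful. The following is a minimal sketch of that calculation, assuming the metric appears at metrics.network.<name> inside the serverStatus document as the summary above suggests. The helper names are hypothetical, and the functions operate on plain dictionaries rather than a live server.

```python
from typing import Any, Mapping, Optional

# Metric name as given in this ticket. Its location under
# serverStatus.metrics.network is an assumption based on the summary above.
METRIC = "totalTimeForEgressConnectionAcquiredToWireMicros"


def egress_to_wire_micros(server_status: Mapping[str, Any]) -> Optional[int]:
    """Return the cumulative metric, or None if it is absent
    (for example, when featureFlagConnHealthMetrics is not enabled)."""
    network = server_status.get("metrics", {}).get("network", {})
    value = network.get(METRIC)
    return None if value is None else int(value)


def micros_between(earlier: Mapping[str, Any],
                   later: Mapping[str, Any]) -> Optional[int]:
    """Microseconds accumulated between two serverStatus samples.

    Returns None if either sample lacks the metric, or if the counter went
    backwards, which suggests a server restart between the samples.
    """
    a = egress_to_wire_micros(earlier)
    b = egress_to_wire_micros(later)
    if a is None or b is None or b < a:
        return None
    return b - a
```

With PyMongo, a document of this shape can be obtained from `client.admin.command("serverStatus")`; whether the field is present depends on the server version and the feature flag.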

      A new server parameter was also added, 'connectionAcquisitionToWireLoggingRate':
      description: >-
      The rate at which egress connection metrics below a certain time threshold will be logged at
      info level. This only applies for the 'network.totalConnectionAcquiredToWireMillis'
      server status metric.

      This server parameter can be set at startup and at runtime. The default value is 0.2. Values must be between 0.0 and 1.0.
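As an illustration of the runtime case, the sketch below builds a setParameter command document for this parameter and checks the 0.0 to 1.0 range on the client side before anything is sent. The function name is hypothetical; the parameter name and range come from the summary above.

```python
from typing import Any, Dict

# Parameter name as given in this ticket.
PARAM = "connectionAcquisitionToWireLoggingRate"


def set_logging_rate_command(rate: float) -> Dict[str, Any]:
    """Build a setParameter command document for the logging rate.

    Raises ValueError if the rate is outside the documented 0.0 to 1.0 range.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"{PARAM} must be between 0.0 and 1.0, got {rate!r}")
    # The command name must be the first key; dicts keep insertion order.
    return {"setParameter": 1, PARAM: float(rate)}
```

With PyMongo, the resulting document could be passed to `client.admin.command(...)`. At startup, the equivalent would presumably be `mongod --setParameter connectionAcquisitionToWireLoggingRate=0.5`, assuming the parameter is available in the server build being used.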

      Description of Linked Ticket

      We sometimes see operations get bottlenecked at the point in their lifecycle where they need to perform RPC. In SERVER-64964, SERVER-64965, and SERVER-63261, we're adding metrics to measure how long connection establishment takes, how many operations fail while waiting to acquire connections, and how long operations spend waiting to acquire connections.

      For completeness, let's also measure the amount of time operations spend doing networking after they acquire a connection. Add a measurement of how much wall-time passes between an operation acquiring a connection and that operation finishing its write to the wire.
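The measurement described above can be pictured as a cumulative timer: record a timestamp when the connection is acquired, and add the elapsed time to a running total once the write completes. The sketch below is purely illustrative and is not the server's implementation; the class and method names are hypothetical, and it uses a monotonic clock as a stand-in for wall-time.

```python
import threading
import time
from typing import Callable


class AcquiredToWireTimer:
    """Accumulates time between connection acquisition and write completion."""

    def __init__(self, clock_ns: Callable[[], int] = time.monotonic_ns) -> None:
        self._clock_ns = clock_ns      # injectable so the timer can be tested
        self._lock = threading.Lock()  # operations may finish on many threads
        self._total_micros = 0

    def on_connection_acquired(self) -> int:
        """Return a timestamp to hand back when the write completes."""
        return self._clock_ns()

    def on_write_complete(self, acquired_at_ns: int) -> None:
        """Add the elapsed time for one operation to the cumulative total."""
        elapsed_micros = (self._clock_ns() - acquired_at_ns) // 1000
        with self._lock:
            self._total_micros += elapsed_micros

    @property
    def total_micros(self) -> int:
        with self._lock:
            return self._total_micros
```

The total only ever grows, which matches the ticket's description of a value that is cumulative since the last restart.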

            Assignee:
            jason.price@mongodb.com Jason Price
            Reporter:
            backlog-server-pm Backlog - Core Eng Program Management Team
            Votes:
            0
            Watchers:
            4

              Created:
              Updated:
              Resolved:
              1 year, 29 weeks, 1 day ago