Node.js Driver / NODE-6324

Driver performance monitoring improvements

    • Perf Monitoring Improvements
    • Node Drivers
    • To Do

      Engineer(s): Warren James
      2025-02-28: Target date set to 2025-03-07

      Known risks or blockers:

      • Unexpected OOO time and unrelated critical bug caused delays in finalizing this work

      Completed over the last 2 weeks:

      • Finalized performance contexts
      • Completed benchmark tagging and set up relative metric tracking for the driver
      • Made a test performance dashboard
      • Draft PR to create dashboards is open; just waiting on bson changes to wrap up review

      Focus over the next 2 weeks:

      • Finish review of adding tags and relative measurements to js-bson
      • Land PR to create driver and bson performance dashboards (End of project)

      Engineer(s): Warren James, Neal Beeken
      2025-02-14: Target date set to 2025-02-21

      Known risks or blockers:

      • Unrelated CI problems caused delays

      Completed over the last 2 weeks:

      • Fixed bug in benchmark runner that caused tests to run for fewer iterations than intended by spec
      • Added a CPU baseline benchmark that can be used to normalize the results of other benchmarks in order to reduce false alert frequency due to environment noise
      • Decided on a filtered list of perf benchmarks and set up alerting contexts for driver and bson
      • Added tagging capability to benchmark runners in driver and bson
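The tagging capability described above groups benchmark results into alerting contexts. A minimal sketch of what tagged output might look like; the `TaggedResult` shape and the tag names are hypothetical, not the runner's actual format:

```typescript
// Hypothetical shape for a tagged benchmark result: tags let the perf
// tooling slice results into contexts (e.g. driver vs bson, alerting set).
interface TaggedResult {
  name: string;
  opsPerSec: number;
  tags: string[];
}

// Illustrative result: the numbers and tag names are made up.
const result: TaggedResult = {
  name: 'ping',
  opsPerSec: 500,
  tags: ['alerting-benchmarks', 'normalized'],
};
```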

      Focus over the next 2 weeks:

      • Finish adding normalized measurements and context tags to driver and bson benchmarks (driver benchmarks are normalized to "ping" result, bson benchmarks and the ping benchmark are normalized to the cpu baseline benchmark that computes primes)
      • Set up performance dashboards
      • Update slack alerts to use the new contexts
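The normalization chain above (driver benchmarks to "ping", bson and ping to the CPU baseline) amounts to dividing each result by its reference. A minimal sketch; the benchmark names mirror the ones mentioned above, but the `findMany` name, the shapes, and all numbers are illustrative:

```typescript
// Hypothetical benchmark results in ops/sec (numbers are made up).
type BenchResult = { name: string; opsPerSec: number };

const cpuBaseline: BenchResult = { name: 'cpuBaseline', opsPerSec: 1000 }; // computes primes
const ping: BenchResult = { name: 'ping', opsPerSec: 500 };
const findMany: BenchResult = { name: 'findMany', opsPerSec: 100 };

// Dividing by a reference turns ops/sec into a unitless ratio, so a slow
// CI host that drags every benchmark down equally leaves the ratio alone.
function normalize(bench: BenchResult, reference: BenchResult): number {
  return bench.opsPerSec / reference.opsPerSec;
}

// bson benchmarks and ping are normalized to the CPU baseline...
const pingRelative = normalize(ping, cpuBaseline); // 0.5
// ...while driver benchmarks are normalized to ping.
const findManyRelative = normalize(findMany, ping); // 0.2
```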


      Use Case

      As a Node.js engineer
      I want an effective way to monitor the driver's performance
      So that I can catch regressions as they occur and avoid surprising customers with them.

      User Experience

      • None

      Dependencies

      Risks/Unknowns

      • How stable has the empty benchmark been?

      Acceptance Criteria

      Implementation Requirements

      • Improve benchmark stability in driver:
        • Use the empty benchmark as a factor against the ping command
        • Use the ping command as a factor against all other commands
        • Alert on the metrics relative to the two baselines above.
      • Start using release dashboards for the driver
        • Figure out appropriate contexts to use and agree as a team on the metrics that should be included there
      • Start using release dashboards in bson
        • Figure out appropriate contexts to use and agree as a team on the metrics that should be included there
      • Address the triplicate alerts fired for differing timeoutMS settings
      • Ensure driver and bson task names are stable moving forward (i.e., do not include server or runtime versions in the title)
        • While we're here, update the hello task size (we missed this during the legacy hello updates)
      • Establish an updated process for the use of the release dashboards and/or slack alerts to monitor driver performance and act on regressions
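The first requirement's chain of factors (empty benchmark against ping, ping against everything else) reduces to tracking ratios and alerting when one drifts from its recorded baseline. A sketch under assumed names; the 10% tolerance is a hypothetical threshold, not the team's actual setting:

```typescript
// Relative metric: a command's throughput divided by its reference
// benchmark's throughput (ping for driver commands; the CPU baseline for
// ping and the bson benchmarks). Environment-wide noise cancels in the ratio.
function relativeMetric(opsPerSec: number, refOpsPerSec: number): number {
  return opsPerSec / refOpsPerSec;
}

// Fire an alert when the ratio drifts from its recorded baseline by more
// than the tolerance (10% here is illustrative).
function shouldAlert(current: number, baseline: number, tolerance = 0.1): boolean {
  return Math.abs(current - baseline) / baseline > tolerance;
}

// A genuine regression moves the command's ratio; uniform slowness does not:
shouldAlert(relativeMetric(150, 1000), 0.2); // 0.15 vs 0.2 → true
shouldAlert(relativeMetric(200, 1000), 0.2); // 0.2 vs 0.2 → false
```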

      Testing Requirements

      • To test the effectiveness of the empty benchmark, add a synchronous loop of 1 million iterations before writing to the socket in the Connection class. Ensure the benchmarks reflect this cost for ping while it is factored out of other commands.
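The injected cost could be sketched as below; `busyWork` and the commented-out patch are illustrative, not the driver's actual Connection code:

```typescript
// Illustrative busy loop: a fixed amount of synchronous CPU work added to
// every command that passes through the patched write path.
function busyWork(iterations = 1_000_000): number {
  let acc = 0;
  for (let i = 0; i < iterations; i++) acc += i;
  return acc;
}

// The patch itself, as pseudocode (the real Connection class differs):
//
//   class Connection {
//     writeCommand(command: Buffer): void {
//       busyWork();                 // injected cost: should surface in ping's metric
//       this.socket.write(command); // but cancel out of ping-normalized commands
//     }
//   }
```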

      Documentation Requirements

      • None

      Follow Up Requirements

      • None

            Assignee: Neal Beeken (neal.beeken@mongodb.com)
            Reporter: Neal Beeken (neal.beeken@mongodb.com)