  1. Drivers
  2. DRIVERS-2552

Remove average and 90th percentile RTT times from server description

    • Type: Spec Change
    • Resolution: Won't Do
    • Priority: Unknown
    • None
    • Component/s: SDAM
    • Labels:
      None
    • Needed

      Summary

      Most fields on a server description, as described in the ServerDescription section of the SDAM spec, come directly from the most recent connection handshake. However, roundTripTime and ninetiethPercentileRoundTripTime do not come from the handshake.

      The Measuring RTT section of the SDAM spec states that drivers must set roundTripTime and ninetiethPercentileRoundTripTime from the current average and 90th percentile RTTs for that server, respectively, as measured by the RTT monitor:

      When constructing a ServerDescription from a streaming hello or legacy hello response, clients MUST use average and 90th percentile round trip times from the RTT task.

      The problem with that approach is that copying an arbitrary snapshot of a constantly updated value onto the server description can lead to confusion and bugs when developers need up-to-date RTT data. Since the SDAM spec doesn't say whether the roundTripTime and ninetiethPercentileRoundTripTime values on a server description should or must be kept up to date, driver devs who are unfamiliar with the sources of RTT data likely have no idea whether the RTT data on the server description is current. That may send them on a potentially time-consuming investigation into the source of, and all references to, that data, possibly ending in the conclusion that the data is stale and not useful. Worse, a driver dev may come to the wrong conclusion and implement a feature that uses stale RTT information, potentially leading to difficult-to-diagnose bugs.

      Considering many drivers do not keep the RTT data on server descriptions up-to-date, and stale RTT data is more confusing than useful, we should remove the roundTripTime and ninetiethPercentileRoundTripTime fields from the server description in the SDAM spec. Instead, all drivers should fetch up-to-date RTT information from the RTT monitor as described in the RTT thread section of SDAM.
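
      To illustrate the staleness problem, here is a minimal sketch, not taken from any driver. The names RTTMonitor, ServerDescription, add_sample, and describe_server are hypothetical, and the monitor is assumed to keep an exponentially weighted moving average with a weighting factor of 0.2. The 90th percentile value is left out for brevity. The sketch shows a server description holding on to the RTT it was built with while the monitor's value moves on.

```python
import threading
from dataclasses import dataclass
from typing import Optional

ALPHA = 0.2  # assumed EWMA weighting factor for the average RTT


class RTTMonitor:
    """Hypothetical RTT monitor that keeps a running average in milliseconds."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._average: Optional[float] = None

    def add_sample(self, rtt_ms: float) -> None:
        # Called by the RTT task each time it measures a round trip.
        with self._lock:
            if self._average is None:
                self._average = rtt_ms
            else:
                self._average = ALPHA * rtt_ms + (1 - ALPHA) * self._average

    def average(self) -> Optional[float]:
        # Always returns the most recent average.
        with self._lock:
            return self._average


@dataclass(frozen=True)
class ServerDescription:
    """Hypothetical immutable description; the RTT is fixed at construction."""

    address: str
    round_trip_time: Optional[float]


def describe_server(address: str, monitor: RTTMonitor) -> ServerDescription:
    # Copies a snapshot of the monitor's average, as the current spec text requires.
    return ServerDescription(address, monitor.average())


monitor = RTTMonitor()
monitor.add_sample(10.0)
description = describe_server("localhost:27017", monitor)

# The RTT task keeps measuring, but no new description is built.
for _ in range(3):
    monitor.add_sample(50.0)

stale_rtt = description.round_trip_time  # snapshot taken at construction
live_rtt = monitor.average()             # current value from the monitor
print(f"description: {stale_rtt:.2f} ms, monitor: {live_rtt:.2f} ms")
```

      In this sketch the description still reports the value it was constructed with, while reading from the monitor reflects every later sample. That gap is what the proposal aims to remove by making the RTT monitor the only source of RTT data.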

      Motivation

      Who is the affected end user?

      Driver developers.

      How does this affect the end user?

      They are confused by stale RTT data on server descriptions.

      How likely is it that this problem or use case will occur?

      Devs who are new to developing features that depend on RTT data likely do not understand if/when different server description fields are updated. The SDAM specification isn't clear about if/when the roundTripTime/ninetiethPercentileRoundTripTime fields on server descriptions should be updated, so the actual behavior may be inconsistent between drivers.

      If the problem does occur, what are the consequences and how severe are they?

      Devs can spend significant time tracing code and troubleshooting SDAM behaviors trying to determine if/when the roundTripTime/ninetiethPercentileRoundTripTime fields on a server description are updated. That time is wasted if the dev needs up-to-date RTT data and finds that the RTT values on the server description are not consistently updated. Worse, if the dev comes to the wrong conclusion and uses stale RTT values from a server description, they may implement a feature with difficult-to-diagnose bugs.

      Is this issue urgent?

      No.

      Is this ticket required by a downstream team?

      No.

      Is this ticket only for tests?

      No.

            Assignee:
            shane.harvey@mongodb.com Shane Harvey
            Reporter:
            matt.dale@mongodb.com Matt Dale
            Votes:
            0
            Watchers:
            3

              Created:
              Updated:
              Resolved: