Uploaded image for project: 'Python Driver'
  1. Python Driver
  2. PYTHON-2280

Investigate slow OCSP endpoint checks on Windows

    • Type: Icon: Task Task
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: TLS/SSL
    • Labels:
      None

      During PYTHON-2144 I noticed that the OCSP endpoint checks for our test certs are very, very slow on Windows (1000ms for a successful lookup and 2000ms for a failed lookup). The first connection’s OCSP callback takes 3000ms (2 seconds for the failed request and 1 second for the successful one):

      2020-06-09 16:45:08,009 DEBUG ocsp_support Peer did not staple an OCSP response
      2020-06-09 16:45:08,009 DEBUG ocsp_support Requesting OCSP data
      2020-06-09 16:45:08,009 DEBUG ocsp_support Trying http://localhost:9001/power/level
      2020-06-09 16:45:08,013 DEBUG connectionpool Starting new HTTP connection (1): localhost:9001
      2020-06-09 16:45:10,014 DEBUG ocsp_support HTTP request failed: HTTPConnectionPool(host='localhost', port=9001): Max retries exceeded with url: /power/level (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000000003F916D8>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it',))
      2020-06-09 16:45:10,014 DEBUG ocsp_support Trying http://localhost:8100/status
      2020-06-09 16:45:10,016 DEBUG connectionpool Starting new HTTP connection (1): localhost:8100
      INFO:werkzeug:127.0.0.1 - - [09/Jun/2020 16:45:11] "POST /status HTTP/1.1" 200 -
      2020-06-09 16:45:11,027 DEBUG connectionpool http://localhost:8100 "POST /status HTTP/1.1" 200 1511
      2020-06-09 16:45:11,028 DEBUG ocsp_support OCSP response status: <OCSPResponseStatus.SUCCESSFUL: 0>
      2020-06-09 16:45:11,028 DEBUG ocsp_support Verifying response
      2020-06-09 16:45:11,028 DEBUG ocsp_support Responder is a delegate
      2020-06-09 16:45:11,029 DEBUG ocsp_support Using key hash
      2020-06-09 16:45:11,033 DEBUG ocsp_support Caching OCSP response.
      2020-06-09 16:45:11,033 DEBUG ocsp_support OCSP cert status: <OCSPCertStatus.GOOD: 0>
      

      The second connection’s OCSP callback uses the cached response and takes 1ms:

      2020-06-09 16:45:11,063 DEBUG ocsp_support Peer did not staple an OCSP response
      2020-06-09 16:45:11,063 DEBUG ocsp_support Requesting OCSP data
      2020-06-09 16:45:11,063 DEBUG ocsp_support Trying http://localhost:9001/power/level
      2020-06-09 16:45:11,064 DEBUG ocsp_support Using cached OCSP response.
      2020-06-09 16:45:11,064 DEBUG ocsp_support OCSP cert status: <OCSPCertStatus.GOOD: 0>
      

      I bet there are http settings we could play with to make the 2 second failure happen faster. I'm also puzzled why the "http://localhost:9001/power/level
      2" request doesn’t fail fast since nothing is running on port 9001. In any case, the issue is alleviated by the client's OCSP cache (as mentioned above the second connection only takes 1ms). Future improvements could be to:

      • Profile the OCSP callback to find the 2 second and 1 second bottleneck.
      • Run multiple endpoint checks in parallel. This way the latency is limited to the fastest successful request.
      • Adjust the requests.post settings.

            Assignee:
            Unassigned Unassigned
            Reporter:
            shane.harvey@mongodb.com Shane Harvey
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: