Uploaded image for project: 'Python Driver'
  1. Python Driver
  2. PYTHON-4414

interruptInUseConnections should cancel pending connections too

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Minor - P4 Minor - P4
    • 4.11
    • Affects Version/s: None
    • Component/s: None
    • Python Drivers
    • Not Needed
    • Hide

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?

      Show
      1. What would you like to communicate to the user about this feature? 2. Would you like the user to see examples of the syntax and/or executable code and its output? 3. Which versions of the driver/connector does this apply to?

      There is a minor bug in the implementation of interruptInUseConnections added in PYTHON-3175. If the pool is cleared simultaneously while a thread is creating a connection, the pending connection is not cancelled or closed. Instead the connection is created normally and added to the pool as shown in the log:

      Events:
       [2024/04/29 03:44:47.090] PoolCreatedEvent(('localhost', 27017), {})
       [2024/04/29 03:44:47.090] PoolReadyEvent(('localhost', 27017))
       [2024/04/29 03:44:47.090] ConnectionCheckOutStartedEvent(('localhost', 27017))
       [2024/04/29 03:44:47.090] ConnectionCreatedEvent(('localhost', 27017), 1)
       [2024/04/29 03:44:47.090] PoolClearedEvent(('localhost', 27017), None, True)
       [2024/04/29 03:44:47.090] ConnectionReadyEvent(('localhost', 27017), 1, 10.015425856999968)
       [2024/04/29 03:44:47.090] ConnectionCheckedOutEvent(('localhost', 27017), 1, 10.018921869999986)
      

      The fix is to ensure that pending connections are interrupted as well.

      Original description:

      Name of Failure:

      test.test_connection_monitoring.TestCMAP.test_connection_monitoring_pool_clear_interrupting_pending_connections_clear_with_interruptInUseConnections___true_closes_pending_connections

      Link to task:

      https://spruce.mongodb.com/task/mongo_python_driver_tests_python_version_rhel8_test_ssl__platform~rhel8_auth_ssl~auth_ssl_python_version~pypy3.10_coverage~coverage_test_5.0_sharded_cluster_21461ace4fe6ed1eb26626d5c84ebb77202faccd_24_04_29_10_19_28/tests?execution=0&sortBy=STATUS&sortDir=ASC

      Context of when and why the failure occurred:

      It has been failing intermittently since PYTHON-2048 was merged.

      Stack trace:

       [2024/04/29 05:44:57.343] FAILURE: AssertionError: Didn't ever find 1 <class 'pymongo.monitoring.ConnectionCheckOutFailedEvent'> event(s) ()
       [2024/04/29 05:44:57.343] self = <test.test_connection_monitoring.TestCMAP testMethod=test_connection_monitoring_pool_clear_interrupting_pending_connections_clear_with_interruptInUseConnections_=_true_closes_pending_connections>
       [2024/04/29 05:44:57.343]     def run_scenario(self):
       [2024/04/29 05:44:57.343] >       self.run_scenario(scenario_def, test)
       [2024/04/29 05:44:57.343] test/test_connection_monitoring.py:460: 
       [2024/04/29 05:44:57.343] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
       [2024/04/29 05:44:57.343] test/test_connection_monitoring.py:289: in run_scenario
       [2024/04/29 05:44:57.343]     self.run_operations(test["operations"])
       [2024/04/29 05:44:57.343] test/test_connection_monitoring.py:170: in run_operations
       [2024/04/29 05:44:57.343]     self.run_operation(op)
       [2024/04/29 05:44:57.343] test/test_connection_monitoring.py:164: in run_operation
       [2024/04/29 05:44:57.343]     meth(op)
       [2024/04/29 05:44:57.343] test/test_connection_monitoring.py:118: in wait_for_event
       [2024/04/29 05:44:57.343]     wait_until(
       [2024/04/29 05:44:57.343] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
       [2024/04/29 05:44:57.343] predicate = <function TestCMAP.wait_for_event.<locals>.<lambda> at 0x000000000e741d80>
       [2024/04/29 05:44:57.343] success_description = "find 1 <class 'pymongo.monitoring.ConnectionCheckOutFailedEvent'> event(s)"
       [2024/04/29 05:44:57.343] timeout = 10.0
       [2024/04/29 05:44:57.343]     def wait_until(predicate, success_description, timeout=10):
       [2024/04/29 05:44:57.343]         """Wait up to 10 seconds (by default) for predicate to be true.
       [2024/04/29 05:44:57.343]     
       [2024/04/29 05:44:57.343]         E.g.:
       [2024/04/29 05:44:57.343]     
       [2024/04/29 05:44:57.343]             wait_until(lambda: client.primary == ('a', 1),
       [2024/04/29 05:44:57.343]                        'connect to the primary')
       [2024/04/29 05:44:57.343]     
       [2024/04/29 05:44:57.343]         If the lambda-expression isn't true after 10 seconds, we raise
       [2024/04/29 05:44:57.343]         AssertionError("Didn't ever connect to the primary").
       [2024/04/29 05:44:57.343]     
       [2024/04/29 05:44:57.343]         Returns the predicate's first true value.
       [2024/04/29 05:44:57.343]         """
       [2024/04/29 05:44:57.343]         start = time.time()
       [2024/04/29 05:44:57.343]         interval = min(float(timeout) / 100, 0.1)
       [2024/04/29 05:44:57.343]         while True:
       [2024/04/29 05:44:57.343]             retval = predicate()
       [2024/04/29 05:44:57.343]             if retval:
       [2024/04/29 05:44:57.343]                 return retval
       [2024/04/29 05:44:57.343]     
       [2024/04/29 05:44:57.343]             if time.time() - start > timeout:
       [2024/04/29 05:44:57.343] >               raise AssertionError("Didn't ever %s" % success_description)
       [2024/04/29 05:44:57.343] E               AssertionError: Didn't ever find 1 <class 'pymongo.monitoring.ConnectionCheckOutFailedEvent'> event(s)
       [2024/04/29 05:44:57.343] test/utils.py:819: AssertionError
      

      Here's the stdout log of the failed test:

       [2024/04/29 03:44:47.090] ----------------------------- Captured stdout call -----------------------------
       [2024/04/29 03:44:47.090] Failed test: 'clear with interruptInUseConnections = true closes pending connections'
       [2024/04/29 03:44:47.090] Operations:
       [2024/04/29 03:44:47.090] {'name': 'ready'}
       [2024/04/29 03:44:47.090] {'name': 'start', 'target': 'thread1'}
       [2024/04/29 03:44:47.090] {'name': 'checkOut', 'thread': 'thread1'}
       [2024/04/29 03:44:47.090] {'name': 'waitForEvent', 'event': 'ConnectionCreated', 'count': 1}
       [2024/04/29 03:44:47.090] {'name': 'clear', 'interruptInUseConnections': True}
       [2024/04/29 03:44:47.090] {'name': 'waitForEvent', 'event': 'ConnectionCheckOutFailed', 'count': 1}
       [2024/04/29 03:44:47.090] Threads:
       [2024/04/29 03:44:47.090] {'thread1': <SpecRunnerThread(thread1, started daemon 139897662293760)>}
       [2024/04/29 03:44:47.090] Connections:
       [2024/04/29 03:44:47.090] {}
       [2024/04/29 03:44:47.090] Events:
       [2024/04/29 03:44:47.090] PoolCreatedEvent(('localhost', 27017), {})
       [2024/04/29 03:44:47.090] PoolReadyEvent(('localhost', 27017))
       [2024/04/29 03:44:47.090] ConnectionCheckOutStartedEvent(('localhost', 27017))
       [2024/04/29 03:44:47.090] ConnectionCreatedEvent(('localhost', 27017), 1)
       [2024/04/29 03:44:47.090] PoolClearedEvent(('localhost', 27017), None, True)
       [2024/04/29 03:44:47.090] ConnectionReadyEvent(('localhost', 27017), 1, 10.015425856999968)
       [2024/04/29 03:44:47.090] ConnectionCheckedOutEvent(('localhost', 27017), 1, 10.018921869999986)
       [2024/04/29 03:44:47.090] Log:
       [2024/04/29 03:44:47.090] ================================== XFAILURES ==============================
      

            Assignee:
            shane.harvey@mongodb.com Shane Harvey
            Reporter:
            steve.silvester@mongodb.com Steve Silvester
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: