PYTHON-4603

Investigate more efficient _ALock/_ACondition classes

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Python Drivers

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?


      Context

      Our _ALock/_ACondition classes busy loop like this:

      class _ALock:
          ...
          async def a_acquire(self, blocking: bool = True, timeout: float = -1) -> bool:
              if timeout > 0:
                  tstart = time.monotonic()
              while True:
                  acquired = self._lock.acquire(blocking=False)
                  if acquired:
                      return True
                  if timeout > 0 and (time.monotonic() - tstart) > timeout:
                      return False
                  if not blocking:
                      return False
                  # Yield to the event loop, then retry immediately (busy loop).
                  await asyncio.sleep(0)
      

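      Because asyncio.sleep(0) only yields to the event loop before retrying, a waiting task spins instead of sleeping, so CPU usage rises whenever the lock is contended. The bench-lock.py script illustrates this problem: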

      $ time python bench-lock.py sync
      Python: 3.12.4, PyMongo: 4.9.0.dev0
      python bench-lock.py sync  0.12s user 0.03s system 2% cpu 5.180 total
      $ time python bench-lock.py async
      Python: 3.12.4, PyMongo: 4.9.0.dev0
      python bench-lock.py async  0.73s user 0.41s system 22% cpu 5.135 total
      

      The async version uses 22% CPU (0.73s user, 0.41s system) whereas the sync version uses only 2% (0.12s user, 0.03s system), over roughly the same 5 seconds of wall time.
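
      To make the spinning visible without the bench-lock.py attachment (its contents are not reproduced here), the following is a minimal sketch that counts failed acquire attempts while a lock is held. SpinLock, its spins counter, and main() are hypothetical names for illustration, not PyMongo code; only the retry pattern mirrors the snippet above.

```python
import asyncio
import threading
import time


class SpinLock:
    """Hypothetical stand-in for the busy-looping pattern; not PyMongo's class."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.spins = 0  # number of failed non-blocking attempts

    async def a_acquire(self) -> bool:
        while True:
            if self._lock.acquire(blocking=False):
                return True
            self.spins += 1
            await asyncio.sleep(0)  # yield once, then retry immediately

    def release(self) -> None:
        self._lock.release()


async def main(hold_seconds: float = 0.2) -> tuple[int, float]:
    lock = SpinLock()
    await lock.a_acquire()  # the "holder" takes the lock first
    waiter = asyncio.create_task(lock.a_acquire())
    cpu_start = time.process_time()
    await asyncio.sleep(hold_seconds)  # the waiter spins for this whole interval
    cpu_used = time.process_time() - cpu_start
    lock.release()
    await waiter  # the waiter now succeeds on its next attempt
    lock.release()
    return lock.spins, cpu_used


if __name__ == "__main__":
    spins, cpu = asyncio.run(main())
    print(f"failed attempts while waiting: {spins}, CPU seconds: {cpu:.3f}")
```

      The number of failed attempts grows with how long the lock is held, which is the CPU cost the benchmark above reflects.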

      Definition of done

      Investigate alternative approaches. Perhaps using loop.run_in_executor:

          async def a_acquire(self, blocking: bool = True, timeout: float = -1) -> bool:
              loop = asyncio.get_running_loop()
              return await loop.run_in_executor(None, self._lock.acquire, blocking, timeout)
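
      A self-contained sketch of the executor idea follows, under stated assumptions: ExecutorLock, ticker, and main() are hypothetical names, and the default thread pool is used. It shows that a coroutine waiting on a held threading.Lock leaves the event loop free to run other tasks.

```python
import asyncio
import threading


class ExecutorLock:
    """Hypothetical wrapper; the name and shape are illustrative only."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    async def a_acquire(self, blocking: bool = True, timeout: float = -1) -> bool:
        loop = asyncio.get_running_loop()
        # The blocking acquire runs in a worker thread, so the loop is not blocked.
        # Note: threading.Lock.acquire raises ValueError if blocking is False
        # and a timeout other than -1 is given.
        return await loop.run_in_executor(None, self._lock.acquire, blocking, timeout)

    def release(self) -> None:
        # A threading.Lock may be released from a thread other than the acquirer.
        self._lock.release()


async def main() -> tuple[bool, bool, int]:
    lock = ExecutorLock()
    first = await lock.a_acquire()
    ticks = 0

    async def ticker() -> None:
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    tick_task = asyncio.create_task(ticker())
    # The lock is already held, so this attempt waits in a worker thread
    # and times out while the ticker keeps running on the event loop.
    second = await lock.a_acquire(timeout=0.1)
    tick_task.cancel()
    lock.release()
    return first, second, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

      Two trade-offs are worth checking in any investigation: every waiting coroutine occupies one worker thread of the executor, and cancelling the awaiting task does not stop a worker thread that is already blocked in acquire, so that thread may still take the lock later.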
      

      Or use asyncio.Lock directly, ensuring thread safety by running all calls on the application's loop via asyncio.run_coroutine_threadsafe(). This could become tricky if the application uses a MongoClient from multiple loops, e.g.:

      client = AsyncMongoClient()
      # Each asyncio.run() call creates and closes its own event loop.
      asyncio.run(client.list_database_names())
      asyncio.run(client.list_database_names())
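
      One possible shape for the asyncio.Lock approach is sketched below, assuming Python 3.10+ (where asyncio.Lock is not bound to a loop at construction) and a dedicated "home" loop running in a background thread. LoopBoundLock, start_home_loop, use, and demo are hypothetical names; this is an illustration of the idea, not a proposed implementation.

```python
import asyncio
import threading


class LoopBoundLock:
    """Hypothetical: one asyncio.Lock that is only ever touched on its home loop."""

    def __init__(self, home_loop: asyncio.AbstractEventLoop) -> None:
        self._home = home_loop
        self._lock = asyncio.Lock()

    async def a_acquire(self) -> bool:
        if asyncio.get_running_loop() is self._home:
            return await self._lock.acquire()
        # Called from another loop: run the acquire on the home loop and
        # await its result from the current loop without blocking it.
        fut = asyncio.run_coroutine_threadsafe(self._lock.acquire(), self._home)
        return await asyncio.wrap_future(fut)

    async def a_release(self) -> None:
        if asyncio.get_running_loop() is self._home:
            self._lock.release()
        else:
            self._home.call_soon_threadsafe(self._lock.release)


def start_home_loop() -> asyncio.AbstractEventLoop:
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


async def use(lock: LoopBoundLock) -> bool:
    acquired = await lock.a_acquire()
    await lock.a_release()
    return acquired


def demo() -> list[bool]:
    home = start_home_loop()
    lock = LoopBoundLock(home)
    # Two separate asyncio.run() calls means two different event loops.
    results = [asyncio.run(use(lock)), asyncio.run(use(lock))]
    home.call_soon_threadsafe(home.stop)
    return results


if __name__ == "__main__":
    print(demo())
```

      The sketch works across loops only because the home loop outlives each asyncio.run() call, and every cross-loop acquisition pays for a thread hop, which is part of what makes this option tricky.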
      

      Pitfalls

      Performance of both serial and highly concurrent use cases should be benchmarked.
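
      A benchmark for both cases could be structured roughly as follows. This is a sketch only: worker, bench, and run_all are hypothetical names, asyncio.Lock is used purely as a placeholder for whichever candidate lock is under test, and the task and iteration counts are arbitrary.

```python
import asyncio
import time


async def worker(lock, iterations: int) -> None:
    for _ in range(iterations):
        await lock.acquire()
        try:
            await asyncio.sleep(0)  # stand-in for work done under the lock
        finally:
            lock.release()


async def bench(make_lock, n_tasks: int, iterations: int) -> dict:
    lock = make_lock()  # created inside the running loop
    wall0, cpu0 = time.perf_counter(), time.process_time()
    await asyncio.gather(*(worker(lock, iterations) for _ in range(n_tasks)))
    return {
        "tasks": n_tasks,
        "acquisitions": n_tasks * iterations,
        "wall_s": time.perf_counter() - wall0,
        "cpu_s": time.process_time() - cpu0,  # CPU time is the metric of interest here
    }


def run_all(make_lock=asyncio.Lock) -> list[dict]:
    # Serial: one task, no contention. Concurrent: many tasks, heavy contention.
    serial = asyncio.run(bench(make_lock, n_tasks=1, iterations=1000))
    concurrent = asyncio.run(bench(make_lock, n_tasks=100, iterations=10))
    return [serial, concurrent]


if __name__ == "__main__":
    for row in run_all():
        print(row)
```

      Reporting CPU time alongside wall time matters because the busy loop described in the Context section shows up mainly as extra CPU, not as extra latency.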

            Assignee:
            Unassigned
            Reporter:
            Shane Harvey (shane.harvey@mongodb.com)
            Votes:
            0
            Watchers:
            2

              Created:
              Updated: