WiredTiger / WT-12670

Fix concurrencytest to have a single wait thread

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: WT11.3.0, 8.0.0-rc0
    • Affects Version/s: None
    • Component/s: Test Python
    • None
    • Storage Engines

      In test/3rdparty/concurrencytest-0.1.2-locally-modified/concurrencytest.py, we have a locally modified version of a Python package. This package lets us split a test run among multiple processes; that is, it implements the -j option for test/run.py. In WT-8932 we added code to wait for subprocesses, so that we don't leave "defunct" processes hanging around.

      But it looks like there's an error in that change that causes a wait thread to be launched for each child process. Example with 3 child processes:

      • child process 100 is created, wait thread created to wait for 100
      • child process 101 is created, wait thread created to wait for 100,101
      • child process 102 is created, wait thread created to wait for 100,101,102

      Clearly, we only want to create one thread. I believe the error is in these lines:

          # Monitor our children to prevent leaving <defunct> processes around.
          wait_thread = Thread(target = wait_for_children, args = (pids, ))
          wait_thread.start()

      which are in the process/fork loop, and should be after it.
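The intended structure can be sketched as follows. This is a minimal, POSIX-only illustration and not the actual concurrencytest.py code: `fork_workers` is a hypothetical helper, and the body of `wait_for_children` here is an assumed stand-in for the real function of that name. Only the placement of the `Thread(...)` call, after the fork loop rather than inside it, reflects the proposed fix.

```python
import os
from threading import Thread


def wait_for_children(pids):
    """Assumed stand-in: reap each child so none is left as a <defunct> zombie."""
    for pid in pids:
        try:
            os.waitpid(pid, 0)
        except ChildProcessError:
            # The child was already reaped elsewhere; nothing left to do.
            pass


def fork_workers(count):
    """Hypothetical helper: fork `count` children, then start one wait thread."""
    pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child process: a real worker would run its share of the tests here.
            os._exit(0)
        pids.append(pid)

    # Monitor our children to prevent leaving <defunct> processes around.
    # Started once, after the loop, so a single thread waits on the full pid list.
    wait_thread = Thread(target=wait_for_children, args=(pids,))
    wait_thread.start()
    return pids, wait_thread
```

With the thread creation inside the loop, `fork_workers(3)` would instead start three threads, each waiting on a growing prefix of the pid list.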

      Normally, given the way the wait loop is coded, it shouldn't make a difference. However, it's unusual to have potentially many threads calling wait on the same set of processes, and when you do unusual things, you never know what other bugs (say, at the OS layer) might be exposed.

            Assignee: Donald Anderson (donald.anderson@mongodb.com)
            Reporter: Donald Anderson (donald.anderson@mongodb.com)