Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-22219

Use the subprocess32 package on POSIX systems in resmoke.py if it's available

    XMLWordPrintable

    Details

    • Backwards Compatibility:
      Fully Compatible
    • Backport Completed:
    • Steps To Reproduce:
      Hide

      The "invalid free()" issue has only been observed on the Amazon Linux and SUSE 12 build variants. The following script aggressively spawns subprocesses and start threads to asynchronously consume their output exactly as is done in resmoke.py (just without running any actual tests!)

      It fails ~10% of the time on the Amazon Linux build variant and ~20% of the time on SUSE 12.

      #!/usr/bin/env python
       
      from __future__ import absolute_import
       
      import logging
      import os.path
      import sys
      import threading
       
      # Get relative imports to work when the package is not installed on the PYTHONPATH.
      if __name__ == "__main__" and __package__ is None:
          sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
          from buildscripts import resmokelib
       
       
      # Copied from Python 2.7.
      class NullHandler(logging.Handler):
          """
          This handler does nothing. It's intended to be used to avoid the
          "No handlers could be found for logger XXX" one-off warning. This is
          important for library code, which may contain code to log events. If a user
          of the library does not configure logging, the one-off warning might be
          produced; to avoid this, the library developer simply needs to instantiate
          a NullHandler and add it to the top-level logger of the library module or
          package.
          """
          def handle(self, record):
              pass
       
          def emit(self, record):
              pass
       
          def createLock(self):
              self.lock = None
       
       
      class Worker(object):
       
          ITERATIONS = 10000
       
          def __init__(self, worker_num, failure_flag):
              self.name = "worker%d" % (worker_num)
              self.failure_flag = failure_flag
       
              exec_logger = resmokelib.logging.loggers.EXECUTOR
              self.logger = resmokelib.logging.loggers.new_logger(self.name, parent=exec_logger)
       
          def __call__(self):
              i = 0
              while not self.failure_flag.is_set():
                  if i >= Worker.ITERATIONS:
                      break
       
                  # Suppress output of the subprocess because it isn't that interesting; we only care if
                  # an invalid free() occurs.
                  process_logger = resmokelib.logging.loggers.new_logger(self.name)
                  process_logger.addHandler(NullHandler())
       
                  process = resmokelib.core.process.Process(process_logger, args=["echo", "iter%d" % (i)])
                  process.start()
                  exit_code = process.wait()
       
                  if exit_code != 0:
                      self.logger.error("Exited with code %d", exit_code)
                      self.failure_flag.set()
       
                  i += 1
       
       
      def main():
          logging_config = resmokelib.parser._get_logging_config("console")
          resmokelib.logging.config.apply_config(logging_config)
          resmokelib.logging.flush.start_thread()
       
          threads = []
          failure_flag = threading.Event()
       
          for i in xrange(16):
              t = threading.Thread(target=Worker(i, failure_flag))
              # Do not wait for tests to finish executing if interrupted by the user.
              t.daemon = True
              t.start()
              threads.append(t)
       
          for t in threads:
              t.join()
       
          if failure_flag.is_set():
              # Exit with a nonzero return code so it's easy to identify from a task failure in Evergreen
              # whether an invalid free() occurred.
              sys.exit(1)
       
      if __name__ == "__main__":
          main()
      

      Show
      The "invalid free() " issue has only been observed on the Amazon Linux and SUSE 12 build variants. The following script aggressively spawns subprocesses and start threads to asynchronously consume their output exactly as is done in resmoke.py (just without running any actual tests!) It fails ~10% of the time on the Amazon Linux build variant and ~20% of the time on SUSE 12. #!/usr/bin/env python   from __future__ import absolute_import   import logging import os.path import sys import threading   # Get relative imports to work when the package is not installed on the PYTHONPATH. if __name__ = = "__main__" and __package__ is None : sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) from buildscripts import resmokelib     # Copied from Python 2.7. class NullHandler(logging.Handler): """ This handler does nothing. It's intended to be used to avoid the "No handlers could be found for logger XXX" one-off warning. This is important for library code, which may contain code to log events. If a user of the library does not configure logging, the one-off warning might be produced; to avoid this, the library developer simply needs to instantiate a NullHandler and add it to the top-level logger of the library module or package. """ def handle( self , record): pass   def emit( self , record): pass   def createLock( self ): self .lock = None     class Worker( object ):   ITERATIONS = 10000   def __init__( self , worker_num, failure_flag): self .name = "worker%d" % (worker_num) self .failure_flag = failure_flag   exec_logger = resmokelib.logging.loggers.EXECUTOR self .logger = resmokelib.logging.loggers.new_logger( self .name, parent = exec_logger)   def __call__( self ): i = 0 while not self .failure_flag.is_set(): if i > = Worker.ITERATIONS: break   # Suppress output of the subprocess because it isn't that interesting; we only care if # an invalid free() occurs. process_logger = resmokelib.logging.loggers.new_logger( self .name) process_logger.addHandler(NullHandler())   process = resmokelib.core.process.Process(process_logger, args = [ "echo" , "iter%d" % (i)]) process.start() exit_code = process.wait()   if exit_code ! = 0 : self .logger.error( "Exited with code %d" , exit_code) self .failure_flag. set ()   i + = 1     def main(): logging_config = resmokelib.parser._get_logging_config( "console" ) resmokelib.logging.config.apply_config(logging_config) resmokelib.logging.flush.start_thread()   threads = [] failure_flag = threading.Event()   for i in xrange ( 16 ): t = threading.Thread(target = Worker(i, failure_flag)) # Do not wait for tests to finish executing if interrupted by the user. t.daemon = True t.start() threads.append(t)   for t in threads: t.join()   if failure_flag.is_set(): # Exit with a nonzero return code so it's easy to identify from a task failure in Evergreen # whether an invalid free() occurred. sys.exit( 1 )   if __name__ = = "__main__" : main()
    • Sprint:
      Query F (02/01/16)
    • Linked BF Score:
      0

      Description

      The _posixsubprocess C extension module was introduced in CPython 3.2.0 to resolve the issues (e.g. potential to hang) when using the subprocess module from multiple threads concurrently. resmoke.py attempted to work around these thread-safety issues with the subprocess module in Python 2.6 and Python 2.7 by guarding calls to subprocess.Popen() with a lock. However, it turns out that the lock is insufficient for avoiding corruption of Python's internal data structure for thread-local storage in the child process:

      [2016/01/13 06:24:33.816] *** Error in `python': free(): invalid size: 0x00007f0044000b20 ***
      [2016/01/13 06:24:33.818] ======= Backtrace: =========
      [2016/01/13 06:24:33.818] /lib64/libc.so.6(+0x730bf)[0x7f00a54f30bf]
      [2016/01/13 06:24:33.818] /lib64/libc.so.6(+0x7892e)[0x7f00a54f892e]
      [2016/01/13 06:24:33.818] /lib64/libc.so.6(+0x79636)[0x7f00a54f9636]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyThread_ReInitTLS+0x4a)[0x7f00a5b530ca]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyOS_AfterFork+0x10)[0x7f00a5ae5119]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0x10ef86)[0x7f00a5b53f86]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5d1)[0x7f00a5b0c6e1]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x4fe)[0x7f00a5b1329e]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0xc73)[0x7f00a5b0cd83]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x4fe)[0x7f00a5b1329e]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0xb24ab)[0x7f00a5af74ab]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x46)[0x7f00a5af26c6]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0xae5aa)[0x7f00a5af35aa]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x46)[0x7f00a5af26c6]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0xbe7f9)[0x7f00a5b037f9]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0xbde2a)[0x7f00a5b02e2a]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x46)[0x7f00a5af26c6]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x144e)[0x7f00a5b0d55e]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x221)[0x7f00a5b12fc1]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0xc73)[0x7f00a5b0cd83]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x221)[0x7f00a5b12fc1]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0xb24ab)[0x7f00a5af74ab]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x46)[0x7f00a5af26c6]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0xae5aa)[0x7f00a5af35aa]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x46)[0x7f00a5af26c6]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0x6e792)[0x7f00a5ab3792]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x46)[0x7f00a5af26c6]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x21f5)[0x7f00a5b0e305]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x29cc)[0x7f00a5b0eadc]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x29cc)[0x7f00a5b0eadc]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x221)[0x7f00a5b12fc1]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0xb222f)[0x7f00a5af722f]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x46)[0x7f00a5af26c6]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(+0xae5aa)[0x7f00a5af35aa]
      [2016/01/13 06:24:33.818] /usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x46)[0x7f00a5af26c6]
      [2016/01/13 06:24:33.819] /usr/lib64/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x30)[0x7f00a5b0bc40]
      [2016/01/13 06:24:33.819] /usr/lib64/libpython2.7.so.1.0(+0x9f25e)[0x7f00a5ae425e]
      [2016/01/13 06:24:33.819] /lib64/libpthread.so.0(+0x80a4)[0x7f00a58300a4]
      [2016/01/13 06:24:33.819] /lib64/libc.so.6(clone+0x6d)[0x7f00a55657fd]
      

      The actual error message can vary, but the invalid free() triggers a SIGABRT and causes the child process to exit with a return code of -6:

      • "python: free(): invalid pointer"
      • "python: double free or corruption"
      • "python: munmap_chunk(): invalid pointer"

      The version of the subprocess module in Python 3.2 is available as a separate package called subprocess32. Using it will resolve the memory corruption issue because the call to PyOS_AfterFork() (and thus the call to PyThread_ReInitTLS()) is skipped when preexec_fn is None.

        Attachments

          Activity

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: