Python Driver / PYTHON-1442

PyMongo can segfault with mod_wsgi httpd reloads

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Minor - P4
    • Affects Version/s: 3.3.1
    • Component/s: None
    • Labels:
      None
    • Environment:
      Python 2.7.13 on Fedora 25
      mongoengine==0.10.6
      pymongo==3.3.1
      httpd-2.4.27-3.fc25.x86_64
      mod_wsgi-4.5.15-1.fc25.x86_64

      We've investigated a reproducible segfault in our application, which uses PyMongo. The segfault occurs only when the webserver is under load and httpd reloads are issued. A reload should always be safe, but occasionally it segfaults.

      To reproduce with PyMongo alone, read and write heavily so that time is spent in the background thread, then have mod_wsgi reload over and over.

      One of the coredumps from our application shows that:
      1) Python terminated via segfault
      2) The segfault was in _PyTrash_thread_destroy_chain () in Thread 1
      3) Thread 1 is the PyMongo daemon thread, which has this Python backtrace:

      Traceback (most recent call first):
        File "/usr/lib64/python2.7/site-packages/pymongo/periodic_executor.py", line 103, in _run
          time.sleep(self._min_interval)
        File "/usr/lib64/python2.7/threading.py", line 757, in run
          self.__target(*self.__args, **self.__kwargs)
        File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
          self.run()
        File "/usr/lib64/python2.7/threading.py", line 777, in __bootstrap
          self.__bootstrap_inner()
      

      Note that we had many coredumps that showed the same _PyTrash segfault in another of our dependencies, gofer. This fix resolved all of those issues: https://github.com/jortel/gofer/pull/78

      I have the coredump I'm referring to saved, but it's 250 MB and I can't upload it here due to upload limits. I did attach the "t a a bt", "t a a py-bt", and the "coredump_gdb_startup_output" attachments as text files.

      The investigation of our application is done here: https://pulp.plan.io/issues/3129#note-28
      We also filed an issue against upstream mod_wsgi, who identified the cause: we were leaving daemon threads to die on their own, which can segfault. See the upstream mod_wsgi discussion here: https://github.com/GrahamDumpleton/mod_wsgi/issues/250
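
      For illustration, here is a minimal sketch of the general pattern of stopping and joining a background thread explicitly instead of leaving a daemon thread to die at interpreter teardown. It uses only the standard library. PeriodicWorker and its methods are hypothetical names, not PyMongo's periodic_executor API or the gofer fix, and the sketch assumes the hosting process runs atexit handlers when the interpreter shuts down.

```python
import atexit
import threading


class PeriodicWorker(object):
    """Hypothetical worker that calls `target` every `interval` seconds."""

    def __init__(self, interval, target):
        self._interval = interval
        self._target = target
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run)
        # Still a daemon thread, so a missed stop() cannot hang shutdown.
        self._thread.daemon = True

    def start(self):
        self._thread.start()

    def _run(self):
        # Event.wait() returns True as soon as stop() sets the event, so the
        # loop exits promptly instead of sleeping through a full interval.
        while not self._stop_event.wait(self._interval):
            self._target()

    def stop(self, timeout=None):
        """Ask the thread to exit and wait for it; True if it has stopped."""
        self._stop_event.set()
        if self._thread.is_alive():
            self._thread.join(timeout)
        return not self._thread.is_alive()


# Usage: start the worker, then make sure it is joined before teardown.
ticks = []
worker = PeriodicWorker(0.01, lambda: ticks.append(1))
worker.start()
atexit.register(worker.stop, 5.0)  # 5.0 is passed to stop() as the timeout
```

      The point of the sketch is that the thread is no longer running Python code when the interpreter is destroyed. Whether this is enough under mod_wsgi depends on how the reload tears down the interpreter.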

        1. coredump_gdb_startup_info.txt
          2 kB
        2. coredump_t_a_a_bt.txt
          15 kB
        3. coredump_t_a_a_py-bt.txt
          2 kB

            Assignee:
            jesse@mongodb.com A. Jesse Jiryu Davis
            Reporter:
            bmbouter Brian Bouterse
            Votes:
            0
            Watchers:
            4
