Python Driver / PYTHON-961

initializing multiple connections from multiprocessing threads causes database connections to fail to be created

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.1
    • Affects Version/s: 3.0.2
    • Component/s: None
    • Labels:
    • Environment:
      Mac OS X, MacBook Pro.

      I wrote a blog post on this, figuring it was a feature of the new mongo driver rather than a bug, but apparently that is not so. (Thanks, A. Jesse Jiryu Davis.) I don't have much more to contribute to the issue, but if there's value in mentioning it here, I'm happy to contribute. Here is my post, slightly reformatted:

      I was running 18 processes simultaneously, at least 9 of which require some form of interaction with MongoDB, and discovered that I was frequently getting this error:

      File "something.py", line 177, in flush
      File "/Users/afejes/sandboxes/pipeline4/lib/python2.7/site-packages/pymongo/bulk.py", line 582, in execute
        return self.__bulk.execute(write_concern)
      File "/Users/afejes/sandboxes/pipeline4/lib/python2.7/site-packages/pymongo/bulk.py", line 430, in execute
        with client._socket_for_writes() as sock_info:
      File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
        return self.gen.next()
      File "/Users/afejes/sandboxes/pipeline4/lib/python2.7/site-packages/pymongo/mongo_client.py", line 663, in _get_socket
        server = self._get_topology().select_server(selector)
      File "/Users/afejes/sandboxes/pipeline4/lib/python2.7/site-packages/pymongo/topology.py", line 121, in select_server
      File "/Users/afejes/sandboxes/pipeline4/lib/python2.7/site-packages/pymongo/topology.py", line 97, in select_servers
      ServerSelectionTimeoutError: No servers found yet

      PyMongo 3.0.2 appears to have changed its initialization compared to 2.8.x: the driver no longer actually creates the connection pool when you initialize it. You say:

      mongo = MongoWrapper()

      and it goes off and does a non-blocking initialization of everything pymongo needs to connect to the server. All is good.

      However, if you’re doing multiprocessing, the temptation is to allow each of your worker processes to launch a new instance of MongoWrapper. Indeed, I’ve done that before with the 2.8.x series of pymongo, and it worked well. In this case, pymongo 3.0.2 REALLY doesn’t like it, and you’ll get the “No servers found yet” error when you try to retrieve results from your database. Oddly enough, it's especially hard to track down because of the serverSelectionTimeoutMS parameter.

      If you don’t set it, the default value is 30 seconds, which means that once the driver finds no usable server, your application sits there for 30 seconds waiting to see whether the mongo database will become reachable. When it finally does fail, you’ll get the error above… 30 seconds after your database went down. That’s cool… except when the issue is actually not related to the database going down.
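A minimal sketch of how the wait could be shortened while debugging: lower serverSelectionTimeoutMS and force a round trip right after creating the client. The URI mongodb://localhost:27017 and the 2-second timeout are assumptions, and fail_fast_options and check_connection are hypothetical helper names, not part of pymongo. This only makes the error surface sooner; it does not address the cause of the failure.

```python
def fail_fast_options(timeout_ms=2000):
    """Keyword arguments for MongoClient that shorten server selection."""
    # serverSelectionTimeoutMS defaults to 30000 (30 seconds) when omitted.
    return {"serverSelectionTimeoutMS": timeout_ms}


def check_connection(uri="mongodb://localhost:27017", timeout_ms=2000):
    """Return True if a server answers within timeout_ms, else False."""
    # Imported here so fail_fast_options() works without pymongo installed.
    from pymongo import MongoClient
    from pymongo.errors import ServerSelectionTimeoutError

    # Constructing the client returns immediately (non-blocking setup).
    client = MongoClient(uri, **fail_fast_options(timeout_ms))
    try:
        # A cheap command forces server selection, so a connection
        # problem is reported now rather than on the first real query.
        client.admin.command("ping")
        return True
    except ServerSelectionTimeoutError:
        return False
    finally:
        client.close()
```

Calling check_connection() once at startup would report an unreachable server after about two seconds instead of thirty.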

      In my case, I solved the issue by ensuring that the worker processes do not each initialize a new instance of MongoWrapper. Instead, the parent process creates one instance of MongoWrapper and passes it as a parameter to the worker processes. Tada! The error disappears, and your program starts to run, instead of failing and waiting 30 seconds to tell you.
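The shape of that workaround can be sketched roughly as follows. MongoWrapper is the reporter's own class and is not shown in this report, so FakeWrapper is a hypothetical stand-in that holds no real connection; worker and run_workers are likewise illustrative names. The sketch assumes a Unix "fork" start method, as on the reporter's Python 2.7 on OS X, and is written for Python 3.4+. It shows only how one instance is built in the parent and handed to each process, not whether sharing a live client across fork is safe for a given driver version.

```python
import multiprocessing
import os


class FakeWrapper(object):
    """Hypothetical stand-in for MongoWrapper; holds no real connection."""

    def __init__(self):
        # Record which process built the instance.
        self.created_in = os.getpid()


def worker(wrapper, results):
    # The worker reuses the instance it was handed instead of
    # constructing its own wrapper.
    results.put((os.getpid(), wrapper.created_in))


def run_workers(count):
    # Assumption: the "fork" start method is available (Unix only).
    ctx = multiprocessing.get_context("fork")
    wrapper = FakeWrapper()  # created once, in the parent process
    results = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(wrapper, results))
             for _ in range(count)]
    for proc in procs:
        proc.start()
    # Collect one result per worker before joining.
    collected = [results.get(timeout=10) for _ in procs]
    for proc in procs:
        proc.join()
    return wrapper.created_in, collected
```

Each tuple returned by run_workers pairs a worker's own process ID with the ID of the process that built the wrapper, which makes it easy to confirm that no worker constructed its own instance.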

            Assignee: Anna Herlihy (Inactive)
            Reporter: Anthony Fejes