Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: 2.6.3
Component/s: None
UPDATE: Although the presentation of this issue was new to us, it's a known bug in PyMongo before 2.7. I didn't realize I'd fixed it, along with a large class of similar bugs related to replica set reconnection, when I rewrote MongoClient for PYTHON-487.
PyMongo 2's rapidly obsolescing MongoClient can get stuck trying to authenticate to a recovering member, even if a primary is available. There are a few ways this can happen, all intricate. The particular case in which this was reported was:
1. Replica set with a primary "A" and a resyncing member "B"
2. MongoClient started with connection string "A,B" and no "replicaSet" keyword (also note, not PyMongo 2's MongoReplicaSetClient)
3. MongoClient.database.authenticate("user", "password") succeeds against the primary
4. An operation ("find_one" or whatever) fails against the primary with a network error
5. On the next operation, MongoClient attempts rediscovery by calling "ismaster" on A and B again. Since it has cached the "user" / "password" credentials, unfortunately, it attempts authentication against each node as it connects.
6. When it tries to reach host "B", B is resyncing and doesn't have the user's record yet, so auth fails.
7. MongoClient throws OperationFailure("auth fails"), and continues to do so even after the primary becomes available again.
This can be reproduced with MockupDB. First "pip install git+git://github.com/ajdavis/mongo-mockup-db.git". MockupDB requires PyMongo 3, so run it in a separate virtualenv. Start a mock replica set with two members:
    from time import sleep

    from mockupdb import MockupDB, OpQuery

    # Parenthesized tuple so the comprehension parses on Python 2 and 3.
    primary, recovering = servers = [MockupDB(port) for port in (2000, 2001)]
    for server in servers:
        server.verbose = True
        server.run()

    hosts = [server.address_string for server in servers]
    primary.autoresponds(
        'ismaster', ismaster=True, setName='rs', hosts=hosts)
    recovering.autoresponds(
        'ismaster', ismaster=False, secondary=False, setName='rs', hosts=hosts)

    # Recovering member hasn't replicated user records yet: nonce ok, auth fails.
    recovering.autoresponds('getnonce', nonce='abcd')
    recovering.autoresponds('authenticate', ok=0, code=18, errmsg='auth fails')

    # Initial auth succeeds on primary, next op fails with network error.
    primary.receives('getnonce', timeout=100).ok(nonce='abcd')
    primary.receives('authenticate').ok()
    primary.receives().hangup()

    # Primary returns but MongoClient won't use it.
    primary.autoresponds(OpQuery, {})
    primary.autoresponds('getnonce', nonce='abcd')
    primary.autoresponds('authenticate')

    # Wait for Ctrl-C.
    sleep(1000)
Then connect a client with PyMongo 2 (this was tested with 2.6.3, but all recent PyMongo 2 versions will act the same):
    import traceback
    from time import sleep

    from pymongo import MongoClient

    client = MongoClient('localhost:2000,localhost:2001')
    client.db.authenticate('user', 'password')

    for _ in range(15):
        try:
            print client.db.collection.find_one()
        except:
            traceback.print_exc()
        sleep(0.5)
The client's initial auth succeeds, then each find_one fails with an OperationFailure and characteristic traceback:
    OperationFailure: command SON([('authenticate', 1), ('user', u'user'), ('nonce', u'abcd'), ('key', u'3cb54e6d2ddc126d9fb2445b068020ab')]) failed: auth fails
    Traceback (most recent call last):
      File "CS-20300-client.py", line 11, in <module>
        print client.db.collection.find_one()
      File "pymongo/collection.py", line 604, in find_one
        for result in self.find(spec_or_id, *args, **kwargs).limit(-1):
      File "pymongo/cursor.py", line 904, in next
        if len(self.__data) or self._refresh():
      File "pymongo/cursor.py", line 848, in _refresh
        self.__uuid_subtype))
      File "pymongo/cursor.py", line 782, in __send_message
        res = client._send_message_with_response(message, **kwargs)
      File "pymongo/mongo_client.py", line 1038, in _send_message_with_response
        sock_info = self.__socket()
      File "pymongo/mongo_client.py", line 777, in __socket
        self.__check_auth(sock_info)
      File "pymongo/mongo_client.py", line 469, in __check_auth
        sock_info, self.__simple_command)
      File "pymongo/auth.py", line 214, in authenticate
        auth_func(credentials[1:], sock_info, cmd_func)
      File "pymongo/auth.py", line 194, in _authenticate_mongo_cr
        cmd_func(sock_info, source, query)
      File "pymongo/mongo_client.py", line 607, in __simple_command
        helpers._check_command_response(response, None, msg)
      File "pymongo/helpers.py", line 147, in _check_command_response
        raise OperationFailure(msg % errmsg, code)
PyMongo 3's MongoClient, on the other hand, behaves as designed: it throws AutoReconnect once, then successfully reconnects to the primary.
This shows two bugs in PyMongo 2's MongoClient. First, it shouldn't attempt auth against a member in an unknown state during reconnection. PyMongo 3's MongoClient does not. Second, if there is any failure while rediscovering the state of a member, MongoClient shouldn't stay pinned to that member's host and port afterward. Again, this is fixed in PyMongo 3.
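To illustrate the first point, here is a minimal sketch of checking a member's state before sending credentials. It is not PyMongo 3's implementation. The function name "connect_member" and the "authenticate" callable are hypothetical, and the response dict is shaped like the mock "ismaster" replies in the repro above. It relies on the server answering "ismaster" without authentication.

```python
def connect_member(ismaster_response, authenticate):
    """Authenticate only if the ismaster response shows a usable member.

    `ismaster_response` is a dict shaped like the mock replies above.
    `authenticate` is a hypothetical zero-argument callable that sends the
    cached credentials to this member.
    """
    is_usable = bool(ismaster_response.get('ismaster')
                     or ismaster_response.get('secondary'))
    if not is_usable:
        # Recovering, resyncing, or otherwise unknown: skip auth entirely,
        # so a missing user record on this member cannot raise an error.
        return False
    authenticate()
    return True
```

With the recovering member's reply (ismaster=False, secondary=False) the callable is never invoked, so the "auth fails" response from B is never triggered.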
This bug has existed in PyMongo 2 for as long as PyMongo has supported authentication.
A possible solution is to update this code in PyMongo 2's MongoClient.__find_node:
    for candidate in candidates:
        try:
            node, ismaster, isdbgrid, res_time = self.__try_node(candidate)
            # ... snip ...
            return node
        except OperationFailure:
            # The server is available but something failed, probably auth.
            raise
        except Exception, why:
            errors.append(str(why))
This code was written assuming that an auth failure is a permanent and global condition that should be raised at once, rather than transient and particular to the RS member being tried, the way a network error might be. MongoClient might instead treat OperationFailure as it does other exceptions: keep trying more nodes.
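The difference between the two strategies can be sketched with a small self-contained model. This is not PyMongo's code. The exception classes are stand-ins for those in pymongo.errors, and "try_node", "find_node_strict", "find_node_lenient", and the "members" dict are hypothetical names. The model also assumes the recovering member happens to be tried before the primary.

```python
# Stand-ins for pymongo.errors, so the sketch runs without PyMongo installed.
class OperationFailure(Exception):
    pass


class AutoReconnect(Exception):
    pass


def try_node(candidate, members):
    """Model one connection attempt: reach the member, then auth with cached creds."""
    state = members[candidate]
    if state == 'down':
        raise IOError('%s: connection refused' % candidate)
    if state == 'recovering':
        # Reachable, but the user's record has not been replicated yet.
        raise OperationFailure('%s: auth fails' % candidate)
    return candidate


def find_node_strict(candidates, members):
    """Current behavior: an auth failure on any member is raised at once."""
    errors = []
    for candidate in candidates:
        try:
            return try_node(candidate, members)
        except OperationFailure:
            raise
        except Exception as why:
            errors.append(str(why))
    raise AutoReconnect(', '.join(errors))


def find_node_lenient(candidates, members):
    """Proposed behavior: treat OperationFailure like any other per-member error."""
    errors = []
    for candidate in candidates:
        try:
            return try_node(candidate, members)
        except Exception as why:
            errors.append(str(why))
    raise AutoReconnect(', '.join(errors))


members = {'A': 'primary', 'B': 'recovering'}
try:
    find_node_strict(['B', 'A'], members)
except OperationFailure as exc:
    print('strict:', exc)
print('lenient:', find_node_lenient(['B', 'A'], members))
```

In this model the strict variant never reaches A, while the lenient one skips B and returns A. When no member is usable, the lenient variant raises AutoReconnect rather than OperationFailure.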
I've briefly tested this and it fixes this bug, at the cost of backward compatibility: if we make the change, auth errors from MongoClient's constructor will raise AutoReconnect instead of the expected OperationFailure.
Best in my opinion to leave PyMongo as-is and encourage users to upgrade.
Related to: PYTHON-918 Auth err from resyncing member prevents primary discovery (Closed)