[CXX-912] The logic of func DBClientReplicaSet::isFailed() in dbclient_rs.h Created: 18/May/16  Updated: 25/May/16  Resolved: 25/May/16

Status: Closed
Project: C++ Driver
Component/s: None
Affects Version/s: legacy-1.1.0, legacy-1.1.1
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: ethan zhang Assignee: Andrew Morrow (Inactive)
Resolution: Done Votes: 0
Labels: legacy-cxx
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

Hi! I am using DBClientReplicaSet to querying a mongodb replicaset, and I found a bug in this class. In my project, I use client connection as a long keep alive connection. And everytime I reuse this client again, I must check this client whether it is ok. So I use func DBClientReplicaSet::isFailed() to check the client whether it is ok or not. But I found it always return true. I check the code below in dbclient_rs.h

dbclient_rs.h

    virtual bool isFailed() const {
        return !_master || _master->isFailed();
    }   

In my project, I always query mongodb with the secondary node using the method DBClientReplicaSet::slaveConn(), so the client did not have the chance to init the _master member.
I think in method isFailed(), not only must we check the _master member, but also the _lastSlaveOkConn member. If one of the two member's status is OK, the method should return true.
If you think it indeed a bug, I will commit a bug fix later.

thanks.



 Comments   
Comment by Andrew Morrow (Inactive) [ 24/May/16 ]

Hi -

  • If you want to communicate with a secondary, the correct way to do so is to set a read preference in your Query by passing it an object of type mongo::ReadPreference. This will cause DBClientReplicaSet to route your query appropriately, since not all commands are actually valid to run on a secondary. None of the above requires that you use the DBClientReplicaSet interface. The DBClientReplicaSet type should be considered an internal implementation detail of the driver.
  • The 26compat driver (which is even older than the legacy driver) does have a connection pool, and it uses isFailed when deciding whether a connection is valid while returning to the pool. In fact, my expectation is that the isFailed method was implemented as it is to explicitly support the semantics of the old connection pool, whatever those were. That said, do you actually need a global pool? Do you ever need more than one connection in any given thread? Perhaps you would do better to use per-thread connections via a thread-specific variable.
  • Sure, if isFailed has the semantics you want, then use it. But my assertion is that your observed effect, that _master is never set so the function always returns true, is a side effect of the fact that you are mis-using the class by directly manipulating it. If, instead, you achieve your intended goal via the DBClientBase interface and use the read preference facility as designed, the implementation details of DBClientReplicaSet and the ReplicaSetMonitor will render the isFailed method meaningful.

Please understand that the Legacy C++ driver is just that - legacy. It is headed for end-of-life in the near future. It is very unlikely to support any new MongoDB features after those features incorporated in the lergacy-1.1.0 release to support MongoDB 3.2. The arguments in favor of making any change must be extraordinarily compelling: the driver has a wide install base that relies on its existing semantics (whatever those are and however broken they may be), so changing those semantics in a stable release is very risky, and we also have new stable drivers in C and C++11 that offer much better APIs with unified semantics and high quality implementations. The legacy C++ driver has none of these things.

If you want a stable driver with proper semantics, continued support, and which will reflect ongoing development in MongoDB server releases, you should use either the MongoDB C driver, or the new C++11 driver.

Comment by ethan zhang [ 24/May/16 ]
  • First, Only use the API exposed as DBClientBase is not enough. For example, the slaveConn() method in the DBClientReplicaSet class. If we want to query a mongodb replicaset cluster through a secondary, We must use slaveConn() method which was not in the DBClientBase API. Besides, I think all public method in the DBClientReplicaSet class should have be Open API.
  • Second, I use isFailed() because I have try writing a connection pool in my project. Every time we release the connection back to the connection pool, we must check the connection is still OK, so I called the isFailed() method. If the connection is a DBClientBase instance, everything is OK. But if the connection is a DBClientReplicaSet instance, sometimes it goes wrong. Because isFailed() implemented by DBClientReplicaSet only check the connection to the master node in Mongodb replica set. However, in DBClientReplicaSet instance there have two member of connections, one was a connection to the master node, the other is a connection to the secondary node. So, if the DBClientReplicaSet instance only have a connection to the secondary node, it will failed, but as a fact, the connection is still OK. Would you please read the driver code again, I think it will be more clear.
  • Third, in your opinion above, you suggest using isStillConnected() instead. But again read the code, you will find that isStillConnected() method will retry connect the connection which the DBClientReplicaSet instance handled, and isFailed() only check the connection handled by the DBClientReplicaSet instance is still OK. That is what we needed. We only need to check the connection is still OK, but not reconnection it.

Andrew, If you really want to close this issue, please do have someone else have read this issue first, Thankx.

Comment by Andrew Morrow (Inactive) [ 22/May/16 ]

Several comments:

  • You should not be making direct calls to the concrete class DBClientReplicaSet. Instead, just use the API exposed as DBClientBase. Internally, the DBClientConnection and DBClientReplicaSet classes will do the right things.
  • You haven't really explained why you need to call isFailed. What question are you trying to answer or state are you trying to discover? If the driver is functioning properly, and you are connected to a replica set, your DBClientBase object will automatically follow the master, and it will properly route operations based on read preference. It is important to note that individual operations may fail, and the driver does not retry.
  • It is also important to keep in mind that since we are working with a distributed system, even if you call a function to observe the state now (like isFailed or isConnected), you can't rely on that state still being true when you try to use the connection. In other words:

auto conn = connStr.connect(...);
if (conn->isStillConnected()) {
    // This still might fail - we could have become non-connected after checking but before running the command.
    conn->runCommand(...);
}

Alternatively, writing it with isFailed:

auto conn = connStr.connect(...);
if (!conn->isFailed()) {
    // This still might fail - we could have become failed after checking but before running the command.
    conn->runCommand(...);
}

From the perspective of the driver maintainer, there is no documentation for the isFailed function, so its intended semantics are unclear. It is possible that the implementation for DBClientReplicaSet is in fact broken. However, I'm reluctant to change it at this point - the driver is stable, and whatever semantics that function has had, it has had for a long time. Changing it risks breaking those semantics in the case that the function is being used in the wild, no matter how odd the semantics or misguided such usage may be.

So, I'm inclined to close this ticket as Wont Fix, but please let me know if you have any additional comments or questions. If you do disagree with the above analysis, it would be very helpful if you could post some code showing how you are connecting to the server, and how and why you are using isFailed.

Comment by ethan zhang [ 20/May/16 ]

Hi, Andrew.
I use old legacy driver because many many project in my team using the legacy-26compat driver. Recently, we are starting to update out mongodb version up to 3.0, so we have made some test with legacy driver which compatible with the 26compat driver.
The way you mentation to connect to a replica set was just I used in my test code.
My only question is the isFailed() method in the class DBClientReplicaSet,which will always return false even if the client has connected with the a secondary mongodb node.
So, I think it is a bug when using this method. It only check the connection with the primary node but missing the secondary node.

Is it not welcomed to fix some some bugs with the old legacy driver in the mongodb community? I wish to fix the bugs in the driver, so that other develop will not be face the same question.

My English written is not so much, So Which part is not clear please point to me~

Comment by Andrew Morrow (Inactive) [ 18/May/16 ]

You should not be making direct use of the DBClientReplicaSet class. Instead, the correct way to connect to a replica set is via the following procedure:

    const std::string URI = // some mongodb URI
    ConnectionString connStr = ConnectionString::parse(URI);
    std::string errorMessage;
    DBClientBase* conn = connStr.connect(&errorMessage);

Could you please try connecting this way, via ConnectionString::parse and ConnectionString::connect?

Also, is there a particular reason you are using the old legacy driver? I can't recommend strongly enough that you switch to the new C++11 driver if at all possible. It offers a greatly improved API and much higher quality of implementation.

Generated at Wed Feb 07 22:00:46 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.