[SERVER-5541] crash in C++ client driver while shutting down primary mongo server from repset Created: 07/Apr/12 Updated: 11/Jul/16 Resolved: 11/Apr/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Client |
| Affects Version/s: | 2.0.2, 2.0.4 |
| Fix Version/s: | 2.0.5, 2.1.1 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Alexander Borodetsky | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
CentOS 6 |
||
| Operating System: | Linux |
| Participants: |
| Description |
|
I was testing the stability of my client, which uses a mongo server, and the client crashed in the mongo C++ client driver at the moment the primary mongo server was shut down. Here is the stack trace of the crash: Guide to reproduce this: Make a replicaSet configuration like this: , , { "_id" : 2, "host" : "10.68.11.138:27018", "votes" : 0, "priority" : 0 } ] Shut down servers #1 and #2 (which have no votes). Then shut down the last standing primary, #0, and immediately make a request from the client. Could it be because I use the old C++ client 2.0.2-pre with server 2.0.4? |
| Comments |
| Comment by auto [ 17/Apr/12 ] |
|
Author: Randolph Tan (randolph@10gen.com) Message: |
| Comment by Randolph Tan [ 17/Apr/12 ] |
|
You're welcome. |
| Comment by Alexander Borodetsky [ 17/Apr/12 ] |
|
Thanks a lot for your fix and assistance. |
| Comment by Randolph Tan [ 11/Apr/12 ] |
|
Hi, The new commit now returns whatever the query method returns, for consistency. So your client code should be prepared to handle null pointers, as you would when using DBClientConnection. |
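As a minimal sketch of what "prepared to handle null pointers" could look like on the client side: the types and functions below (`FakeCursor`, `fakeQuery`, `countDocs`) are hypothetical stand-ins, not the driver's API, and `std::unique_ptr` is used in place of the `auto_ptr` mentioned in this ticket. The only point is that the cursor is checked before it is dereferenced.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a driver cursor; NOT the real DBClientCursor.
struct FakeCursor {
    std::vector<std::string> docs;
    std::size_t pos = 0;
    bool more() const { return pos < docs.size(); }
    const std::string& next() { return docs[pos++]; }
};

// Hypothetical stand-in for a query call that may hand back a null
// cursor when the server cannot be reached.
std::unique_ptr<FakeCursor> fakeQuery(bool serverReachable) {
    if (!serverReachable) {
        return nullptr;
    }
    std::unique_ptr<FakeCursor> cursor(new FakeCursor());
    cursor->docs.push_back("{ a: 1 }");
    cursor->docs.push_back("{ a: 2 }");
    return cursor;
}

// Checks the cursor before touching it and turns a null cursor into an
// ordinary exception that the caller can catch.
std::size_t countDocs(bool serverReachable) {
    std::unique_ptr<FakeCursor> cursor = fakeQuery(serverReachable);
    if (!cursor) {
        throw std::runtime_error("query returned a null cursor");
    }
    std::size_t n = 0;
    while (cursor->more()) {
        cursor->next();
        ++n;
    }
    return n;
}
```

With the real driver the same idea would presumably be a null check on the returned cursor pointer before the first call on it; whether to throw, retry, or return an error is a choice for the client code.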
| Comment by auto [ 11/Apr/12 ] |
|
Author: Randolph Tan (randolph@10gen.com) Message: |
| Comment by Alexander Borodetsky [ 09/Apr/12 ] |
|
The assert didn't solve this issue properly. In a debug build, this fix leads to calling abort() on Windows or raise(SIGTRAP) on Linux, and my server stops abnormally. But it also leads to DBClientReplicaSet::query returning a NULL cursor without throwing any exception. This happens because checkMaster()->query() is called after three attempts at calling checkSlaveQueryResult(), and checkMaster()->query() returns a NULL cursor too. So my question is: is it correct for DBClientReplicaSet::query to return a NULL cursor? |
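The control flow described in this comment can be sketched roughly as follows. This is an illustration built only from the description above, not the driver's source: `FakeCursor`, `CursorPtr`, `QueryFn`, and `queryWithFallback` are hypothetical names, and treating a null secondary result as a failed attempt is an assumption.

```cpp
#include <exception>
#include <functional>
#include <memory>

// Hypothetical stand-in for a driver cursor; NOT the real DBClientCursor.
struct FakeCursor {};

typedef std::unique_ptr<FakeCursor> CursorPtr;
typedef std::function<CursorPtr()> QueryFn;

// Tries the secondary up to maxAttempts times, then falls back to the
// primary and returns whatever the primary returns, which may be null.
CursorPtr queryWithFallback(const QueryFn& querySecondary,
                            const QueryFn& queryPrimary,
                            int maxAttempts = 3) {
    for (int i = 0; i < maxAttempts; ++i) {
        try {
            CursorPtr cursor = querySecondary();
            if (cursor) {
                return cursor;
            }
            // Assumption: a null cursor counts as a failed attempt.
        } catch (const std::exception&) {
            // A throwing attempt also counts as failed; try again.
        }
    }
    // No exception is raised here, so a null cursor from the primary
    // propagates straight to the caller.
    return queryPrimary();
}
```

If every secondary attempt fails and the primary is also down, the caller receives a null cursor instead of an exception, which is the behaviour being asked about.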
| Comment by auto [ 09/Apr/12 ] |
|
Author: Eliot Horowitz (erh, eliot@10gen.com) Message: assert that we have a cursor rather than segv |
| Comment by Eliot Horowitz (Inactive) [ 08/Apr/12 ] |
|
pushed a possible fix to 2.0 |
| Comment by auto [ 08/Apr/12 ] |
|
Author: Eliot Horowitz (erh, eliot@10gen.com) Message: assert that we have a cursor rather than segv |
| Comment by Alexander Borodetsky [ 07/Apr/12 ] |
|
The crash is caused by the following: DBClientReplicaSet::checkSlaveQueryResult( auto_ptr<DBClientCursor> result ) is called with a NULL value for the argument "result". Here is the call stack: ...}) Line 798 + 0xc bytes C++ , int nToReturn=150, int nToSkip=0, const mongo::BSONObj * fieldsToReturn=0x00000000, int queryOptions=4, int batchSize=0) Line 753 + 0xa6 bytes C++ |
| Comment by Alexander Borodetsky [ 07/Apr/12 ] |
|
Any other gdb command returns a "no debug info" error |
| Comment by Alexander Borodetsky [ 07/Apr/12 ] |
|
Unfortunately, I have some trouble with symbols on my build system. So I have only the backtrace (in the main description) and this info: (gdb) info frame Maybe you have some ideas about gdb? |
| Comment by Alexander Borodetsky [ 07/Apr/12 ] |
|
Yes. I'm trying to do it now. Wait a minute. I have no symbols on my test station. I need to transfer the core dump to the build station. |
| Comment by Eliot Horowitz (Inactive) [ 07/Apr/12 ] |
|
Can you run in gdb so you can get line number of seg fault? |
| Comment by Alexander Borodetsky [ 07/Apr/12 ] |
|
I'm catching exceptions. But only std::exception, not SIGSEGV )) |
| Comment by Alexander Borodetsky [ 07/Apr/12 ] |
|
Program received signal SIGSEGV, Segmentation fault. But if I wait a few seconds before making the request, the client handles it (the server shutting down) with an exception (I suppose that is the correct behaviour ) ) |
| Comment by Eliot Horowitz (Inactive) [ 07/Apr/12 ] |
|
Was it a crash or just an exception being thrown? |
| Comment by Alexander Borodetsky [ 07/Apr/12 ] |
|
I got the C++ client driver 2.0.4 source code and rebuilt my solution (the client, from mongodb's point of view) with it. |
| Comment by Alexander Borodetsky [ 07/Apr/12 ] |
|
Note: it's important to make the request immediately after shutting down the server. Otherwise (if you are late), the shutdown of the last server will be handled by the client with a correct exception