[CDRIVER-432] Memory leak every ~100 connections Created: 27/Sep/14 Updated: 08/Jan/24 Resolved: 01/Oct/14 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 1.0.2 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | ech0s7r | Assignee: | Adam Midvidy |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
OS X 10.9.4 and Linux |
||
| Description |
|
I slightly modified example-client.c to make repeated connections to MongoDB, and I found these two major problems: 1) a memory leak of 4 KB every ~100 connections. I am very concerned about problem number 1, because the process starts with a memory usage of about 800 KB and, within seconds, grows past 2 MB! The modified example-client.c is attached. Thanks |
| Comments |
| Comment by Denis Gladkikh [ 01/Oct/14 ] |
|
Adam, thank you again for the explanation. |
| Comment by Adam Midvidy [ 01/Oct/14 ] |
|
Yes, with the usleep in place the failure is less deterministic because the OS has additional time to clean up connection state between connection attempts. If you make the sleeps long enough, the OS will have sufficient time to clean up, and the failures should not happen at all. To go into even more detail: after you close the socket, the OS puts it into the TIME_WAIT state. If the OS were to make the socket available for a new connection immediately, there would be a risk of receiving packets that were meant for a prior connection, which could cause errors. This is why SO_REUSEADDR is generally a bad idea. |
| Comment by Adam Midvidy [ 01/Oct/14 ] |
|
No problem, glad to help. You can watch |
| Comment by Denis Gladkikh [ 01/Oct/14 ] |
|
Adam, actually I have one more question: do you know how it is possible that, for example, after 1000 connections, connection #1001 fails and connection #1002 succeeds? Does it mean that the OS can do some cleanup work between 1001 and 1002, or does the driver do that? |
| Comment by Denis Gladkikh [ 01/Oct/14 ] |
|
Adam, thank you for the explanation. I just wanted to understand the root cause of this issue. Glad to hear that we should not see this issue if we keep connections around. |
| Comment by Adam Midvidy [ 01/Oct/14 ] |
|
The OS still retains some state per socket after the socket is closed in the client process, so you will still get resource exhaustion regardless of whether the program closes the socket. If you want to check for yourself, remove the usleep call from the example program, recompile, rerun the program until a socket fails, and then immediately rerun it. It should fail immediately the second time. |
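The retained per-socket state described above can be observed with plain POSIX sockets, without the driver. The following is a minimal sketch, not part of libmongoc or the attached example: the helper name `port_busy_after_close` is hypothetical, and it assumes a working IPv4 loopback interface. It opens one loopback connection, closes both ends with the client side first, and then tries to bind a new socket to the client's old local port without SO_REUSEADDR. On Linux and OS X the bind would typically be expected to fail with EADDRINUSE while the old connection lingers, but the exact behavior depends on the OS.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (illustration only).
 * Returns 1 if the client's local port is still unavailable right after
 * close(), 0 if it could be rebound immediately, and -1 if the loopback
 * setup failed or bind() failed for some other reason. */
int port_busy_after_close(void)
{
    struct sockaddr_in server_addr, client_addr;
    socklen_t len;
    int listener = -1, client = -1, accepted = -1, probe = -1;
    int result = -1;

    listener = socket(AF_INET, SOCK_STREAM, 0);
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0 || client < 0) goto done;

    /* Listen on 127.0.0.1 and let the kernel pick a free port. */
    memset(&server_addr, 0, sizeof server_addr);
    server_addr.sin_family = AF_INET;
    server_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server_addr.sin_port = 0;
    if (bind(listener, (struct sockaddr *)&server_addr, sizeof server_addr) < 0) goto done;
    if (listen(listener, 1) < 0) goto done;
    len = sizeof server_addr;
    if (getsockname(listener, (struct sockaddr *)&server_addr, &len) < 0) goto done;

    if (connect(client, (struct sockaddr *)&server_addr, sizeof server_addr) < 0) goto done;
    accepted = accept(listener, NULL, NULL);
    if (accepted < 0) goto done;

    /* Remember which local address and port the client side was given. */
    len = sizeof client_addr;
    if (getsockname(client, (struct sockaddr *)&client_addr, &len) < 0) goto done;

    /* The client closes first, then the server side of the connection. */
    close(client);
    client = -1;
    close(accepted);
    accepted = -1;

    /* Try to claim the same local address and port, without SO_REUSEADDR. */
    probe = socket(AF_INET, SOCK_STREAM, 0);
    if (probe < 0) goto done;
    if (bind(probe, (struct sockaddr *)&client_addr, sizeof client_addr) == 0)
        result = 0;
    else if (errno == EADDRINUSE)
        result = 1;

done:
    if (probe >= 0) close(probe);
    if (accepted >= 0) close(accepted);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return result;
}
```

A return value of 1 means the kernel is still holding the closed connection's port, which is the kind of state that accumulates when a program opens and closes connections to the same host in a tight loop.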
| Comment by Denis Gladkikh [ 01/Oct/14 ] |
|
Adam, but in this example we don't keep clients alive. We destroy them. Should mongoc_client_destroy actually close the connections/sockets? |
| Comment by Adam Midvidy [ 01/Oct/14 ] |
|
I also split this into 2 tickets, one for the memory leak, and another for the connection failure. See |
| Comment by Adam Midvidy [ 01/Oct/14 ] |
|
Hi dgladkikh_splunk, the error you are seeing here is the operating system refusing to create so many new sockets to the same host. The driver is capable of many simultaneous connections to different hosts, but the problem here is that they are all to the same host. I think ech0s7r is correct that the real fix here is to reuse the same connection each time you create a cursor. |
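To illustrate why reusing a connection avoids the problem, here is a minimal sketch using plain POSIX sockets rather than the driver itself. The helper names `dial` and `connections_used` are hypothetical, and a working IPv4 loopback interface is assumed. It sends n one-byte "requests" to a loopback listener, either over one long-lived connection or over a fresh connection per request, and reports how many TCP connections were opened. In driver terms, the reuse case roughly corresponds to creating the mongoc_client_t once and using it for every cursor instead of creating and destroying a client inside the loop.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a new TCP connection to addr, or return -1. */
static int dial(const struct sockaddr_in *addr)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd >= 0 && connect(fd, (const struct sockaddr *)addr, sizeof *addr) < 0) {
        close(fd);
        fd = -1;
    }
    return fd;
}

/* Hypothetical helper (illustration only).
 * Sends n one-byte requests to a loopback listener and returns how many
 * TCP connections were opened to do so, or -1 on failure.
 *   reuse != 0: a single connection carries every request.
 *   reuse == 0: each request opens and closes its own connection. */
int connections_used(int n, int reuse)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int listener, client = -1, peer = -1;
    int opened = 0, ok = 0;
    char byte = 'x';

    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) return -1;

    /* Listen on 127.0.0.1 and let the kernel pick a free port. */
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(listener, 8) < 0 ||
        getsockname(listener, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    for (int i = 0; i < n; i++) {
        if (client < 0) {
            /* No live connection: open one and accept it on the server side. */
            client = dial(&addr);
            if (client < 0) goto done;
            peer = accept(listener, NULL, NULL);
            if (peer < 0) goto done;
            opened++;
        }
        /* One "request": a single byte from the client to the server side. */
        if (send(client, &byte, 1, 0) != 1 || recv(peer, &byte, 1, 0) != 1)
            goto done;
        if (!reuse) {
            /* Per-request mode: each closed connection leaves kernel state behind. */
            close(client);
            close(peer);
            client = peer = -1;
        }
    }
    ok = 1;

done:
    if (client >= 0) close(client);
    if (peer >= 0) close(peer);
    close(listener);
    return ok ? opened : -1;
}
```

With reuse enabled, the number of connections stays at one no matter how many requests are made, so the OS has only one socket to clean up at the end rather than one per request.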
| Comment by ech0s7r [ 01/Oct/14 ] |
|
Denis Gladkikh, yes, it fixes only problem (1). |
| Comment by Denis Gladkikh [ 01/Oct/14 ] |
|
Adam Midvidy, it seems that this fix, https://github.com/mongodb/libbson/pull/92, addresses only problem (1) but not problem (2). Am I right? For example, I slightly modified the example provided by @ech0s7r (see http://pastebin.com/Za194ius) and I still see Cursor Failures. In my case I saw 50 failures in 50000 iterations, so roughly one failure every 1000 connections. |
| Comment by Adam Midvidy [ 29/Sep/14 ] |
|
See proposed fix here: https://github.com/mongodb/libbson/pull/92 |
| Comment by ech0s7r [ 27/Sep/14 ] |
|
I wrote another small test case, http://pastebin.com/JHKmX49Q, and I have seen that the memory leak also occurs when only calling mongoc_client_get_collection/mongoc_client_destroy repeatedly. |