[CDRIVER-1413] Segfault on getaddrinfo when client connects on TCP Created: 25/Jul/16  Updated: 03/May/17  Resolved: 26/Jul/16

Status: Closed
Project: C Driver
Component/s: Bulk API
Affects Version/s: 1.3.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Thomas Champagne Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian Jessie with libc 2.19-18+deb8



 Description   

In a multithreaded client, I get a segfault during bulk operations.
The stack trace shows that the crash happens when the mongoc client calls getaddrinfo, which ends up in gethostbyname2_r in libc.

#0  0x00007ffff744b833 in _IO_un_link () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff743f775 in fclose () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff2072f11 in _nss_files_gethostbyname3_r ()
   from /lib/x86_64-linux-gnu/libnss_files.so.2
#3  0x00007ffff2073524 in _nss_files_gethostbyname2_r ()
   from /lib/x86_64-linux-gnu/libnss_files.so.2
#4  0x00007ffff74cfe39 in gethostbyname2_r ()
   from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007ffff74aa4bf in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007ffff74abafd in getaddrinfo () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x00007ffff62f0617 in mongoc_client_connect_tcp (uri=0x7ffff777d780,
    uri@entry=0x7fffe4000a80, host=0x7fffe40031c8, error=0x7ffff2a7ac40)
    at src/mongoc/mongoc-client.c:118
#8  0x00007ffff62f0a36 in mongoc_client_default_stream_initiator (
    uri=0x7fffe4000a80, host=0x7fffe40031c8, user_data=0x7fffe4000950,
    error=0x7ffff2a7ac40) at src/mongoc/mongoc-client.c:310
#9  0x00007ffff62f5014 in _mongoc_cluster_add_node (error=0x7ffff2a7ac40,
    sd=0x7fffe40031c0, cluster=0x7fffe4000968)
    at src/mongoc/mongoc-cluster.c:1285
#10 mongoc_cluster_fetch_stream_pooled (error=0x7ffff2a7ac40,
    reconnect_ok=true, sd=0x7fffe40031c0, cluster=0x7fffe4000968)
    at src/mongoc/mongoc-cluster.c:1563
#11 _mongoc_cluster_stream_for_server_description (
    cluster=cluster@entry=0x7fffe4000968, sd=sd@entry=0x7fffe40031c0,
    reconnect_ok=reconnect_ok@entry=true, error=error@entry=0x7ffff2a7ac40)
    at src/mongoc/mongoc-cluster.c:1365
#12 0x00007ffff62f561d in _mongoc_cluster_stream_for_optype (
    cluster=0x7fffe4000968, optype=<optimized out>,
    read_prefs=<optimized out>, error=0x7ffff2a7ac40)
    at src/mongoc/mongoc-cluster.c:1701
#13 0x00007ffff62ef954 in mongoc_bulk_operation_execute (bulk=0x7fffe40028c0,
    reply=0x35000002, reply@entry=0x7fffe40008c0, error=0x1d621,
    error@entry=0x7ffff2a7ac40) at src/mongoc/mongoc-bulk-operation.c:427

The documentation describes getaddrinfo as thread-safe.
Do you think this is a problem in the mongoc driver or in libc?
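
For reference, the access pattern in question (several threads resolving the server address at the same time) can be sketched outside the driver as follows. This is only an illustrative sketch, not the driver's code: resolve_concurrently, resolve_worker, struct resolve_job and MAX_THREADS are hypothetical names, and port 27017 is assumed because it is the MongoDB default. It should be built with the thread library linked, for example with cc -pthread.

```c
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo() under strict -std modes */
#include <assert.h>
#include <netdb.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_THREADS 16  /* hypothetical upper bound for this sketch */

struct resolve_job {
    const char *host;  /* host name or numeric address to resolve */
    int rc;            /* getaddrinfo() return code, 0 on success */
};

/* Thread body: one TCP lookup, similar in spirit to a client connect. */
static void *resolve_worker(void *arg)
{
    struct resolve_job *job = arg;
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    job->rc = getaddrinfo(job->host, "27017", &hints, &res);
    if (job->rc == 0)
        freeaddrinfo(res);
    return NULL;
}

/* Runs nthreads lookups in parallel.
 * Returns the number of successful lookups, or -1 if nthreads is out of
 * range or a thread could not be started. */
int resolve_concurrently(const char *host, int nthreads)
{
    pthread_t tids[MAX_THREADS];
    struct resolve_job jobs[MAX_THREADS];
    int started = 0;
    int ok = 0;
    int i;

    assert(host != NULL);
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    for (i = 0; i < nthreads; i++) {
        jobs[i].host = host;
        jobs[i].rc = EAI_FAIL;
        if (pthread_create(&tids[i], NULL, resolve_worker, &jobs[i]) != 0)
            break;
        started++;
    }

    /* Join only the threads that were actually started. */
    for (i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (jobs[i].rc == 0)
            ok++;
    }

    return started == nthreads ? ok : -1;
}
```

A call such as resolve_concurrently("localhost", 8) returns 8 when every lookup succeeds. A name like "localhost" is typically resolved through /etc/hosts by the files NSS module that appears in the trace above.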



 Comments   
Comment by A. Jesse Jiryu Davis [ 26/Jul/16 ]

Great, thanks for investigating!

Comment by Thomas Champagne [ 26/Jul/16 ]

After some investigation, I found the source of the problem.
I tried to write a simple program based on the examples in the driver documentation, but I couldn't reproduce the issue.
Finally, I reread my original program and saw that it loads the MongoDB functions from a dynamic library via the dlopen function.
I can reproduce the problem every time if the calls to the MongoDB routines are made from the dynamic library rather than from the main program.
I found some existing reports of this problem:
https://sourceware.org/bugzilla/show_bug.cgi?id=10652
http://stackoverflow.com/questions/26190860/threaded-shared-library-for-non-threaded-application

So the solution is to link the main program with pthread (-lpthread) or to set the LD_PRELOAD environment variable to the pthread library:

LD_PRELOAD=/lib/x86_64-linux-gnu/libpthread.so.0 ./myapp

So I think this is not an issue in the MongoDB driver but in libc.
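
A minimal sketch of how a host program might guard against this setup before calling dlopen. It rests on an assumption rather than a documented interface: on glibc versions that ship a separate libpthread, pthread_create is only visible in the global symbol scope when the thread library was loaded at startup (through -lpthread or LD_PRELOAD). Newer glibc releases merge libpthread into libc, so the check always succeeds there. The names thread_library_loaded and load_plugin are hypothetical. Older glibc also needs -ldl at link time.

```c
#define _GNU_SOURCE  /* glibc exposes RTLD_DEFAULT only with this defined */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Heuristic (assumption, glibc-specific): returns 1 if pthread_create is
 * visible in the global symbol scope, 0 otherwise. */
int thread_library_loaded(void)
{
    return dlsym(RTLD_DEFAULT, "pthread_create") != NULL;
}

/* Loads a shared library that uses threads internally.
 * Prints a warning first if the process appears to have started without
 * the thread library. Returns the dlopen handle, or NULL on failure. */
void *load_plugin(const char *path)
{
    void *handle;

    assert(path != NULL);

    if (!thread_library_loaded()) {
        fprintf(stderr,
                "warning: thread library not loaded at startup; "
                "link the main program with -lpthread or use LD_PRELOAD\n");
    }

    handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        fprintf(stderr, "dlopen failed: %s\n", dlerror());

    return handle;
}
```

The main program would call load_plugin with the path of the library that wraps the MongoDB calls, then look up its entry points with dlsym as before.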

Comment by A. Jesse Jiryu Davis [ 25/Jul/16 ]

Thanks. Intermittent or every time? Can you send us a complete code example
that we can compile and run that reproduces the error reliably, please?
What's your connection string (the MongoDB uri)?
