[CDRIVER-1534] A strange mongoc/libbson crash observed Created: 09/Sep/16  Updated: 03/May/17  Resolved: 10/Sep/16

Status: Closed
Project: C Driver
Component/s: None
Affects Version/s: 1.3.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Andrii Guriev Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 14.04 LTS, x64, 3.13.0-87-generic



 Description   

I use 1.3.5 C driver from a C app. here is what I got:

output:
mongo_set_field_int() - that's my func.

2016/09/09 17:53:01.0703: [20325]:    TRACE:   collection: ENTRY: mongoc_collection_update():1417
2016/09/09 17:53:01.0703: [20325]:    TRACE:       mongoc: ENTRY: _mongoc_write_result_init():1007
2016/09/09 17:53:01.0703: [20325]:    TRACE:       mongoc:  EXIT: _mongoc_write_result_init():1017
2016/09/09 17:53:01.0703: [20325]:    TRACE:       mongoc: ENTRY: _mongoc_write_command_init_update():228
2016/09/09 17:53:01.0703: [20325]:    TRACE:       mongoc: ENTRY: _mongoc_write_command_update_append():116
2016/09/09 17:53:01.0703: [20325]:    TRACE:       mongoc:  EXIT: _mongoc_write_command_update_append():136
2016/09/09 17:53:01.0703: [20325]:    TRACE:       mongoc:  EXIT: _mongoc_write_command_init_update():242
2016/09/09 17:53:01.0703: [20325]:    TRACE:   collection: ENTRY: _mongoc_collection_write_command_execute():115
2016/09/09 17:53:01.0703: [20325]:    TRACE:      cluster: ENTRY: _mongoc_cluster_stream_for_optype():1685
2016/09/09 17:53:01.0703: [20325]:    TRACE:      cluster: ENTRY: _mongoc_cluster_stream_for_server_description():1353
2016/09/09 17:53:01.0703: [20325]:    TRACE:      cluster:  EXIT: _mongoc_cluster_stream_for_server_description():1377
2016/09/09 17:53:01.0703: [20325]:    TRACE:      cluster:  EXIT: _mongoc_cluster_stream_for_optype():1710
2016/09/09 17:53:01.0703: [20325]:    TRACE:       mongoc: ENTRY: _mongoc_write_command_execute():949
2016/09/09 17:53:01.0703: [20325]:    TRACE:       mongoc: ENTRY: _mongoc_write_command_update_legacy():637
2016/09/09 17:53:01.0703: [20325]:    TRACE:      cluster: ENTRY: mongoc_cluster_sendv_to_server():2157
2016/09/09 17:53:01.0703: [20325]:    TRACE:      cluster:  EXIT: mongoc_cluster_sendv_to_server():2268
2016/09/09 17:53:01.0703: [20325]:    TRACE:       client: ENTRY: _mongoc_client_recv_gle():540
2016/09/09 17:53:01.0703: [20325]:    TRACE:      cluster: ENTRY: mongoc_cluster_try_recv():2307
2016/09/09 17:53:01.0704: [20325]:    TRACE:      cluster: TRACE: mongoc_cluster_try_recv():2316 Waiting for reply from server_id "1"
2016/09/09 17:53:01.0715: [20325]:    TRACE:      cluster:  EXIT: mongoc_cluster_try_recv():2374
Segmentation fault (core dumped)
 
from gdb:
-----
Core was generated by `./mmqtt'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f4402f31a7f in bson_iter_double () from /usr/lib/libbson-1.0.so.0
(gdb) bt
#0  0x00007f4402f31a7f in bson_iter_double () from /usr/lib/libbson-1.0.so.0
#1  0x00007f4403162c93 in _mongoc_client_recv_gle (client=0xf5bff0, server_stream=0xf84820, gle_doc=0x7ffdee61dfa8, error=0x7ffdee61e888)
    at src/mongoc/mongoc-client.c:581
#2  0x00007f4403193d0d in _mongoc_write_command_update_legacy (command=0x7ffdee61e560, client=0xf5bff0, server_stream=0xf84820, database=0xf5a2dc "test",
    collection=0xf5a35c "test", write_concern=0xf5a8e0, offset=0, result=0x7ffdee61e600, error=0x7ffdee61e888) at src/mongoc/mongoc-write-command.c:727
#3  0x00007f4403194ca7 in _mongoc_write_command_execute (command=0x7ffdee61e560, client=0xf5bff0, server_stream=0xf84820, database=0xf5a2dc "test",
    collection=0xf5a35c "test", write_concern=0xf5a8e0, offset=0, result=0x7ffdee61e600) at src/mongoc/mongoc-write-command.c:982
#4  0x00007f440316ab9e in _mongoc_collection_write_command_execute (command=0x7ffdee61e560, collection=0xf5a250, write_concern=0xf5a8e0,
    result=0x7ffdee61e600) at src/mongoc/mongoc-collection.c:125
#5  0x00007f440316e0f3 in mongoc_collection_update (collection=0xf5a250, uflags=MONGOC_UPDATE_NONE, selector=0xf8b0d0, update=0xf8b790,
    write_concern=0xf5a8e0, error=0x7ffdee61ec30) at src/mongoc/mongoc-collection.c:1452
#6  0x000000000040760e in mongo_set_field_int (id=0x40e064 "intrnlsettings", field=0x40e058 "SalesPeriod", value=1473432781) at mongo.c:1453
#7  0x000000000040276d in update_sales_period () at main.c:112
#8  0x0000000000402eb9 in main (argc=1, argv=0x7ffdee61ef78) at main.c:382
 
(gdb) disassemble bson_iter_double
Dump of assembler code for function bson_iter_double:
   0x00007f4402f31a70 <+0>:    test   %rdi,%rdi
   0x00007f4402f31a73 <+3>:    je     0x7f4402f31a99 <bson_iter_double+41>
   0x00007f4402f31a75 <+5>:    mov    0x10(%rdi),%edx
   0x00007f4402f31a78 <+8>:    mov    (%rdi),%rax
   0x00007f4402f31a7b <+11>:    xorpd  %xmm0,%xmm0
=> 0x00007f4402f31a7f <+15>:    cmpb   $0x1,(%rax,%rdx,1)
   0x00007f4402f31a83 <+19>:    je     0x7f4402f31a90 <bson_iter_double+32>
   0x00007f4402f31a85 <+21>:    repz retq
   0x00007f4402f31a87 <+23>:    nopw   0x0(%rax,%rax,1)
   0x00007f4402f31a90 <+32>:    mov    0x18(%rdi),%edx
   0x00007f4402f31a93 <+35>:    movsd  (%rax,%rdx,1),%xmm0
   0x00007f4402f31a98 <+40>:    retq  
   0x00007f4402f31a99 <+41>:    lea    0x11a4f(%rip),%rax        # 0x7f4402f434ef
   0x00007f4402f31aa0 <+48>:    sub    $0x18,%rsp
   0x00007f4402f31aa4 <+52>:    lea    0xfd25(%rip),%r9        # 0x7f4402f417d0 <__func__.6033>
   0x00007f4402f31aab <+59>:    lea    0xf9e6(%rip),%rcx        # 0x7f4402f41498
   0x00007f4402f31ab2 <+66>:    lea    0xf1bf(%rip),%rdx        # 0x7f4402f40c78
   0x00007f4402f31ab9 <+73>:    mov    $0x365,%r8d
   0x00007f4402f31abf <+79>:    mov    %rax,(%rsp)
   0x00007f4402f31ac3 <+83>:    mov    0x219526(%rip),%rax        # 0x7f440314aff0
   0x00007f4402f31aca <+90>:    mov    $0x1,%esi
   0x00007f4402f31acf <+95>:    mov    (%rax),%rdi
   0x00007f4402f31ad2 <+98>:    xor    %eax,%eax
   0x00007f4402f31ad4 <+100>:    callq  0x7f4402f29ba0 <__fprintf_chk@plt>
   0x00007f4402f31ad9 <+105>:    callq  0x7f4402f29930 <abort@plt>
End of assembler dump.
 
(gdb) info registers
rax            0x0    0
rbx            0x57d2cccd    1473432781
rcx            0xf8c224    16302628
rdx            0x34f    847
rsi            0x350    848
rdi            0x7ffdee61dd80    140728602844544
rbp            0x7ffdee61df40    0x7ffdee61df40
rsp            0x7ffdee61dcd8    0x7ffdee61dcd8
r8             0x34f    847
r9             0x34b    843
r10            0x34f    847
r11            0x2    2
r12            0x401db0    4201904
r13            0x7ffdee61ef70    140728602849136
r14            0x0    0
r15            0x0    0
rip            0x7f4402f31a7f    0x7f4402f31a7f <bson_iter_double+15>
eflags         0x10202    [ IF RF ]
cs             0x33    51
ss             0x2b    43
ds             0x0    0
es             0x0    0
fs             0x0    0
gs             0x0    0

what I do in my app is a really simple thing:

int mongo_set_field_int(char *id, char *field, u32 value)
{
        bson_t *query = NULL;
        bson_t *update = NULL;
        bson_error_t error;
        int rv = 0;
 
        debug("%s() \n", __func__);
 
        query = bson_new();
        if (!query)
        {
                error("can't allocate new bson for query, exiting\n");
                rv = 0; return rv;
        }
        BSON_APPEND_UTF8(query, "intrnldocname", id);
 
        if (!strcmp(field, "SalesPeriod"))
                update = BCON_NEW("$set", "{", field, BCON_INT64(value), "}");
        else
                update = BCON_NEW("$set", "{", field, BCON_INT32(value), "}");
        if (!update)
        {
                error("can't allocate new bson for update, exiting\n");
                rv = 0; return rv;
        }
 
        if (!mongoc_collection_update(global.collection, MONGOC_UPDATE_NONE, query, update, NULL, &error))  <-- here we fail
                error("%s\n", error.message);
        else
                rv = 1;
 
        if (query)
                bson_destroy(query);
 
        debug("%s() exiting\n", __func__);
 
        return rv;
}

So somehow we have 0 in rax and that leads to the crash (in my understanding), but I'm sure 'query' and 'update' a valid pointers which we send to mongoc_collection_update()



 Comments   
Comment by A. Jesse Jiryu Davis [ 10/Sep/16 ]

We do have this, linked from the main page with the link title "Threading":

The MongoDB C Driver is thread-unaware in the vast majority of its operations. This means it is up to the programmer to guarantee thread-safety.

However, mongoc_client_pool_t is thread-safe and is used to fetch a mongoc_client_t in a thread-safe manner. After retrieving a client from the pool, the client structure should be considered owned by the calling thread. When the thread is finished, the client should be placed back into the pool.

Perhaps this would have caught your attention more if it'd been titled "thread safety" instead of "threading"?

Comment by Andrii Guriev [ 10/Sep/16 ]

Probably there should be a notice somewhere with a big letters : "MONGOC CLIENT IS NOT SAFE FOR MULTITHREADING".

Comment by A. Jesse Jiryu Davis [ 10/Sep/16 ]

No problem. Thanks for letting us know!

Comment by Andrii Guriev [ 10/Sep/16 ]

I found that this is a multithreaded isssue , when we use the same client allocated with mongoc_client_new() from 2 separate threads. Somehow I didn't recognize it earlier. As soon as mongoc_client_t is NOT thread-safe - I guess this issue could be closed with resolution like "not a bug".
Sorry for making noise

Comment by Andrii Guriev [ 09/Sep/16 ]

Init was done like that:

        //mongo init
        mongoc_init(); //init C mongo library
        debug("[connect to mongodb] : [%s] \n", DB_PATH);
        global.client = mongoc_client_new(DB_PATH); //new client (to localhost)
        global.collection = mongoc_client_get_collection(global.client, "test", "test"); // get collection

configure was with such options:

./configure --prefix=/usr --libdir=/usr/lib --enable-debug --enable-tracing --enable-debug-symbols=yes

but it also fails if built without any debugging with just:

 ./configure --prefix=/usr --libdir=/usr/lib

Generated at Wed Feb 07 21:12:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.