[CXX-1890] Read Access violation in _mongoc_label_unknown_member_cb Created: 12/Dec/19  Updated: 25/May/20  Resolved: 25/May/20

Status: Closed
Project: C++ Driver
Component/s: None
Affects Version/s: 3.4.0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Matthieu Bolt
Assignee: Kevin Albertson
Resolution: Cannot Reproduce
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 10



 Description   

Non-reproducible crash; see the stack dump below. It most likely depends on the configuration of the client environment in combination with the MongoDB server. Please advise.

 

1005 mongoc-topology-description.c - _mongoc_label_unknown_member_cb d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-topology-description.c(1005)
178 mongoc-set.c - mongoc_set_for_each d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-set.c(178)
1045 mongoc-topology-description.c - _mongoc_topology_description_label_unknown_member d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-topology-description.c(1045)
1519 mongoc-topology-description.c - _mongoc_topology_description_update_rs_without_primary d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-topology-description.c(1519)
1957 mongoc-topology-description.c - mongoc_topology_description_handle_ismaster d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-topology-description.c(1957)
170 mongoc-topology.c -  _mongoc_topology_scanner_cb d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-topology.c(170)
488 mongoc-topology-scanner.c - _async_success d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-topology-scanner.c(488)
132 mongoc-async-cmd.c - mongoc_async_cmd_run d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-async-cmd.c(132)
161 mongoc-async.c - mongoc_async_run d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-async.c(161)
1008 mongoc-topology-scanner.c - mongoc_topology_scanner_work d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-topology-scanner.c(1008)
1292 mongoc-topology.c - _mongoc_topology_run_background d:\projects\mongo-c-driver-1.15.2\src\libmongoc\src\mongoc\mongoc-topology.c(1292)

 

 



 Comments   
Comment by Kevin Albertson [ 25/May/20 ]

Hi mmhjbolt@hotmail.com. I'm closing since there has not been any activity. If you do see this reproduce, or have additional information, feel free to reopen.

Comment by Matthieu Bolt [ 28/Apr/20 ]

Dear Kevin,

Thanks for picking this up. I have tried to reproduce the problem but did not succeed; I can't recreate the exact situation because it only manifested in a production environment.

Our current line of thinking is that the crash was caused by a memory allocation for a static variable in the bson C library that was not made by our memory allocator. Either our memory allocator has since improved or the order of initialization of static variables changed. If the problem recurs and is not caused by a bug in our memory allocator, I will build with the -DENABLE_TRACING=ON flag to provide more information.

Thanks,

Matthieu
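
The allocator theory above can be illustrated with a minimal, self-contained sketch. It does not use libbson or libmongoc; every name in it (AllocHooks, tracked_alloc, LibraryStatic, run_demo) is hypothetical and exists only for illustration. It assumes a library-side static object that allocates during its initialization, and an application allocator that is installed either before or after that point. If the allocator is installed too late, the block is later released through an allocator that never handed it out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_set>

// Hypothetical allocation hooks; this is NOT the libbson API.
struct AllocHooks {
    void* (*alloc)(std::size_t);
    void (*release)(void*);
};

// Default hooks: plain malloc/free.
void* default_alloc(std::size_t size) { return std::malloc(size); }
void default_release(void* p) { std::free(p); }

// Bookkeeping for the toy "application allocator".
std::unordered_set<void*>& tracked_blocks() {
    static std::unordered_set<void*> blocks;
    return blocks;
}

// Number of releases of blocks the application allocator never handed out.
int& foreign_release_count() {
    static int count = 0;
    return count;
}

void* tracked_alloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr) tracked_blocks().insert(p);
    return p;
}

void tracked_release(void* p) {
    if (p == nullptr) return;
    if (tracked_blocks().erase(p) == 0) {
        // The block came from a different allocator. A real custom heap
        // could corrupt memory or crash here; the sketch only counts the
        // mismatch and deliberately leaks the block.
        ++foreign_release_count();
        return;
    }
    std::free(p);
}

// Hooks currently in effect.
AllocHooks& current_hooks() {
    static AllocHooks hooks{default_alloc, default_release};
    return hooks;
}

// Stands in for a library-internal static that allocates when constructed.
struct LibraryStatic {
    void* buffer;
    LibraryStatic() : buffer(current_hooks().alloc(64)) {}
    void shutdown() {
        current_hooks().release(buffer);
        buffer = nullptr;
    }
};

// Returns how many mismatched releases were observed.
int run_demo(bool install_before_library_init) {
    foreign_release_count() = 0;
    current_hooks() = AllocHooks{default_alloc, default_release};
    if (install_before_library_init) {
        current_hooks() = AllocHooks{tracked_alloc, tracked_release};
    }
    LibraryStatic lib;  // allocates with whatever hooks are active now
    current_hooks() = AllocHooks{tracked_alloc, tracked_release};
    lib.shutdown();     // always releases through the application allocator
    return foreign_release_count();
}
```

Calling run_demo(false) from main models the suspected ordering problem and reports one mismatched release; run_demo(true) models the allocator being installed first and reports none. Whether this is what happened in the reported crash is not confirmed by anything in this ticket.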

 

Comment by Kevin Albertson [ 17/Apr/20 ]

Hello mmhjbolt@hotmail.com,

Apologies for the delay. I've inspected the code around the stack trace you provided. I don't have any theories yet, but additional information may help me reproduce the issue. I concur with Jeremy that this is unlikely to do with firewall settings.

1. How often does this happen? Does this reproduce consistently for you, or do you see this crash only occasionally?

2. You wrote: "We have tested our previous version (1.9.2 of the mongo-c-driver and 3.2.0 of mongo-cxx-driver) and this one also crashes (different stack trace though)". Do you have those stack traces handy? Additionally, the stack trace you provided is for the background monitoring thread. Is it possible to get the stack traces of the other threads running?

3. If possible, could you build and test with tracing enabled (pass -DENABLE_TRACING=ON to cmake) and include the output? This provides very verbose logging. That is described here: http://mongoc.org/libmongoc/current/logging.html#tracing

Thank you,
Kevin

Comment by Matthieu Bolt [ 03/Mar/20 ]

Any news about the "scheduling" or progress on this issue? Could you indicate if and when the issue might be resolved?

Comment by Matthieu Bolt [ 18/Dec/19 ]

Thanks for the explanation. Looking forward to the next steps from the C++ team.

Comment by Jeremy Mikola [ 16/Dec/19 ]

"Topology" refers to the structure of the replica set or sharded cluster, which you've now provided, and TLS is synonymous with SSL, which you clarified is not being used.

I'll leave this issue in "Needs Triage" for now so the C++ team can look into it when they get a chance.

Comment by Matthieu Bolt [ 16/Dec/19 ]

Dear Jeremy,
Thanks for the quick reply. Please know that I'm not a MongoDB expert at all, so this is the information from our Mongo support team:
It's a 5-node stretched replica set (4 data nodes and 1 arbiter; 1 master, 3 slaves), no sharding.
Authentication is used, no SSL.
There was no information regarding the topology and size or TLS. Could you please provide some pointer on how to get this information if this is required to solve the issue?

Best regards

Comment by Jeremy Mikola [ 13/Dec/19 ]

"The mongo server is running version 3.6.14"

Can you confirm if you are running a standalone mongod with no authentication or TLS? If possible, sharing your server configuration or the command line used to spawn the server would be helpful.

I'd just like to get that clarified before we get a chance to allocate time to investigate this issue and attempt reproduction. It will be helpful to know if we're dealing with a single mongod, replica set, or sharded cluster – and if it's not a single mongod we'd need to know more details about the deployment topology.

"Could it be caused by firewall settings?"

It's likely too early to say, but firewall settings or network connectivity issues should certainly not lead to a crash.

Comment by Matthieu Bolt [ 13/Dec/19 ]

Indeed, libmongoc version 1.15.2 is used (newer is better?). The issue can now be reproduced on one specific live environment. We have tested our previous versions (1.9.2 of the mongo-c-driver and 3.2.0 of the mongo-cxx-driver), and that combination also crashes (with a different stack trace, though). The mongo server is running version 3.6.14. Could it be caused by firewall settings?

Comment by Jeremy Mikola [ 12/Dec/19 ]

"Most likely depending on configuration of client environment in combination with mongodb server."

Can you clarify exactly what version of the C++ driver you are using? You specified "3.4.0" in "Affects Version" on the ticket, but the stacktrace refers to libmongoc 1.15.2. Looking at the C++ driver 3.4.0 release notes, the minimum libmongoc version is 1.13 – so it'd be helpful to know if you're using C++ driver 3.4.0 with a newer version of libmongoc or not.

Additionally, please share as much detail as possible about the MongoDB server deployment (e.g. server version, topology type and size, whether auth/TLS is being used), and the state of the deployment at the time of the crash if that's possible (for example, if a replica set was undergoing a failover).
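
The version question above (C++ driver 3.4.0 requires at least libmongoc 1.13, while the stack trace shows 1.15.2) can be checked mechanically. The following is a minimal sketch; the helpers parse_version and meets_minimum are hypothetical names, not part of either driver. It assumes version strings of the form major.minor or major.minor.patch. In a real program the actual version string could come from the C driver at run time, for example from mongoc_get_version(), but the sketch takes plain strings so that it stays self-contained.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <tuple>

struct Version {
    int major = 0;
    int minor = 0;
    int patch = 0;
};

// Parses "major.minor" or "major.minor.patch". A missing patch component
// defaults to 0. Returns false if fewer than two components are found.
bool parse_version(const std::string& text, Version& out) {
    Version v;
    const int fields =
        std::sscanf(text.c_str(), "%d.%d.%d", &v.major, &v.minor, &v.patch);
    if (fields < 2) return false;
    out = v;
    return true;
}

// True if `actual` is at least `minimum`; false if either string is malformed.
bool meets_minimum(const std::string& actual, const std::string& minimum) {
    Version a;
    Version m;
    if (!parse_version(actual, a) || !parse_version(minimum, m)) return false;
    return std::tie(a.major, a.minor, a.patch) >=
           std::tie(m.major, m.minor, m.patch);
}
```

With the versions named in this ticket, meets_minimum("1.15.2", "1.13") holds, so the reported combination satisfies the documented minimum.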

Generated at Wed Feb 07 22:04:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.