[CDRIVER-3879] Pointer being freed not allocated in mongoc_cursor_destroy Created: 28/Jan/21 Updated: 27/Oct/23 Resolved: 10/Feb/21 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | libmongoc |
| Affects Version/s: | 1.17.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Thijs Cadier | Assignee: | Clyde Bazile III (Inactive) |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Tried on MacOs Catalina. |
||
| Description |
|
I'm upgrading the Mongoc version used in the Mongo Rust driver: https://github.com/thijsc/mongo-rust-driver/pull/59. It seems something changed. When running the cleanup logic in the correct order, as enforced by Rusts ownership system there is now a double free:
This code worked correctly in earlier C driver versions. I don't see any notice about a change in the docs. Is this a bug? |
| Comments |
| Comment by Thijs Cadier [ 15/Feb/21 ] | ||||||||||||||||||||||||||||||||||
|
Thanks for the help! That would have been tricky to figure out. Fix based on this comment is here. | ||||||||||||||||||||||||||||||||||
| Comment by Clyde Bazile III (Inactive) [ 10/Feb/21 ] | ||||||||||||||||||||||||||||||||||
|
thijs@appsignal.com, closing this ticket but feel free to re-open if this problem persists. | ||||||||||||||||||||||||||||||||||
| Comment by Clyde Bazile III (Inactive) [ 10/Feb/21 ] | ||||||||||||||||||||||||||||||||||
|
I was able to reproduce this issue in the C driver by freeing the BSON object passed to mongoc_cursor_next. That is, result in the example above.
The BSON object passed to mongoc_cursor_next is ephemeral and only good until the next call to mongoc_cursor_next or the cursor is destroyed with mongoc_cursor_destroy. In other words, the lifecycle of the BSON object is handled by the cursor. This is why this function takes a const bson_t*. In cursor.rs:106, the BSON object passed to mongoc_cursor_next is used to construct a Bsonc object (here) and consequently destroyed at the end of the function by Bsonc::Drop. Commenting out the code within Bsonc::Drop (making it a no-op function) fixes this test since the ephemeral BSON object is no longer being freed twice. I'm not sure how this should be handled Rust, but ultimately you should avoid freeing the BSON object passed to mongoc_cursor_next here and allow mongoc_cursor_t to manage its lifetime. | ||||||||||||||||||||||||||||||||||
| Comment by Thijs Cadier [ 10/Feb/21 ] | ||||||||||||||||||||||||||||||||||
|
I do push the client back on the pool here. The Drop trait in Rust ensures that this cleanup logic always runs at the right moment. I'm happy to provide a stack trace, but it's much easier to run the test yourself. Run this with a recent Rust stable installed and a Mongo server running on localhost:
On the main branch this test is green, on the mongoc_1_17 branch the double-free happens. | ||||||||||||||||||||||||||||||||||
| Comment by Clyde Bazile III (Inactive) [ 08/Feb/21 ] | ||||||||||||||||||||||||||||||||||
|
thijs@appsignal.com I was not able to reproduce this issue locally using a similar example (below)
However, it isn't obvious from your example (here) if the client is being returned to the pool (as required, see this and my example above). That could certainly cause issues, but it's difficult to see where the issue lies without a complete stack trace. | ||||||||||||||||||||||||||||||||||
| Comment by Thijs Cadier [ 01/Feb/21 ] | ||||||||||||||||||||||||||||||||||
|
So the crash happens here for example. This triggers the cleanup logic in the drop of the cursor here. The order of cleanup is ensured by some lifetime relations that are present, starting here. So a client's destroy cannot be executed before a cursor is. Is there a scenario in which cleanup would happen automatically now? | ||||||||||||||||||||||||||||||||||
| Comment by Clyde Bazile III (Inactive) [ 29/Jan/21 ] | ||||||||||||||||||||||||||||||||||
|
Hey thijs@appsignal.com ! I'm currently investigating this issue. To answer your question, pointing us to the problematic test could help us better understand what Rust defines as the "correct order" of the cleanup logic and possibly point out anything that's obviously fishy. Additionally, a stack trace or debugging info could help us see what was happening under our driver's hood just before the crash. Both are helpful, but the latter (the stack trace) more so. | ||||||||||||||||||||||||||||||||||
| Comment by Thijs Cadier [ 28/Jan/21 ] | ||||||||||||||||||||||||||||||||||
|
Thanks for your reply! I'm indeed going from 1.8.2 all the way up to the latest version. I know exactly where this call is made in a test. Is that helpful? You could run the tests of this project. | ||||||||||||||||||||||||||||||||||
| Comment by Kevin Albertson [ 28/Jan/21 ] | ||||||||||||||||||||||||||||||||||
|
Thank you for the report thijs@appsignal.com! We are unaware of an existing double free in mongoc_cursor_destroy, though we will investigate. Correct me if I am wrong, but from the PR, it looks like this is upgrading from 1.8.2. If there is a stack trace available, or debug info, that would be helpful in our investigation. Sincerely, |