[SERVER-31462] convertToCapped + renameCollection may cause a segfault Created: 09/Oct/17 Updated: 30/Oct/23 Resolved: 01/Nov/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 3.5.13 |
| Fix Version/s: | 3.2.19, 3.4.12, 3.6.0-rc3 |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Robert Guo (Inactive) | Assignee: | Xiangyu Yao (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | rbfz | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Operating System: | ALL | ||||||||||||
| Backport Requested: |
v3.4, v3.2
|
||||||||||||
| Sprint: | Storage 2017-11-13 | ||||||||||||
| Participants: | |||||||||||||
| Description |
|
A renameCollection command renaming a collection to itself following a convertToCapped can cause a segfault. ( |
| Comments |
| Comment by Githook User [ 30/Jan/18 ] | ||||||||
|
Author: {'email': 'xiangyu.yao@mongodb.com', 'name': 'Xiangyu Yao', 'username': 'xy24'}Message: The wrong order leads to the rollback of CollectionImpl happening after (cherry picked from commit f9e38f099a892964a09d4a80aafd8edfef21594d) | ||||||||
| Comment by Githook User [ 30/Jan/18 ] | ||||||||
|
Author: {'email': 'xiangyu.yao@mongodb.com', 'name': 'Xiangyu Yao', 'username': 'xy24'}Message: The wrong order leads to the rollback of CollectionImpl happening after (cherry picked from commit f9e38f099a892964a09d4a80aafd8edfef21594d) | ||||||||
| Comment by William Schultz (Inactive) [ 15/Nov/17 ] | ||||||||
|
robert.guo Are there things that need to be re-enabled in the rollback fuzzer since this ticket has been resolved? | ||||||||
| Comment by Githook User [ 01/Nov/17 ] | ||||||||
|
Author: {'email': 'xiangyu.yao@mongodb.com', 'name': 'Xiangyu Yao', 'username': 'xy24'}Message: The wrong order leads to the rollback of CollectionImpl happening after | ||||||||
| Comment by Xiangyu Yao (Inactive) [ 01/Nov/17 ] | ||||||||
|
Here is why it destructs the KVCCE and the CollectionImpl in the wrong order: We register in RecoveryUnit DatabaseImpl::AddCollectionChange before KVCCE::AddCollectionChange, so when the abortion happens (in this case, because we first drop the target which is "test.test" itself so the renaming would abort), KVCCE::AddCollectionChange::rollback() will be triggered ahead of DatabaseImpl::AddCollectionChange::rollback(), thus making the destruction of KVCCE precede that of CollectionImpl. That only happens to renameCollection() rather than createCollection() because the ordering of registrations in createCollection() is fixed by commit for SERVER-26685 | ||||||||
| Comment by Eric Milkie [ 27/Oct/17 ] | ||||||||
|
It appears that the catalog code is assuming that the RecordStore's lifetime will outlast its corresponding Collection's lifetime. Which translates into the KVCCE outlasting the Collection. It sounds like whatever code is doing the dropping of that collection, it is destructing the KVCCE and the Collection in the wrong order; I would look at that code to see what's going on there. In a normal drop collection codepath, this isn't a problem, so it must be related to either the convertToCapped or renameCollection code paths when they drop their source collections, or when renameCollection drops its target collection. | ||||||||
| Comment by Xiangyu Yao (Inactive) [ 27/Oct/17 ] | ||||||||
|
Conclusion: Detail:
And there is a "stupid" pointer in CollectionImpl:
They refer to the same thing! Here is what happened when I reproduced the bug: The system dropped a collection called "test.system.drop.xxxxxxxx.test". And this made KVCollectionCatalogEntry get destructed, so the smart pointer freed the resource.
Line A: This would try to dereference the stupid pointer _recordStore which points to garbage. Solutions I can think of:
|