[SERVER-20622] Memory leak in TTL index job Created: 27/Apr/15 Updated: 07/Oct/15 Resolved: 24/Sep/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | TTL |
| Affects Version/s: | None |
| Fix Version/s: | 3.1.9 |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | J Rassi |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Sprint: | Quint Iteration 4, Quint Iteration 5, Quint Iteration 6, Quint Iteration 7, QuInt 8 08/28/15, Quint 9 09/18/15, QuInt A (10/12/15) | ||||
| Participants: | |||||
| Linked BF Score: | 0 | ||||
| Description |
|
076cd926ab z ASAN SSL Ubuntu 1404 64-bit DEBUG jsCore_small_oplog_rs_WT
|
| Comments |
| Comment by Githook User [ 24/Sep/15 ] | ||||
|
Author: Jason Rassi (jrassi) <rassi@10gen.com> Message: | ||||
| Comment by Andrew Morrow (Inactive) [ 24/Sep/15 ] | ||||
|
Assigning back to Jason - he has a CR already worked up for this and it is LGTMed. | ||||
| Comment by Andrew Morrow (Inactive) [ 24/Sep/15 ] | ||||
|
I have a theory on this one. The startTTLBackgroundJob function is written like this:
So the object is definitely leaked. Why, then, don't we see this as a leak every time?

The object that it creates, TTLMonitor, has a BackgroundJob, which internally is a thread that calls back into the TTLMonitor. That means the thread holds a pointer to the TTLMonitor. If you read up on the rooting model for reachability detection in LSAN (see https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizerDesignDocument), you will notice that it looks into thread stacks for pointers to heap blocks. As long as the thread is still running when we hit __lsan_do_leak_check, the TTLMonitor object is reachable from its thread stack, so no leak is reported.

But what if the TTLMonitor thread happens to notice that inShutdown has been toggled to true before we reach __lsan_do_leak_check? In that case the thread terminates, LSAN sees the TTLMonitor object as unreachable, and it reports a leak.

I think the easiest solution is probably to hoist the pointer for the TTL monitor out of the function, so that it has global scope. That way, LSAN will always see it as reachable via the scan of globals. | ||||
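The reasoning in the comment above can be illustrated with a small, self-contained C++ sketch. This is not the MongoDB source, and the snippet the comment originally quoted is not preserved in this export. Every name below (`MonitorSketch`, `shutdownRequested`, `startJobLeaky`, `startJobRooted`, `globalMonitor`) is hypothetical and only stands in for the TTLMonitor, the inShutdown flag, and startTTLBackgroundJob described in the ticket.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the server's shutdown flag (inShutdown in the ticket).
std::atomic<bool> shutdownRequested{false};

// Hypothetical, heavily simplified stand-in for TTLMonitor: a job whose
// background thread calls back into the object that started it.
class MonitorSketch {
public:
    void go() {
        // While run() executes, `this` lives on the worker thread's stack or
        // in its registers, which a stack-scanning leak checker treats as a root.
        std::thread([this] { run(); }).detach();
    }

    bool finished() const { return _finished.load(); }

private:
    void run() {
        while (!shutdownRequested.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        // Once the loop ends the thread exits, and the pointer held on its
        // stack disappears with it.
        _finished.store(true);
    }

    std::atomic<bool> _finished{false};
};

// Pattern described as leaky: the only named pointer is a local, so after
// the worker thread exits nothing refers to the heap object any more.
void startJobLeaky() {
    MonitorSketch* monitor = new MonitorSketch();
    monitor->go();
}

// Proposed remedy, sketched: hoist the pointer to global scope so that the
// object stays reachable through globals even after the worker has exited.
MonitorSketch* globalMonitor = nullptr;

void startJobRooted() {
    assert(globalMonitor == nullptr);  // this sketch supports a single start only
    globalMonitor = new MonitorSketch();
    globalMonitor->go();
}
```

With `startJobRooted`, the object is still never deleted, but a leak checker that scans globals keeps finding it regardless of whether the worker thread has already exited when the check runs. With `startJobLeaky`, whether a leak is reported depends on the race between the worker noticing shutdown and the leak check, which matches the intermittent failures reported in this ticket.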
| Comment by Robert Guo (Inactive) [ 21/Sep/15 ] | ||||
|
One more during my patch build: https://logkeeper.mongodb.org/build/55fd05afbe07c4135f44320d/test/55fd124cbe07c4135f44642e | ||||
| Comment by Max Hirschhorn [ 06/Aug/15 ] | ||||
|
Happened during my patch build. See the logs: https://logkeeper.mongodb.org/build/55c37bdebe07c47abf0c9812/test/55c3801390413011a20ce431 | ||||
| Comment by Matt Dannenberg [ 24/Jun/15 ] | ||||
| Comment by J Rassi [ 29/Apr/15 ] | ||||
|
To whoever the active build baron is at the time: please post a comment on this ticket if you see a recurrence of this issue. See further discussion at the codereview link above. | ||||
| Comment by Spencer Brody (Inactive) [ 27/Apr/15 ] | ||||
|
Also happened on non-WT suite on the same commit: https://evergreen.mongodb.com/task/mongodb_mongo_master_ubuntu1404_debug_asan_076cd926ab476f872afdd89a0e5e7e733d26c3ae_15_04_27_06_57_05_jsCore_small_oplog_rs_ubuntu1404_debug_asan |