[SERVER-20622] Memory leak in TTL index job
Created: 27/Apr/15  Updated: 07/Oct/15  Resolved: 24/Sep/15

Status: Closed
Project: Core Server
Component/s: TTL
Affects Version/s: None
Fix Version/s: 3.1.9

Type: Bug
Priority: Minor - P4
Reporter: Spencer Brody (Inactive)
Assignee: J Rassi
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Quint Iteration 4, Quint Iteration 5, Quint Iteration 6, Quint Iteration 7, QuInt 8 08/28/15, Quint 9 09/18/15, QuInt A (10/12/15)
Participants:
Linked BF Score: 0

 Description   

076cd926ab z ASAN SSL Ubuntu 1404 64-bit DEBUG jsCore_small_oplog_rs_WT

failing task
logs

	
==10062==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x977269 in operator new(unsigned long) (/data/mci/shell/src/mongod+0x977269)
    #1 0x20d95b0 in mongo::startTTLBackgroundJob() /data/mci/shell/src/src/mongo/db/ttl.cpp:288
    #2 0x993a86 in mongo::_initAndListen(int) /data/mci/shell/src/src/mongo/db/db.cpp:589
    #3 0x98d09a in mongo::initAndListen(int) /data/mci/shell/src/src/mongo/db/db.cpp:606
    #4 0x9a00c2 in mongoDbMain(int, char**, char**) /data/mci/shell/src/src/mongo/db/db.cpp:857
    #5 0x9a00c2 in main /data/mci/shell/src/src/mongo/db/db.cpp:655
    #6 0x7ff927382ec4 (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
Indirect leak of 136 byte(s) in 1 object(s) allocated from:
    #0 0x977269 in operator new(unsigned long) (/data/mci/shell/src/mongod+0x977269)
    #1 0x26e4b86 in mongo::BackgroundJob::BackgroundJob(bool) /data/mci/shell/src/src/mongo/util/background.cpp:145
    #2 0x20d95bd in mongo::startTTLBackgroundJob() /data/mci/shell/src/src/mongo/db/ttl.cpp:288
    #3 0x993a86 in mongo::_initAndListen(int) /data/mci/shell/src/src/mongo/db/db.cpp:589
    #4 0x98d09a in mongo::initAndListen(int) /data/mci/shell/src/src/mongo/db/db.cpp:606
    #5 0x9a00c2 in mongoDbMain(int, char**, char**) /data/mci/shell/src/src/mongo/db/db.cpp:857
    #6 0x9a00c2 in main /data/mci/shell/src/src/mongo/db/db.cpp:655
    #7 0x7ff927382ec4 (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
SUMMARY: AddressSanitizer: 160 byte(s) leaked in 2 allocation(s).



 Comments   
Comment by Githook User [ 24/Sep/15 ]

Author: Jason Rassi (jrassi) <rassi@10gen.com>

Message: SERVER-20622 Global TTLMonitor obj should be reachable at process exit
Branch: master
https://github.com/mongodb/mongo/commit/ddb8c9ba180e546ead966d0beaeb684e251045d1

Comment by Andrew Morrow (Inactive) [ 24/Sep/15 ]

Assigning back to Jason - he has a CR already worked up for this and it is LGTMed.

Comment by Andrew Morrow (Inactive) [ 24/Sep/15 ]

I have a theory on this one. The startTTLBackgroundJob function is written like this:

void startTTLBackgroundJob() {
    TTLMonitor* ttl = new TTLMonitor();
    ttl->go();
}

So the object is definitely leaked. Why, then, don't we see this reported as a leak every time? The object that it creates, TTLMonitor, has a BackgroundJob, which internally is a thread that calls back into the TTLMonitor. That means the thread holds a pointer to the TTLMonitor.

If you read up on the rooting model for reachability detection in LSAN (see https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizerDesignDocument) you will notice that it also scans thread stacks for pointers to heap blocks. So as long as the thread is still running when we hit __lsan_do_leak_check, the TTLMonitor object is reachable from its thread stack, and no leak is reported.

But what if the TTLMonitor thread happens to notice that inShutdown has been toggled to true before we reach __lsan_do_leak_check? In that case the thread will terminate, and now LSAN will see the TTLMonitor object as unreachable, and report a leak.

I think the easiest solution is probably to hoist the pointer for the TTL monitor out of the function, so that it has global scope. That way, LSAN will always see it as reachable via the scan of globals.
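
For illustration only, the hoisting idea described above could look roughly like the sketch below. The TTLMonitor class here is a hypothetical stand-in (the member names requestStop, join and finished, and the use of std::thread, are assumptions made for this example), not the real mongo::TTLMonitor, and the change that was actually committed may differ in its details. The only point is that the pointer now lives at file scope, where a scan of globals can find it even after the worker thread has exited.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for mongo::TTLMonitor, used only for this sketch.
class TTLMonitor {
public:
    // Start the worker thread; the thread holds `this` on its stack.
    void go() {
        _thread = std::thread([this] { run(); });
    }
    // Assumed helper: plays the role of the inShutdown flag being toggled.
    void requestStop() { _inShutdown.store(true); }
    // Assumed helper: wait for the worker thread to exit.
    void join() {
        if (_thread.joinable()) _thread.join();
    }
    bool finished() const { return _finished.load(); }

private:
    void run() {
        // Poll until shutdown is requested, then let the thread terminate.
        while (!_inShutdown.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        _finished.store(true);
    }
    std::atomic<bool> _inShutdown{false};
    std::atomic<bool> _finished{false};
    std::thread _thread;
};

namespace {
// File-scope pointer: the object stays reachable from a global even after
// the worker thread (and its stack) is gone.
TTLMonitor* ttlMonitor = nullptr;
}  // namespace

void startTTLBackgroundJob() {
    assert(ttlMonitor == nullptr);  // sketch assumes a single call at startup
    ttlMonitor = new TTLMonitor();  // still never deleted, but no longer unreachable
    ttlMonitor->go();
}
```

Note that the object is still never freed; the sketch only changes whether it is reachable at exit, which is what matters for the leak report in this ticket.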

Comment by Robert Guo (Inactive) [ 21/Sep/15 ]

One more during my patch build: https://logkeeper.mongodb.org/build/55fd05afbe07c4135f44320d/test/55fd124cbe07c4135f44642e

Comment by Max Hirschhorn [ 06/Aug/15 ]

Happened during my patch build. See the logs: https://logkeeper.mongodb.org/build/55c37bdebe07c47abf0c9812/test/55c3801390413011a20ce431

Comment by Matt Dannenberg [ 24/Jun/15 ]

Still happening:
https://evergreen.mongodb.com/task/mongodb_mongo_master_ubuntu1410_debug_asan_jsCore_small_oplog_rs_cb23019011883f3c5f0ce0876248e80f05de4581_15_06_24_01_00_12

Comment by J Rassi [ 29/Apr/15 ]

To whoever the active build baron is at the time: please post a comment on this ticket if you see a recurrence of this issue. See further discussion at the codereview link above.

Comment by Spencer Brody (Inactive) [ 27/Apr/15 ]

Also happened on non-WT suite on the same commit: https://evergreen.mongodb.com/task/mongodb_mongo_master_ubuntu1404_debug_asan_076cd926ab476f872afdd89a0e5e7e733d26c3ae_15_04_27_06_57_05_jsCore_small_oplog_rs_ubuntu1404_debug_asan
