Description
We uncovered a memory leak in reconciliation, with the following signature:
==6758==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 9374 byte(s) in 7 object(s) allocated from:
    #0 0x4984bd in malloc
    #1 0x7ff1c5ae53e2 in __wt_malloc src/os_common/os_alloc.c:91:14
    #2 0x7ff1c5ae5f51 in __wt_memdup src/os_common/os_alloc.c:249:5
    #3 0x7ff1c5b43229 in __rec_split_write src/reconcile/rec_write.c:2204:9
    #4 0x7ff1c5b44962 in __wt_rec_split_finish src/reconcile/rec_write.c:1706:13
    #5 0x7ff1c5b299de in __wt_rec_row_leaf src/reconcile/rec_row.c:1002:11
    #6 0x7ff1c5b3fb73 in __reconcile src/reconcile/rec_write.c:258:9
    #7 0x7ff1c5b3eeb2 in __wt_reconcile src/reconcile/rec_write.c:98:11
    #8 0x7ff1c5a596f4 in __evict_review src/evict/evict_page.c:731:9
    #9 0x7ff1c5a573e1 in __wt_evict src/evict/evict_page.c:168:5
    #10 0x7ff1c5a47b9d in __evict_page src/evict/evict_lru.c:2334:5
    #11 0x7ff1c5a4431a in __evict_lru_pages src/evict/evict_lru.c:1150:20
|
The memory leak is associated with a new failpoint that was added in WT-9252 and disabled in WT-9711. The failpoint is:
--- a/src/evict/evict_page.c
+++ b/src/evict/evict_page.c
@@ -760,10 +760,17 @@ __evict_review(WT_SESSION_IMPL *session, WT_REF *ref, uint32_t evict_flags, bool
       !__wt_page_is_modified(page) || LF_ISSET(WT_REC_HS | WT_REC_IN_MEMORY) ||
         WT_IS_METADATA(btree->dhandle));

     /* Fail 0.1% of the time. */
     if (!closing &&
       __wt_failpoint(session, WT_TIMING_STRESS_FAILPOINT_EVICTION_FAIL_AFTER_RECONCILIATION, 10))
         return (EBUSY);

     return (0);
 }
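For readers unfamiliar with failpoints, the following is a minimal sketch of a probabilistic failpoint, not WiredTiger's implementation. It assumes the probability argument is expressed out of 10000, which is what the "0.1%" comment and the argument 10 in the diff together imply. The names sketch_rand and sketch_failpoint, and the xorshift generator, are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a random number source (xorshift32); state must be nonzero. */
static uint32_t
sketch_rand(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return (*state = x);
}

/*
 * Hypothetical failpoint: when enabled, report failure roughly "probability" times out of
 * 10000 calls (assumed unit). With probability 10 that is about 0.1% of calls.
 */
static bool
sketch_failpoint(uint32_t *rng, bool enabled, unsigned probability)
{
    assert(probability <= 10000);

    if (!enabled)
        return (false);
    return (sketch_rand(rng) % 10000 < probability);
}
```

A caller would seed the generator with any nonzero value and call sketch_failpoint(&rng, true, 10) at the point where the failure should be injected.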
|
The scope of this ticket is to:
- Understand and fix the root cause of the memory leak.
- Restructure the code in evict_page.c to be more obvious, and introduce documented constraints if there are times when an eviction is not allowed to fail (hopefully there are none, but the cleanup is worthwhile regardless).
- Consider updating how memory is tracked across reconciliations to ensure cleanup is obviously correct.
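As an illustration of the kind of cleanup discipline the last two items have in mind, here is a minimal sketch in which a review step fails after reconciliation has already allocated memory. It shows one plausible shape of the problem, not the confirmed root cause. All names (sketch_rec_result, sketch_reconcile, sketch_evict_review, and so on) are hypothetical and do not correspond to WiredTiger structures or functions; the allocation counter exists only so the demo can check that nothing is left behind.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_BLOCKS 8

/* Hypothetical reconciliation result: owns copies of the block images it produced. */
struct sketch_rec_result {
    void *images[SKETCH_MAX_BLOCKS];
    size_t count;
};

static long sketch_live_allocs = 0; /* Outstanding allocations, for the demo only. */

/* Duplicate a buffer and count the allocation. */
static void *
sketch_memdup(const void *src, size_t len)
{
    void *p = malloc(len);

    if (p != NULL) {
        memcpy(p, src, len);
        ++sketch_live_allocs;
    }
    return (p);
}

/* Single place that releases everything a reconciliation allocated. */
static void
sketch_rec_discard(struct sketch_rec_result *r)
{
    assert(r != NULL);

    for (size_t i = 0; i < r->count; ++i) {
        free(r->images[i]);
        --sketch_live_allocs;
    }
    r->count = 0;
}

/* Pretend to reconcile a page into nblocks blocks, copying each block image. */
static int
sketch_reconcile(struct sketch_rec_result *r, size_t nblocks)
{
    static const char image[] = "page image";

    r->count = 0;
    for (size_t i = 0; i < nblocks && i < SKETCH_MAX_BLOCKS; ++i) {
        void *copy = sketch_memdup(image, sizeof(image));
        if (copy == NULL) {
            sketch_rec_discard(r);
            return (ENOMEM);
        }
        r->images[r->count++] = copy;
    }
    return (0);
}

/* Review step: any failure after reconciliation goes through the same cleanup path. */
static int
sketch_evict_review(struct sketch_rec_result *r, bool inject_failure)
{
    int ret;

    if ((ret = sketch_reconcile(r, 7)) != 0)
        return (ret);

    if (inject_failure) {
        ret = EBUSY;
        goto err;
    }
    return (0);

err:
    sketch_rec_discard(r);
    return (ret);
}
```

The design point is that a bare return after reconciliation would skip sketch_rec_discard and leave the copies allocated; routing every late failure through one err label makes the cleanup visible at a glance. On success the caller owns the result and is responsible for calling sketch_rec_discard later.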