- Type: Task
- Resolution: Done
- Affects Version/s: None
- Component/s: None
Add support for tracking which spinlocks block other spinlocks and displaying it as part of the logged statistics.
michaelcahill – I've been thinking about how to get a handle on which threads are blocking other threads, and why, and I came up with this.
I added an array to each spinlock structure. If an attempt to acquire a spinlock blocks, we increment a slot in the array, tracking the code path that was holding the spinlock we couldn't acquire. Then, I tied it into the statistics-logging stuff, so we can track it over time.
It's not too invasive to the code; the big downside is that there's an additional memory flush as part of acquiring a spinlock, which slows things down. If you decide this is worth keeping around, we should make it configurable separately from statistics logging. We shouldn't be doing this unless the application explicitly requests it; it's more for us than for anybody else.
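To make the idea concrete, here's a minimal sketch of the scheme described above, written against C11 atomics rather than the actual WiredTiger spinlock code. Every name in it (tracked_spinlock, SPIN_SITES, the tracked_spin_* functions, and the integer "site" IDs standing in for code paths) is hypothetical and made up for illustration. It assumes each caller passes a small integer identifying its call site; the store to holder_site on every successful acquire is roughly where the extra memory flush would come from.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SPIN_SITES 8	/* hypothetical: number of tracked call sites */

typedef struct {
	atomic_flag locked;			/* the lock itself */
	_Atomic int holder_site;		/* call site of the current holder, -1 if none */
	_Atomic uint64_t blocked_by[SPIN_SITES];	/* times each holder site blocked someone */
} tracked_spinlock;

static inline void
tracked_spin_init(tracked_spinlock *lock)
{
	int i;

	atomic_flag_clear(&lock->locked);
	atomic_init(&lock->holder_site, -1);
	for (i = 0; i < SPIN_SITES; i++)
		atomic_init(&lock->blocked_by[i], 0);
}

/* Try once; on failure, charge the block to the holder's call site. */
static inline int
tracked_spin_trylock(tracked_spinlock *lock, int site)
{
	int holder;

	assert(site >= 0 && site < SPIN_SITES);
	if (!atomic_flag_test_and_set_explicit(
	    &lock->locked, memory_order_acquire)) {
		/* The additional store every acquirer pays for the tracking. */
		atomic_store_explicit(
		    &lock->holder_site, site, memory_order_release);
		return (1);
	}

	/* The holder may not have published its site yet; skip if so. */
	holder = atomic_load_explicit(&lock->holder_site, memory_order_acquire);
	if (holder >= 0)
		atomic_fetch_add_explicit(
		    &lock->blocked_by[holder], 1, memory_order_relaxed);
	return (0);
}

/* Blocking acquire: count at most one block per call, then spin. */
static inline void
tracked_spin_lock(tracked_spinlock *lock, int site)
{
	if (tracked_spin_trylock(lock, site))
		return;
	while (atomic_flag_test_and_set_explicit(
	    &lock->locked, memory_order_acquire))
		;
	atomic_store_explicit(&lock->holder_site, site, memory_order_release);
}

static inline void
tracked_spin_unlock(tracked_spinlock *lock)
{
	atomic_store_explicit(&lock->holder_site, -1, memory_order_relaxed);
	atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

/* Read one counter, e.g. from a statistics-logging pass. */
static inline uint64_t
tracked_spin_blocked_by(tracked_spinlock *lock, int site)
{
	assert(site >= 0 && site < SPIN_SITES);
	return (atomic_load_explicit(
	    &lock->blocked_by[site], memory_order_relaxed));
}
```

A statistics pass could then walk blocked_by[] for each lock and report which holder sites caused the most blocking. The counters are only approximate: a waiter can read holder_site just before the holder publishes it or just after it's cleared, in which case that block goes uncounted.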
Here's what the output looks like, from a wtperf run with a slightly modified update-lsm.wtperf configuration. The code in __evict_lru() at line 634 blocks itself periodically in this run:
619 static int
620 __evict_lru(WT_SESSION_IMPL *session, int clean)
621 {
622         WT_CACHE *cache;
623         WT_DECL_RET;
624         WT_EVICT_ENTRY *evict;
625         uint64_t cutoff;
626         uint32_t i, candidates;
627
628         cache = S2C(session)->cache;
629
630         /* Get some more pages to consider for eviction. */
631         WT_RET(__evict_walk(session, &candidates, clean));
632
633         /* Sort the list into LRU order and restart. */
634         __wt_spin_lock(session, &cache->evict_lock);
Or, here's the pattern for where inserts block other inserts:
I looked around for a tool that would give me this information, but I couldn't find anything. Maybe there's something out there I missed? And, of course, I'm absolutely prepared to have you hate this. Heck, I don't like it much myself.