When profiling skiplists, I discovered that reading from variable-length column stores is over 300x slower than reading from row stores for the TokyoCabinet workload (about 27 minutes versus about 5 seconds of wall-clock time):
$ cd bench/tcbench
$ ./wttest write -bulk file:cabinet.wt_b 1000000 > /dev/null
$ time ./wttest read file:cabinet.wt_b 1000000 > /dev/null
real 0m4.922s
user 0m8.222s
sys 0m1.465s
$ ./wttest vlcswrite -bulk file:cabinet.vlwt_b 1000000 > /dev/null
$ time ./wttest vlcsread file:cabinet.vlwt_b 1000000 > /dev/null
real 26m51.004s
user 35m31.747s
sys 6m25.888s
Profiling shows that this is because we do a full scan of the leaf page on every search (see the attached vlcs-read-profile.png).
One way to fix this would be to use the same trick we use for row leaf pages: instead of maintaining an array of offsets to cells, keep an array of pointers, some on-page and some off-page. Keep the record number in expanded cells and expand the binary search points.
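
To illustrate why the scan is expensive and what an index of record numbers buys, here is a minimal Python sketch. It is not WiredTiger code: the (run_length, value) cell model and the names search_scan and IndexedLeaf are assumptions made for illustration. It also simplifies the proposal by computing the starting record number of every cell up front, rather than expanding only the cells that binary search touches.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical model of a variable-length column-store leaf page: each cell
# is (run_length, value), and the page covers consecutive record numbers
# beginning at start_recno. None of these names come from WiredTiger.


def search_scan(cells, start_recno, recno):
    """Linear search: walk the cells, summing run lengths, on every lookup."""
    first = start_recno
    for run_length, value in cells:
        if first <= recno < first + run_length:
            return value
        first += run_length
    return None


class IndexedLeaf:
    """Leaf page that keeps the first record number of each cell in memory."""

    def __init__(self, cells, start_recno):
        self.cells = cells
        self.start_recno = start_recno
        # offsets[i] is the first record number of cells[i]; the final
        # entry is one past the last record number on the page.
        offsets = list(accumulate((run for run, _ in cells), initial=start_recno))
        self.first_recnos = offsets[:-1]
        self.end_recno = offsets[-1]

    def search(self, recno):
        """Binary search over the per-cell starting record numbers."""
        if not self.start_recno <= recno < self.end_recno:
            return None
        # Rightmost cell whose first record number is <= recno.
        i = bisect_right(self.first_recnos, recno) - 1
        return self.cells[i][1]
```

With n cells on a page, search_scan does O(n) work per lookup while IndexedLeaf.search does O(log n), at the cost of one record number per cell held in memory. Expanding cells lazily, as suggested above, would trade some of that memory for extra work on the first searches of a page.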
vlcs-read-profile.png (49.8 kB) Michael Cahill, 08/04/2011 12:07 am