[SERVER-18914] Use column store instead of row store for data table Created: 10/Jun/15 Updated: 23/Oct/15 Resolved: 23/Oct/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Daniel Pasette (Inactive) | Assignee: | David Hows |
| Resolution: | Won't Fix | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Participants: |
| Description |
|
Changing the data tables (as opposed to the index or metadata/catalog tables) from row store to column store format could be a performance win (see: http://source.wiredtiger.com/develop/schema.html#schema_format_types). Each document is stored with an internal integer id, so normal access patterns can take advantage of columnar lookup. This would require a great deal of testing to prove out, but we've seen up to 30% perf improvements with some read-only workloads. |
| Comments |
| Comment by Githook User [ 16/Sep/15 ] |
|
Author: Mark Benvenuto (markbenvenuto) <mark.benvenuto@mongodb.com>
Message: |
| Comment by David Hows [ 16/Sep/15 ] |
|
None of the recent attempts to reproduce the improvement have succeeded, so I am marking this as "Gone Away". |
| Comment by David Hows [ 16/Sep/15 ] |
|
Ran Vanilla + x86intrin. Results are:
I agree with Martin, and think this can be closed as Gone Away. |
| Comment by Martin Bligh [ 15/Sep/15 ] |
|
I can't reproduce the original perf change on either 3.1.8 or 3.1.4 ... not sure what happened here. |
| Comment by Mark Benvenuto [ 15/Sep/15 ] |
|
keith.bostic, we are setting HAVE_X86INTRIN_H in the Windows MongoDB build. We are not setting it in any other build at the moment. I will fix our various platform builds. |
| Comment by Martin Bligh [ 15/Sep/15 ] |
|
I think the explain was only there to force the query to iterate over the result; we weren't actually interested in what it took. We later decided itcount() was better.
After change:
|
| Comment by Keith Bostic (Inactive) [ 15/Sep/15 ] |
|
david.hows, that's 10%+ on explained queries; how interesting are they? HAVE_X86INTRIN_H won't affect key_format=r, it's row-store only; it might be worth re-running with WiredTiger-Vanilla + HAVE_X86INTRIN_H. The only simplification I can think of for the row-store search in this case would be to get rid of the loop setup and use a switch statement (because we know there's a maximum number of 9 bytes in the row-store key_format=q encoding). If that's faster, we could configure it for row-store with key_format set to Q, q or r. |
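The switch-statement idea above can be sketched roughly as follows. This is an illustration only, not WiredTiger code: the name short_key_compare, its signature and the CMP_BYTE macro are hypothetical, and the only fact taken from the thread is that a key_format=q key occupies at most 9 bytes. Keys of 9 bytes or fewer jump straight into an unrolled comparison; longer keys fall back to a loop first.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * short_key_compare --
 *     Hypothetical helper (not a WiredTiger function): lexicographically
 *     compare two byte strings, returning -1, 0 or 1. When the shorter key
 *     is at most 9 bytes, the switch jumps directly into unrolled code.
 */
static int
short_key_compare(const uint8_t *a, size_t alen, const uint8_t *b, size_t blen)
{
    size_t len = alen < blen ? alen : blen;

/* Compare one byte and advance; return as soon as the bytes differ. */
#define CMP_BYTE()                     \
    do {                               \
        if (*a != *b)                  \
            return (*a < *b ? -1 : 1); \
        ++a;                           \
        ++b;                           \
    } while (0)

    switch (len) {
    default: /* More than 9 bytes: loop until 9 bytes remain. */
        for (; len > 9; --len)
            CMP_BYTE();
        /* FALLTHROUGH */
    case 9: CMP_BYTE(); /* FALLTHROUGH */
    case 8: CMP_BYTE(); /* FALLTHROUGH */
    case 7: CMP_BYTE(); /* FALLTHROUGH */
    case 6: CMP_BYTE(); /* FALLTHROUGH */
    case 5: CMP_BYTE(); /* FALLTHROUGH */
    case 4: CMP_BYTE(); /* FALLTHROUGH */
    case 3: CMP_BYTE(); /* FALLTHROUGH */
    case 2: CMP_BYTE(); /* FALLTHROUGH */
    case 1: CMP_BYTE(); /* FALLTHROUGH */
    case 0:
        break;
    }
#undef CMP_BYTE

    /* The common prefix matched: the shorter key sorts first. */
    if (alen == blen)
        return (0);
    return (alen < blen ? -1 : 1);
}
```

Whether this beats a plain loop or a vectorized comparison would have to be measured on the workload in question; the sketch only shows the shape of the change.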
| Comment by Keith Bostic (Inactive) [ 15/Sep/15 ] |
|
michael.cahill, mark.benvenuto: shouldn't we be setting HAVE_X86INTRIN_H in MongoDB builds on Linux and Windows? |
| Comment by David Hows [ 15/Sep/15 ] |
|
Ran this against e61e8a9cbd3c5c1e5a46fc74f4b5ab5ce879c115 and couldn't find a difference.
MMAPv1:
WiredTiger - Vanilla:
I made the change in key format anyway and got the following results, which suggest a small perf improvement.
Adding the X86INTRIN headers to the compile didn't seem to have any impact (I couldn't find where we set the variable in the existing SCons environment).
|
| Comment by Michael Cahill (Inactive) [ 15/Sep/15 ] |
|
One more point: also check if HAVE_X86INTRIN_H is set in MongoDB builds on Linux (if it is, I can't see where). Try turning it on (or off if I'm wrong about it not being set) with the existing __wt_lex_compare to see what impact it has. |
| Comment by Michael Cahill (Inactive) [ 15/Sep/15 ] |
|
daveh86, can you please look at the workload in
Then take a look at changing record stores to key_format=r to see what performance is like: can you see an improvement?
Now the hard part: can we get some of that improvement without changing the on-disk format? In particular, can you use perf and/or Zoom to figure out where time is going in row-store lookups, then investigate whether any of it can be shaved off?
One idea I think is worth trying is replacing __wt_lex_compare and __wt_lex_compare_skip with really simple implementations: does that make any difference? If the simpler version runs faster on this workload, maybe we could selectively choose between a simple implementation for some key formats and the more general version we currently have. |
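As an illustration of what a "really simple" comparison might look like for such an experiment, here is a minimal sketch built on memcmp. Everything in it is an assumption for illustration: struct key_item and simple_lex_compare are hypothetical names, and the real __wt_lex_compare signature and internals are not shown in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical (pointer, length) pair describing a key; not a WiredTiger type. */
struct key_item {
    const void *data;
    size_t size;
};

/*
 * simple_lex_compare --
 *     Hypothetical simple comparison: memcmp over the common prefix, then
 *     break ties on length. Returns -1, 0 or 1.
 */
static int
simple_lex_compare(const struct key_item *user, const struct key_item *tree)
{
    size_t len = user->size < tree->size ? user->size : tree->size;
    /* Skip memcmp for an empty prefix so empty keys need no valid pointer. */
    int cmp = len == 0 ? 0 : memcmp(user->data, tree->data, len);

    if (cmp != 0)
        return (cmp < 0 ? -1 : 1);
    if (user->size == tree->size)
        return (0);
    return (user->size < tree->size ? -1 : 1);
}
```

Swapping something like this in behind the existing call sites and re-running the workload under perf would show whether the general-purpose comparison is where the time goes.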
| Comment by Martin Bligh [ 15/Jun/15 ] |
|
I believe the current state is that we are deciding whether to use clustered indices or this approach. |
| Comment by Keith Bostic (Inactive) [ 10/Jun/15 ] |
|
I think it's pretty interesting. Column-store doesn't store the (packed) int64 key in the physical file, of course, so there are additional in-memory and on-disk savings from using column-store. Column-store still has to do a binary search of internal pages, but the "key" comparison is generally much less expensive than comparing the byte strings of row-store. Column-store does support a leaf_value_max setting larger than the page size; it's probably only lightly tested, but there's no reason that code should differ significantly from the corresponding row-store code. I'm assuming these are variable-length values, with no repeated values; we might want to make it possible to configure RLE compression off, if it's never going to fire. Row-store has been tuned/tested far more heavily than column-store, so I'd probably give any column-store release a little extra time for a good pounding. |
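To make the comparison-cost point concrete, here is a minimal sketch of a binary search keyed on record numbers. It assumes an internal page can be reduced to a sorted array holding the starting record number of each child, which is a simplification for illustration and not WiredTiger's actual page layout; recno_search is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * recno_search --
 *     Hypothetical sketch: return the index of the last child whose starting
 *     record number is <= recno. Assumes n > 0, starts[] is sorted ascending
 *     and starts[0] <= recno.
 */
static size_t
recno_search(const uint64_t *starts, size_t n, uint64_t recno)
{
    size_t base = 0, limit = n;

    while (limit - base > 1) {
        size_t mid = base + (limit - base) / 2;

        /* Each step is one integer comparison, not a byte-string compare. */
        if (starts[mid] <= recno)
            base = mid;
        else
            limit = mid;
    }
    return (base);
}
```

The search still takes a logarithmic number of steps, as in row-store; the difference is that each step compares two 64-bit integers instead of walking two byte strings.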
| Comment by Alexander Gorrod [ 10/Jun/15 ] |
|
keith.bostic, what do you think of this proposal? The idea is this: MongoDB with WiredTiger currently maintains the data table with an internally managed "RecordId" key that uses q (int64) as its key format. MongoDB also maintains indexes as separate row store tables. Those row store tables contain the index data and the RecordId as a reference into the data collection. All lookups in MongoDB are done through an index. The proposal is that we could switch the data table from a row store with key_format=q to a column store with key_format=r. The motivation is that we've seen cases where the cost of doing a binary search on the row store page forms a significant portion of time in a query (see
One piece of functionality I'm not sure is implemented in our column store implementation is setting a leaf_value_max larger than the page size. |