[SERVER-27261] All secondary indexes are compressed but not primary key(_id) Created: 02/Dec/16 Updated: 08/Feb/23 Resolved: 03/Jan/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.2.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | 아나 하리 | Assignee: | Geert Bosch |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Steps To Reproduce: | 1) start mongodb with below WiredTiger options |
| Sprint: | Storage 2017-01-23 |
| Participants: |
| Description |
|
I've heard that index compression is not very useful, so index blocks are not compressed. But this does not actually appear to be the case. Is this expected MongoDB behavior or not? |
| Comments |
| Comment by Geert Bosch [ 03/Jan/17 ] |
|
While MongoDB doesn't use compression for indexes on WiredTiger, we don't store index keys verbatim. In particular, we use a KeyString format, rather than BSON, which ensures that all keys are binary comparable. This encoding also does things like flipping all bits, depending on the specified ordering, and uses a much more comprehensive encoding for numeric types. For strings with non-simple locales, we use the ICU library to recode strings so they are binary comparable using the locale-specific rules and desired strength (case-insensitive or not, for example). Finally, in some cases prefix and/or suffix compression may be applied: this really isn't "compression" like gzip or snappy, but just not storing the repeated common prefix for a list of keys. All these methods generally result in both a very significant speed improvement and reduced storage and cache pressure. In short, you cannot in general expect to see your strings in literal form in your *.wt files, even with compression turned off. |
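
To make the ideas in the comment above concrete, here is a minimal Python sketch of two of the techniques it mentions: a binary-comparable key encoding with bit flipping for descending order, and prefix compression over a sorted list of keys. This is an illustration only and is not the actual KeyString or WiredTiger on-disk format. The function names `encode_int64`, `prefix_compress`, and `prefix_decompress` are hypothetical, and the integer encoding (sign-bit offset plus big-endian bytes) is an assumption chosen for simplicity.

```python
import struct


def encode_int64(value: int, descending: bool = False) -> bytes:
    """Encode a signed 64-bit integer so that plain byte comparison
    (memcmp-style) orders keys the same way as numeric comparison.

    Illustrative only: NOT the real KeyString layout.
    """
    # Shift the signed range [-2**63, 2**63 - 1] onto [0, 2**64 - 1] so that
    # negative numbers sort before positive ones as big-endian bytes.
    raw = struct.pack(">Q", value + (1 << 63))
    if descending:
        # Flipping every bit reverses the byte-wise sort order.
        raw = bytes(b ^ 0xFF for b in raw)
    return raw


def prefix_compress(sorted_keys):
    """Store each key as (bytes shared with the previous key, remaining suffix)."""
    entries = []
    prev = b""
    for key in sorted_keys:
        shared = 0
        limit = min(len(prev), len(key))
        while shared < limit and prev[shared] == key[shared]:
            shared += 1
        entries.append((shared, key[shared:]))
        prev = key
    return entries


def prefix_decompress(entries):
    """Rebuild the original keys from (shared length, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


if __name__ == "__main__":
    values = [5, -3, 0, 1000, -1000]
    # Sorting by the encoded bytes gives numeric order (or its reverse).
    print(sorted(values, key=encode_int64))
    print(sorted(values, key=lambda v: encode_int64(v, descending=True)))
    # Neighbouring keys share a prefix, so only the differing tail is stored.
    print(prefix_compress([b"apple", b"apply", b"apt"]))
```

Even in this toy version, neither the encoded integers nor the prefix-compressed strings appear in literal form in the stored bytes, which is consistent with the point made in the comment about inspecting `*.wt` files.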