[SERVER-57438] XOR based compression for doubles Created: 04/Jun/21 Updated: 06/Dec/22 Resolved: 02/Aug/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Henrik Edin | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Storage Execution |
| Sprint: | Execution Team 2021-06-14 |
| Participants: |
| Description |
|
Implement an encoder and decoder for the XOR-based compression of double-precision floating-point values described in the paper "Gorilla: A Fast, Scalable, In-Memory Time Series Database" by Pelkonen et al. Each double is XORed with the previous value, and the leading and trailing zero bits of the result are counted. The bits between them are called the meaningful bits. Each value is written to the bit stream as one of the following control-bit sequences, followed by data:
'0': The XOR yielded zero (the value is the same as the previous one); nothing more is stored.
'10': The XOR has at least as many leading zeros and at least as many trailing zeros as the most recent '11' entry, so that entry's leading and trailing zero counts are reused and only the meaningful bits are stored.
'11': The number of leading zeros is stored in the next 5 bits, then the number of meaningful bits in the next 6 bits, and finally the meaningful bits themselves. |
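The scheme in the description can be sketched in Python as follows. This is a minimal illustration, not the server implementation. Several details are assumptions the ticket does not specify: the first value is stored as a raw 64-bit pattern, the leading-zero count is capped at 31 so it fits in 5 bits, the meaningful-bit count is stored minus one so that 64 fits in 6 bits, and the decoder is told how many values to expect. The names `encode`, `decode`, `BitWriter`, and `BitReader` are hypothetical.

```python
import struct

def double_to_bits(x):
    # Reinterpret a double as its 64-bit IEEE 754 pattern.
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def bits_to_double(b):
    return struct.unpack(">d", struct.pack(">Q", b))[0]

class BitWriter:
    def __init__(self):
        self.bits = []  # one int (0 or 1) per bit, for clarity over speed

    def write(self, value, nbits):
        # Append the low nbits of value, most significant bit first.
        for shift in range(nbits - 1, -1, -1):
            self.bits.append((value >> shift) & 1)

class BitReader:
    def __init__(self, bits):
        self.bits, self.pos = bits, 0

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            value = (value << 1) | self.bits[self.pos]
            self.pos += 1
        return value

def encode(values):
    w = BitWriter()
    if not values:
        return w.bits
    prev = double_to_bits(values[0])
    w.write(prev, 64)  # assumption: first value stored uncompressed
    prev_lead = prev_trail = None  # window set by the last '11' entry
    for v in values[1:]:
        cur = double_to_bits(v)
        xor, prev = cur ^ prev, cur
        if xor == 0:
            w.write(0b0, 1)  # '0': same as previous value
            continue
        lead = min(64 - xor.bit_length(), 31)  # assumption: cap to 5 bits
        trail = (xor & -xor).bit_length() - 1  # index of lowest set bit
        if prev_lead is not None and lead >= prev_lead and trail >= prev_trail:
            w.write(0b10, 2)  # '10': reuse the previous window
            w.write(xor >> prev_trail, 64 - prev_lead - prev_trail)
        else:
            meaningful = 64 - lead - trail
            w.write(0b11, 2)  # '11': store a new window
            w.write(lead, 5)
            w.write(meaningful - 1, 6)  # assumption: 1..64 stored as 0..63
            w.write(xor >> trail, meaningful)
            prev_lead, prev_trail = lead, trail
    return w.bits

def decode(bits, count):
    if count == 0:
        return []
    r = BitReader(bits)
    prev = r.read(64)
    out = [bits_to_double(prev)]
    lead = trail = 0
    while len(out) < count:
        if r.read(1) == 0:
            xor = 0
        else:
            if r.read(1) == 1:  # '11': read a new window first
                lead = r.read(5)
                trail = 64 - lead - (r.read(6) + 1)
            xor = r.read(64 - lead - trail) << trail
        prev ^= xor
        out.append(bits_to_double(prev))
    return out
```

For example, `decode(encode(values), len(values))` should return the original list for any finite doubles. A run of repeated values costs one bit per repeat after the first value.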