[SERVER-64925] Use block compression for secondary indexes on clustered collections Created: 25/Mar/22 Updated: 26/Oct/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | clustered_collections, former-storex-namer | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Storage Execution
|
||||||||
| Sprint: | Execution Team 2022-10-17 | ||||||||
| Participants: | |||||||||
| Description |
|
We don't use WiredTiger block compression for secondary indexes on non-clustered collections because we already use prefix compression and the stored RecordIds are pretty compact. Clustered collections have larger RecordIds which take up more space, but may compress better. We should evaluate using block compression for secondary indexes on clustered collections to reduce storage size. |
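For local evaluation, one way to request block compression on a single secondary index is to pass a WiredTiger `configString` in the index spec's `storageEngine` option. The sketch below only builds the `createIndexes` command document and does not connect to a server. The helper name `build_create_index_command`, the collection and field names, and the choice of `snappy` as the default are all assumptions for illustration, not anything the ticket prescribes.

```python
from typing import Any, Dict

# Compressor names assumed to be accepted by the server's WiredTiger build;
# "none" disables block compression.
KNOWN_COMPRESSORS = {"none", "snappy", "zlib", "zstd"}


def build_create_index_command(
    collection: str,
    key_field: str,
    block_compressor: str = "snappy",
) -> Dict[str, Any]:
    """Build a createIndexes command for one ascending secondary index
    that asks WiredTiger to use the given block compressor.

    Hypothetical helper: it only assembles the command document.
    """
    if block_compressor not in KNOWN_COMPRESSORS:
        raise ValueError(f"unknown block compressor: {block_compressor!r}")
    index_spec = {
        "key": {key_field: 1},
        "name": f"{key_field}_1",
        # Per-index storage engine options, passed through to WiredTiger.
        "storageEngine": {
            "wiredTiger": {"configString": f"block_compressor={block_compressor}"}
        },
    }
    # The command name must be the first key; dicts preserve insertion order.
    return {"createIndexes": collection, "indexes": [index_spec]}
```

With PyMongo, the resulting document could be sent with `db.command(build_create_index_command("events", "userId", "zstd"))` against a test deployment. Building the same index once with `none` and once with a real compressor gives the two data points to compare.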
| Comments |
| Comment by Connie Chen [ 11/Nov/22 ] |
|
michael.gargiulo@mongodb.com - taking this out of the desired bucket and throwing it into the backlog. Let us know if we should reconsider |
| Comment by Louis Williams [ 04/Oct/22 ] |
|
matthew.saltz@mongodb.com, the existing workload only creates one secondary index. We could potentially create a new workload with more indexes, but I would start by looking at the existing workload first. There is also a version of that workload that uses larger RecordIds, which would be an interesting point of comparison with the version that uses smaller RecordIds.
So no, I don't think we necessarily need a new workload. Some local testing could answer the questions that we're after. |
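A minimal sketch of how such a local comparison could be summarized, assuming the per-index byte counts come from the `indexSizes` field of `collStats` taken on two otherwise identical collections (one without and one with block compression on its secondary indexes). The function name `index_size_savings` is hypothetical.

```python
from typing import Dict, Mapping


def index_size_savings(
    baseline: Mapping[str, int],
    candidate: Mapping[str, int],
) -> Dict[str, float]:
    """Return the fraction of bytes saved per index name.

    A positive value means the candidate index is smaller than the baseline.
    Indexes missing from the candidate, or with a non-positive baseline
    size, are skipped.
    """
    savings: Dict[str, float] = {}
    for name, base_bytes in baseline.items():
        if name not in candidate or base_bytes <= 0:
            continue
        savings[name] = 1.0 - candidate[name] / base_bytes
    return savings
```

Storage size is only one side of the tradeoff; read and write latency on the same workload would need to be measured separately.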
| Comment by Matthew Saltz (Inactive) [ 03/Oct/22 ] |
|
louis.williams@mongodb.com For performance testing, do you think existing sys-perf workloads will be sufficient or will we need to write targeted tests for this? |
| Comment by Connie Chen [ 29/Mar/22 ] |
|
We'll want to evaluate the performance tradeoffs before committing to this change.