[SERVER-51466] Investigate support for building a multikey index from an existing multikey index Created: 09/Oct/20 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Storage Execution
|
| Description |
|
We should be able to copy the multikey state from the existing multikey index without having to track any inserts into the bulk builder. For example, say we have an existing multikey index on {a: 1, b: 1} and the following keys (a, b, RecordId) generated from a document { a: [1, 2, 3], b: 1 }:
If we use this index to build a new index on {a: 1}, we can directly insert these keys into the new index:
Since the multikey keys are already in the existing index, and we're building the new index on 'a', which is the field that makes the existing index multikey, the new index should be able to copy the existing multikey paths directly, truncating everything after "a". |
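The proposal above can be illustrated with a small sketch. This is a simplified Python model, not the server's implementation: the function names `generate_keys` and `copy_prefix_index`, the RecordId value 7, and the representation of multikey paths as one set of path-component positions per indexed field are all assumptions made for illustration. It only shows the shape of the idea: keys from the {a: 1, b: 1} index are truncated to the 'a' prefix, and the multikey paths are truncated the same way, without inspecting any document during the copy.

```python
from itertools import product


def generate_keys(doc, fields, record_id):
    """Return (keys, multikey_paths) for one document in a simplified index model."""
    per_field, multikey_paths = [], []
    for field in fields:
        value = doc.get(field)
        if isinstance(value, list):
            # An array expands into one key per element and marks the field multikey.
            per_field.append(value)
            multikey_paths.append({0})
        else:
            per_field.append([value])
            multikey_paths.append(set())
    if sum(1 for p in multikey_paths if p) > 1:
        raise ValueError("sketch supports at most one array field per document")
    keys = [combo + (record_id,) for combo in product(*per_field)]
    return keys, multikey_paths


def copy_prefix_index(source_keys, source_paths, prefix_len):
    """Build keys and multikey paths for an index on the first prefix_len fields."""
    truncated = (key[:prefix_len] + (key[-1],) for key in source_keys)
    # Assumption: duplicates produced by truncation are dropped, order is preserved.
    new_keys = list(dict.fromkeys(truncated))
    return new_keys, source_paths[:prefix_len]


# Hypothetical RecordId 7 for the document { a: [1, 2, 3], b: 1 }.
keys, paths = generate_keys({"a": [1, 2, 3], "b": 1}, ["a", "b"], record_id=7)
new_keys, new_paths = copy_prefix_index(keys, paths, prefix_len=1)
print(new_keys)   # [(1, 7), (2, 7), (3, 7)]
print(new_paths)  # [{0}]
```

In this model the new index never has to observe an array value during the build; its multikey state comes entirely from the truncated paths of the source index.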
| Comments |
| Comment by Louis Williams [ 12/Oct/20 ] |
Yes, this is the flaw with the current design. However, because of the way we implement multikey keys and metadata, we would either need to look at the document from the collection or somehow keep track of the number of keys seen that map to each RecordId. Both solutions seem expensive.
This was essentially the idea. I was thinking that we don't even need a MODE_X lock. The index build interceptor will handle at least one concurrency problem where a multikey document is inserted during the index build. |
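The second alternative mentioned in this comment, keeping track of how many keys map to each RecordId, could look roughly like the sketch below. It is a hypothetical Python model (the function name `is_multikey_by_key_count` and the key layout with the RecordId as the last tuple element are assumptions), and it is only meant to show why the approach is costly: the counter needs one entry per distinct RecordId seen during the scan.

```python
from collections import Counter


def is_multikey_by_key_count(keys):
    """Guess multikey state by counting index keys per RecordId.

    Each key is a tuple whose last element is the RecordId (an assumption of
    this sketch). Memory use grows with the number of distinct RecordIds.
    """
    counts = Counter(key[-1] for key in keys)
    return any(n > 1 for n in counts.values())


print(is_multikey_by_key_count([(1, 7), (2, 7), (3, 7), (5, 8)]))  # True
print(is_multikey_by_key_count([(1, 7), (5, 8)]))                  # False
```

A limitation of this sketch is that a document whose array has a single element produces only one key, so a pure count would not detect it. That case would still require looking at the document or at existing multikey metadata.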
| Comment by Daniel Gottlieb (Inactive) [ 12/Oct/20 ] |
|
A consideration: Multikey is only ever flipped from false->true. This would mean we may build a new index and declare it multikey even if it is not. That said, I don't expect it to be common for a user to insert some number of multikey documents and then, at a later point, wipe them out.

And a question: When/how will the multikey state be copied? I'd be easily convinced if we read and copy over the state at the end of the index build with a collection MODE_X lock (which I suspect we already get for the ready: true write). Is there a smarter idea this ticket is striving for? |
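The consideration above can be shown with a toy model. The class `IndexModel` below is hypothetical and greatly simplified (a single indexed field, a boolean flag instead of per-path metadata). It only demonstrates that a flag which is never reset stays true after the multikey documents are deleted, so a copy of that flag marks the new index multikey even though no remaining key requires it.

```python
class IndexModel:
    """Toy single-field index whose multikey flag only flips false -> true."""

    def __init__(self):
        self.keys = set()
        self.multikey = False

    def insert(self, value, record_id):
        if isinstance(value, list):
            self.multikey = True  # one-way flip
            values = value
        else:
            values = [value]
        for v in values:
            self.keys.add((v, record_id))

    def remove(self, record_id):
        # Keys are removed, but the multikey flag is deliberately left alone.
        self.keys = {k for k in self.keys if k[1] != record_id}


existing = IndexModel()
existing.insert([1, 2], record_id=1)
existing.remove(record_id=1)

new_index = IndexModel()
new_index.keys = set(existing.keys)
new_index.multikey = existing.multikey  # copied state, possibly stale

print(new_index.keys)      # set()
print(new_index.multikey)  # True
```

In this model the stale flag is conservative: it can only overstate multikey-ness, never understate it.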