[SERVER-51466] Investigate support for building a multikey index from an existing multikey index Created: 09/Oct/20  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Louis Williams Assignee: Backlog - Storage Execution Team
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-36202 when creating new index, use existing... Closed
Assigned Teams:
Storage Execution
Participants:

 Description   

We should be able to copy the multikey state from the existing multikey index without having to track any inserts into the bulk builder.

For example, say we have an existing multikey index on {a: 1, b: 1} and the following keys (a, b, RecordId) generated from a document { a: [1, 2, 3], b: 1 }:

  • (1, 1, RID(1))
  • (2, 1, RID(1))
  • (3, 1, RID(1))

If we use this index to build a new index on {a: 1}, we can directly insert these keys into the new index:

  • (1, RID(1))
  • (2, RID(1))
  • (3, RID(1))

Since the multikey keys are already in the existing index, and we're building an index on 'a', which is the field that makes the existing index multikey, the new index should be able to copy the existing multikey paths directly, truncating everything after "a".
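
The example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the server's implementation: index entries are modelled as (key values, RecordId) tuples, the multikey paths are modelled as one set of path-component positions per indexed field, and `build_prefix_index` is a hypothetical helper name. The sketch also de-duplicates and re-sorts the projected keys, since dropping trailing fields can produce repeats or change the order.

```python
from typing import List, Set, Tuple

Key = Tuple[int, ...]      # indexed field values, e.g. (a, b)
Entry = Tuple[Key, int]    # (key values, RecordId); the modelling is an assumption


def build_prefix_index(
    entries: List[Entry],
    multikey_paths: List[Set[int]],
    prefix_len: int,
) -> Tuple[List[Entry], List[Set[int]]]:
    """Hypothetical helper: derive a prefix index from an existing index."""
    # Keep only the leading fields of each key; a set drops duplicates.
    projected = {(key[:prefix_len], rid) for key, rid in entries}
    # Re-sort by (key, RecordId), the order assumed for the new index.
    new_entries = sorted(projected)
    # Copy the multikey paths, truncating everything after the prefix.
    new_paths = [set(p) for p in multikey_paths[:prefix_len]]
    return new_entries, new_paths


# Keys generated from { a: [1, 2, 3], b: 1 } in the index on {a: 1, b: 1}.
existing = [((1, 1), 1), ((2, 1), 1), ((3, 1), 1)]
existing_paths = [{0}, set()]  # 'a' is multikey at its first component, 'b' is not

new_entries, new_paths = build_prefix_index(existing, existing_paths, prefix_len=1)
```

Here `new_entries` holds the three (a, RecordId) keys listed above, and `new_paths` keeps only the entry for 'a'. No document is read and no insert into the bulk builder is tracked.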



 Comments   
Comment by Louis Williams [ 12/Oct/20 ]

"This would mean we may build a new index and declare it multikey even if it is not."

Yes, this is the flaw with the current design. However, because of the way we implement multikey keys and metadata, we would either need to look at the document from the collection or somehow keep track of the number of keys seen that map to each RecordId. Both solutions seem expensive.
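
To make the cost of the second option concrete, here is a rough sketch of tracking the number of keys seen per RecordId. It assumes the same (key values, RecordId) tuple model as the description, and `is_multikey_by_count` is a hypothetical name. It only detects RecordIds that map to more than one key; whether that matches the server's exact definition of multikey is not addressed here.

```python
from collections import Counter
from typing import Iterable, Tuple

Entry = Tuple[Tuple[int, ...], int]  # (key values, RecordId); an assumed model


def is_multikey_by_count(entries: Iterable[Entry]) -> bool:
    """Hypothetical check: does any RecordId map to more than one key?"""
    # One counter per distinct RecordId, so memory grows with the
    # number of records covered by the index.
    keys_per_rid = Counter(rid for _key, rid in entries)
    return any(count > 1 for count in keys_per_rid.values())


multikey_entries = [((1,), 1), ((2,), 1), ((3,), 1)]  # one document, three keys
plain_entries = [((1,), 1), ((2,), 2), ((3,), 3)]     # three documents, one key each

saw_multikey = is_multikey_by_count(multikey_entries)
saw_plain = is_multikey_by_count(plain_entries)
```

The counter needs an entry for every RecordId in the index, which is the bookkeeping the comment above calls expensive.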

"I'd be easily convinced if we read and copy over the state at the end of the index build with a collection MODE_X lock (which I suspect we already get for the ready: true write)."

This was essentially the idea. I was thinking that we don't even need a MODE_X lock. The index build interceptor will handle at least one concurrency problem where a multikey document is inserted during the index build.

Comment by Daniel Gottlieb (Inactive) [ 12/Oct/20 ]

A consideration: Multikey is only ever flipped from false->true. This would mean we may build a new index and declare it multikey even if it is not. That said, I don't expect it to be common for a user to insert some number of multikey documents and then at a later point, wipe them out.
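
A toy model of the one-way flag may help here. `ToyIndex` is a hypothetical class, not a server type, and it assumes that inserting any array value sets the flag while deletes never clear it.

```python
from typing import List, Set, Tuple, Union


class ToyIndex:
    """Hypothetical single-field index whose multikey flag only goes false->true."""

    def __init__(self) -> None:
        self.multikey = False
        self.keys: Set[Tuple[int, int]] = set()  # (value, RecordId)

    def insert(self, rid: int, value: Union[int, List[int]]) -> None:
        if isinstance(value, list):
            self.multikey = True  # flipped on the first array value seen
            values = value
        else:
            values = [value]
        for v in values:
            self.keys.add((v, rid))

    def delete(self, rid: int) -> None:
        # Removes the keys but deliberately leaves the flag untouched.
        self.keys = {(v, r) for v, r in self.keys if r != rid}


idx = ToyIndex()
idx.insert(1, [1, 2, 3])  # multikey document
idx.insert(2, 7)          # scalar document
idx.delete(1)             # the only multikey document is removed
```

After the delete, `idx.multikey` is still true even though no remaining key came from an array. A new index that copies this state would therefore be declared multikey without containing any multikey keys.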

And a question: When/how will the multikey state be copied? I'd be easily convinced if we read and copy over the state at the end of the index build with a collection MODE_X lock (which I suspect we already get for the ready: true write). Is there a smarter idea this ticket is striving for?

Generated at Thu Feb 08 05:25:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.