[SERVER-62974] Investigate use of persistent maps in CollectionCatalog Created: 25/Jan/22  Updated: 23/Mar/23  Resolved: 23/Mar/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Dan Larkin-York Assignee: Backlog - Storage Execution Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-68674 Vendor an immutable/persistent data s... Closed
Assigned Teams:
Storage Execution
Backwards Compatibility: Fully Compatible
Participants:

 Description   

We create a copy of the CollectionCatalog whenever we need to modify it. The copy is linear in the number of collections, so it gets expensive when there are many. In cases where we hold a global exclusive lock, such as at startup, we can mitigate this with batched writes. In steady state, though, we can't batch. For some workloads, like creating many collections in sequence, the total cost is quadratic, because each creation copies every existing entry.
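To make the quadratic cost concrete, here is a minimal cost-model sketch. It is not the server's code: a plain Python dict stands in for the catalog, and the function names and the `db.coll{i}` namespaces are made up for illustration.

```python
def copies_without_batching(n):
    """Count entries copied when each of n creations clones the whole map."""
    catalog = {}
    copied = 0
    for i in range(n):
        copied += len(catalog)       # the clone copies every existing entry
        catalog = dict(catalog)      # copy-on-write: readers keep the old map
        catalog[f"db.coll{i}"] = i   # hypothetical namespace and payload
    return copied


def copies_with_batching(existing, n):
    """Count entries copied when n creations share a single clone."""
    batch = dict(existing)           # one clone for the whole batch
    for i in range(n):
        batch[f"db.coll{i}"] = i
    return len(existing)
```

With n sequential creations starting from an empty map, the unbatched version copies 0 + 1 + ... + (n - 1) = n(n - 1)/2 entries, while the batched version copies only the entries that existed before the batch started.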

We could potentially reduce the cost of the CollectionCatalog copy so that it isn't linear in the number of collections by using persistent data structures. We already have a persistent map-like structure in our codebase: the radix tree used for EphemeralForTest. At the very least, it could be used for a quick POC to see whether the idea warrants further work.
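As a rough illustration of why a persistent structure avoids the linear copy, the sketch below implements a persistent map by path copying. It is an uncompressed character trie, not the EphemeralForTest radix tree, and `Node`, `insert`, and `lookup` are hypothetical names. An insert copies only the nodes along the key's path and shares every other subtree with the previous version.

```python
class Node:
    """Immutable-by-convention trie node; never mutated after construction."""
    __slots__ = ("children", "value", "has_value")

    def __init__(self, children=None, value=None, has_value=False):
        self.children = children if children is not None else {}
        self.value = value
        self.has_value = has_value


EMPTY = Node()


def insert(root, key, value):
    """Return a new root containing key -> value; `root` is left unchanged."""
    if not key:
        # End of the key: new node that reuses the existing child table.
        return Node(root.children, value, True)
    head, rest = key[0], key[1:]
    child = root.children.get(head, EMPTY)
    children = dict(root.children)            # copy only this node's table
    children[head] = insert(child, rest, value)
    return Node(children, root.value, root.has_value)


def lookup(root, key, default=None):
    """Walk the trie one character at a time."""
    node = root
    for ch in key:
        node = node.children.get(ch)
        if node is None:
            return default
    return node.value if node.has_value else default


v1 = insert(EMPTY, "db.a", 1)
v2 = insert(v1, "db.b", 2)
v3 = insert(v2, "other.c", 3)
```

In this sketch `v1` still sees only `db.a` after `v2` is built, and `v3` reuses the entire `d...` subtree of `v2` by reference. The work per insert is proportional to the key length rather than to the number of entries.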



 Comments   
Comment by Louis Williams [ 01/Feb/22 ]

We should alternatively consider partitioning the catalog.
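One hypothetical reading of this suggestion, sketched under assumptions that are not in the ticket: split the catalog into a fixed number of partitions keyed by a hash of the namespace, so a modification copies only the affected partition. `PartitionedCatalog`, `with_collection`, and the choice of CRC32 are all illustrative.

```python
import zlib


class PartitionedCatalog:
    """Copy-on-write catalog split into fixed hash partitions (illustrative)."""

    def __init__(self, num_partitions=16, partitions=None):
        if partitions is None:
            partitions = tuple({} for _ in range(num_partitions))
        self.partitions = partitions

    def _index(self, ns):
        # Stable hash so a namespace always maps to the same partition.
        return zlib.crc32(ns.encode()) % len(self.partitions)

    def with_collection(self, ns, info):
        """Return a new catalog; only the affected partition is copied."""
        i = self._index(ns)
        changed = dict(self.partitions[i])
        changed[ns] = info
        parts = self.partitions[:i] + (changed,) + self.partitions[i + 1:]
        return PartitionedCatalog(partitions=parts)

    def get(self, ns):
        return self.partitions[self._index(ns)].get(ns)
```

With P partitions and n collections spread evenly, each modification copies roughly n/P entries plus P partition references, which shrinks the constant but stays linear in n for a fixed P.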

Generated at Thu Feb 08 05:56:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.