[SERVER-22375] Investigate maintaining a cache of icu::Collators Created: 29/Jan/16  Updated: 06/Dec/22  Resolved: 14/Jun/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: David Storch Assignee: Backlog - Query Team (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query
Sprint: Query 10 (02/22/16)
Participants:

 Description   

CollatorFactoryICU::makeFromBSON() opens a new icu::Collator each time it is called. Instead we should consider keeping a cache of open icu::Collator pointers. These may be raw pointers that exist for the lifetime of the mongod instance, or they may be shared_ptrs if we intend to evict icu::Collators from the cache when it gets too large.
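
To illustrate the shared_ptr option, here is a minimal sketch of why shared ownership matters once eviction is allowed: a caller that still holds a collator keeps it alive after the cache drops its own reference. FakeCollator, EvictingCache, getOrOpen() and evict() are hypothetical names invented for this sketch. FakeCollator merely stands in for icu::Collator so the example compiles without ICU, and a plain string stands in for the BSON spec. The sketch is deliberately not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for icu::Collator (assumption: not the real ICU type).
struct FakeCollator {
    std::string spec;
};

// Hypothetical cache keyed by a serialized spec string (standing in for BSON).
// No locking here; this sketch only illustrates ownership across eviction.
class EvictingCache {
public:
    // Returns the cached collator for 'spec', opening one if it is absent.
    std::shared_ptr<const FakeCollator> getOrOpen(const std::string& spec) {
        auto it = _entries.find(spec);
        if (it == _entries.end()) {
            auto result = _entries.emplace(
                spec, std::make_shared<const FakeCollator>(FakeCollator{spec}));
            assert(result.second);  // the key was absent, so the insert succeeds
            it = result.first;
        }
        return it->second;
    }

    // Drops the cache's reference; callers that still hold the collator keep it alive.
    bool evict(const std::string& spec) {
        return _entries.erase(spec) > 0;
    }

    std::size_t size() const {
        return _entries.size();
    }

private:
    std::map<std::string, std::shared_ptr<const FakeCollator>> _entries;
};
```

With raw pointers the evict() step would have to either leak the collator or risk deleting it while an operation still uses it, which is why raw pointers only fit the "lives as long as mongod" variant.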

The cache should be a map from the BSON spec to the icu::Collator pointer. The key is the BSON rather than the CollationSpec because creating the CollationSpec requires opening an icu::Collator first. Different BSON specs can of course produce equivalent icu::Collators, so the cache will contain some duplicate icu::Collators.

We can have a single cache of icu::Collators used by all threads. It is not necessary to clone icu::Collators for use in separate threads. That is because after we insert an icu::Collator into the cache, only const methods will be called on it, such as compareUTF8() and getCollationKey().

The pseudocode for CollatorFactoryICU::makeFromBSON() will look like:

    lock cache
    if (BSON exists in cache)
        collator = cached collator
    else
        create collator
        insert (BSON, collator) into cache
    unlock cache
    return collator
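
The lookup-or-create flow above could be sketched in C++ roughly as follows. This is an illustration under stated assumptions, not the server's implementation: StubCollator stands in for icu::Collator so the code compiles without ICU, a std::string stands in for the BSON spec, and CollatorCache, getOrCreate(), opens() and size() are hypothetical names. A std::lock_guard releases the mutex on every return path, so no explicit unlock step is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for icu::Collator (assumption: not the real ICU type).
struct StubCollator {
    std::string spec;
};

// Hypothetical process-wide cache; the key is the serialized spec, standing in for BSON.
class CollatorCache {
public:
    std::shared_ptr<const StubCollator> getOrCreate(const std::string& specKey) {
        std::lock_guard<std::mutex> lk(_mutex);  // "lock cache"; unlocks at scope exit
        auto it = _cache.find(specKey);
        if (it != _cache.end()) {
            return it->second;                   // hit: reuse the open collator
        }
        // Miss: "open" a collator (a real version would call into ICU here).
        auto collator = std::make_shared<const StubCollator>(StubCollator{specKey});
        ++_opens;
        bool inserted = _cache.emplace(specKey, collator).second;
        assert(inserted);                        // the key was absent, so this succeeds
        (void)inserted;
        return collator;
    }

    // Number of collators opened so far, i.e. the number of cache misses.
    std::size_t opens() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _opens;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _cache.size();
    }

private:
    mutable std::mutex _mutex;
    std::map<std::string, std::shared_ptr<const StubCollator>> _cache;
    std::size_t _opens = 0;
};
```

Because the key is the textual spec, two specs that differ only in spelling out a default value would occupy two entries and trigger two opens, which is the duplication mentioned earlier. Returning a pointer to const reflects the expectation that only const methods are called on a cached collator.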

The performance trade-off is the cost of opening icu::Collators versus the cost of locking the icu::Collator cache. The current implementation is simpler and thus preferable if the performance difference is minor.



 Comments   
Comment by David Storch [ 14/Jun/17 ]

This work is no longer planned.
