- Type: Task
- Resolution: Won't Fix
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Querying
- Labels: None
- Query
- Query 10 (02/22/16)
CollatorFactoryICU::makeFromBSON() opens a new icu::Collator each time it is called. Instead we should consider keeping a cache of open icu::Collator pointers. These may be raw pointers that exist for the lifetime of the mongod instance, or they may be shared_ptrs if we intend to evict icu::Collators from the cache when it gets too large.
The cache should be a map from the BSON spec to the icu::Collator pointer. The key is the BSON rather than the CollationSpec because an icu::Collator must be opened in order to create the CollationSpec, which would defeat the purpose of the cache. Different BSON specs can produce equivalent icu::Collators, so the cache will contain some duplicate icu::Collators.
We can have a single cache of icu::Collators used by all threads. It is not necessary to clone icu::Collators for use in separate threads, because once an icu::Collator has been inserted into the cache, only const methods such as compareUTF8() and getCollationKey() will be called on it.
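To illustrate the shared_ptr option mentioned above, here is a minimal sketch of why shared ownership makes eviction safe: a caller that already holds a collator keeps it alive after the cache drops its entry. `SharedStubCollator` and `survivesEviction()` are hypothetical names used only for this example, and a std::string stands in for the BSON spec; none of this is the server's actual code.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for icu::Collator (assumption, not the real type).
struct SharedStubCollator {
    std::string locale;
};

// Returns true if a collator handed out before eviction is still usable
// after the cache has erased its entry.
bool survivesEviction() {
    std::map<std::string, std::shared_ptr<const SharedStubCollator>> cache;
    cache["en_US"] = std::make_shared<SharedStubCollator>(SharedStubCollator{"en_US"});

    // A caller takes a reference; the use count is now 2 (cache + caller).
    std::shared_ptr<const SharedStubCollator> inUse = cache.at("en_US");

    // Evict the entry; the caller's reference keeps the object alive.
    cache.erase("en_US");

    return inUse != nullptr && inUse->locale == "en_US" && inUse.use_count() == 1;
}
```

With raw pointers the same eviction would leave the caller with a dangling pointer, which is why raw pointers only fit the variant where collators live for the lifetime of the process.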
The pseudocode for CollatorFactoryICU::makeFromBSON() will look like:
    lock cache
    if (BSON exists in cache)
        collator = cached collator
    else
        create collator
        insert (BSON, collator) into cache
    unlock cache
    return collator
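The pseudocode above could be sketched in C++ roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: `StubCollator` is a hypothetical stand-in for icu::Collator, a std::string stands in for the BSON spec, and `CollatorCache`, `getOrCreate()`, and `opens()` are names invented for the example.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for icu::Collator; it exposes only const methods.
class StubCollator {
public:
    explicit StubCollator(std::string spec) : _spec(std::move(spec)) {}
    const std::string& spec() const { return _spec; }

private:
    std::string _spec;
};

// Hypothetical process-wide cache keyed by the serialized spec.
class CollatorCache {
public:
    // Returns the cached collator for `spec`, opening a new one on a miss.
    const StubCollator* getOrCreate(const std::string& spec) {
        // The guard unlocks on every return path.
        std::lock_guard<std::mutex> lk(_mutex);
        auto it = _cache.find(spec);
        if (it != _cache.end()) {
            return it->second.get();
        }
        ++_opens;  // stands in for the cost of opening an icu::Collator
        auto inserted = _cache.emplace(spec, std::make_unique<StubCollator>(spec));
        return inserted.first->second.get();
    }

    // Number of collators opened so far (for illustration only).
    int opens() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _opens;
    }

private:
    mutable std::mutex _mutex;
    std::map<std::string, std::unique_ptr<StubCollator>> _cache;
    int _opens = 0;
};
```

In this sketch entries are never evicted, so the returned pointers stay valid for the lifetime of the cache. Two textually different specs always produce two entries, even if a real collator library would treat them as equivalent.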
The performance trade-off is the cost of opening an icu::Collator on every call vs. the cost of locking the shared icu::Collator cache. The current implementation is simpler and thus preferable if the performance difference is minor.