[CSHARP-2450] Performance: Reduced lock contention in BsonSerializer.LookupActualType Created: 11/Dec/18 Updated: 28/Oct/23 Resolved: 30/Mar/21 |
|
| Status: | Closed |
| Project: | C# Driver |
| Component/s: | Performance, Serialization |
| Affects Version/s: | 2.7.2 |
| Fix Version/s: | 2.12.2, 2.13.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Daniel Hegener | Assignee: | James Kovacs |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Description |
|
Performance profiling in our multithreaded environment seems to indicate that BsonSerializer.LookupActualType is on the hot path during deserialization. This is caused by the current reader-writer lock implementation. There is, however, an opportunity to reduce lock contention using a non-generic Hashtable which is safe for concurrent reads as per https://docs.microsoft.com/en-us/dotnet/api/system.collections.hashtable?redirectedfrom=MSDN&view=netframework-4.7.2#thread-safety. |
| Comments |
| Comment by Githook User [ 30/Mar/21 ] | |
|
Author: {'name': 'James Kovacs', 'email': 'jkovacs@post.harvard.edu', 'username': 'JamesKovacs'}Message: | |
| Comment by James Kovacs [ 30/Mar/21 ] | |
|
dnickless On second thought... I've closed this case so we can track fix version as part of our release process. I've forked a new ticket CSHARP-3513 to continue work on LookupActualType improvements in PR #434. | |
| Comment by James Kovacs [ 29/Mar/21 ] | |
|
PR #482 has been incorporated into master (see above). We will leave this ticket open as PR #434 is rebased and performance re-evaluated in light of the PR #482 changes. | |
| Comment by Githook User [ 29/Mar/21 ] | |
|
Author: {'name': 'James Kovacs', 'email': 'jkovacs@post.harvard.edu', 'username': 'JamesKovacs'}Message: | |
| Comment by Daniel Hegener [ 06/Jan/21 ] | |
|
I also created https://github.com/mongodb/mongo-csharp-driver/pull/434 which tackles another performance bottleneck that sits in the same area of the code (LookupActualType) but also impacts serialization. | |
| Comment by Daniel Hegener [ 06/Jan/21 ] | |
|
Thanks, @boris.dogadov, I understand your concerns and also that this is a risky area of the code. However, kindly note that the
method currently does not synchronize access to the __discriminatedTypes field. And this is the only place where that field gets read... So, really, either there already is a potential race condition in the current code or the fact that both fields (__discriminatedTypes and __discriminators) are changed inside the same lock is more of a coincidence than evidence for a dependency between the two. Anyway, since this topic really is a big deal for us, I gave it a second shot: https://github.com/mongodb/mongo-csharp-driver/pull/433 This time, I didn't totally eliminate read locks in LookupActualType but at least shortened the locked section to a bare minimum. I haven't changed a lot really just made sure that the HashSet<Type> collections inside the __discriminators dictionary are kind of immutable. Also, I have added the probably missing read lock inside the above-mentioned IsTypeDiscriminated method. | |
| Comment by Boris Dogadov [ 28/Dec/20 ] | |
|
Currently there is a global assumption that all serialization configuration is protected by single synchronization primitive. Making any path lock-free would require careful inspection of all possible interleavings of read/writes of other configurations parts. One of the examples would be: __discriminatedTypes and __discriminators are expected to be always in a consistent state with each other (modified under the same lock). Making LookupActualType look-free, might result in an inconsistent observed state where __discriminators is updated and __discriminatedTypes still not. | |
| Comment by Daniel Hegener [ 19/Dec/20 ] | |
|
Would you mind elaborating a bit more on your findings, please? Based on my profiling back then (screenshot in the PR) and the benchmark I put on gist which yielded the results described in my other comment above, the improvements appeared rather striking and absolutely worth that change which - IIRC after all the years - was a small and safe thing, just not super pretty because of the moon-generic hashset... I'm happy to try harder again if you tell me what's not yet good in the suggested implementation. | |
| Comment by Boris Dogadov [ 18/Dec/20 ] | |
|
Hi dnickless, thanks for bringing up this issue and submitting the PR. | |
| Comment by Daniel Hegener [ 21/Feb/19 ] | |
|
Hi Ian I am sorry for not getting back any earlier but I've been pretty darn busy recently. Anyway, I have interesting news: I borrowed some fairly realistic test classes from https://csharp2json.io/. Using those, I hacked together the following "unit test": https://gist.github.com/dnickless/334cff0490e788759918c341f78563fa It's really not sophisticated in any way but it certainly does some multithreaded deserialization of 10m documents. On my workstation (2 sockets x 6 cores x Hyperthreading = 24 Cores), I see the following results: BEFORE: [02/21/2019 17:15:46 Informational] ------ Run test started ------
AFTER: [02/21/2019 17:31:24 Informational] ------ Run test started ------
I haven't tried using less cores or no HT or anything around .NET Core or other bitness etc... Nonetheless, assuming I made no mistake in continuing to guarantee thread-safe writes, this would appear to be a massive improvement. | |
| Comment by Ian Whalen (Inactive) [ 17/Dec/18 ] | |
|
dnickless what improvements are you seeing in your PR in both the profiler and actual throughput? | |
| Comment by Ian Whalen (Inactive) [ 17/Dec/18 ] | |