[CSHARP-1942] Reducing delegate allocations Created: 13/Mar/17  Updated: 15/Jan/21  Resolved: 13/Jan/21

Status: Closed
Project: C# Driver
Component/s: Performance
Affects Version/s: 2.4.3
Fix Version/s: 2.12.0

Type: Improvement Priority: Major - P3
Reporter: Aristarkh Zagorodnikov Assignee: Boris Dogadov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: JPEG File Current.jpg     JPEG File Optimized.jpg     PNG File image-2021-01-13-10-25-42-111.png     PNG File image-2021-01-13-10-30-34-813.png    
Issue Links:
Duplicate
duplicates CSHARP-2504 Reduce memory consumption because of ... Closed
Backwards Compatibility: Fully Compatible

 Description   

Hi!

I would like to propose a couple of simple code changes that reduce the number of allocations, lowering GC load and improving performance.
In one of our systems (an ASP.NET application with ~6GiB working set) the described sources of allocation account for ~2.2% of objects created.

The first source of unnecessary allocations is the BsonSerializerRegistry.GetSerializer(Type) method. It passes a method group as the value factory of a concurrent dictionary, relying on the compiler-provided method group -> delegate conversion. Unfortunately, current compilers do not cache the created delegate and recreate it on every call (https://github.com/dotnet/roslyn/issues/5835). The suggested fix is simple: cache the delegate in a member variable.
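
A minimal sketch of the caching idea, not the driver's actual code: SketchRegistry, its string "serializers" and both GetSerializer variants are hypothetical names used only for illustration. It assumes the factory is an instance method, so each method group conversion produces a new delegate object unless the delegate is stored once in a field.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a serializer registry (illustration only).
public sealed class SketchRegistry
{
    private readonly ConcurrentDictionary<Type, string> _cache =
        new ConcurrentDictionary<Type, string>();

    // Delegate created once per registry instance and reused by every lookup.
    private readonly Func<Type, string> _createSerializer;

    public SketchRegistry()
    {
        _createSerializer = CreateSerializer;
    }

    // Allocating variant: the method group is converted to a new
    // Func<Type, string> on every call, even when the key is already cached.
    public string GetSerializerAllocating(Type type)
    {
        return _cache.GetOrAdd(type, CreateSerializer);
    }

    // Cached variant: passes the delegate stored in the field,
    // so a lookup does not allocate a delegate.
    public string GetSerializerCached(Type type)
    {
        return _cache.GetOrAdd(type, _createSerializer);
    }

    // Placeholder factory; a real registry would build a serializer object here.
    private string CreateSerializer(Type type)
    {
        return "serializer for " + type.Name;
    }
}
```

Both variants return the same value for the same type; they differ only in whether a delegate is allocated per call.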

The second source is a bit more complex. BsonWriter allows custom element name validation rules that may depend on the current element name. To implement this, the element name validator factory is represented as a delegate that is replaced whenever the associated state (stack/name) changes. However, all three delegates (in PopElementNameValidator, PushElementNameValidator and WriteStartDocument) refer to an instance member (and even to a local variable), which makes them ineligible for caching. The proposed improvement replaces delegate-based validator creation with a more direct approach.
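
A rough sketch of what "more direct" could look like, under stated assumptions: INameValidator, NoDollarValidator and SketchWriter are hypothetical types invented for this example and do not mirror the driver's real classes or the submitted patch. Instead of storing a closure that will later produce the child validator, the writer keeps the current validator and element name as plain fields and asks for the child validator at the moment a document starts.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical validator contract (illustration only).
public interface INameValidator
{
    bool IsValid(string name);
    INameValidator ForChild(string parentName);
}

// Example rule: element names must not start with '$'.
public sealed class NoDollarValidator : INameValidator
{
    public bool IsValid(string name)
    {
        return !name.StartsWith("$", StringComparison.Ordinal);
    }

    // The same rule applies at every nesting level in this sketch.
    public INameValidator ForChild(string parentName)
    {
        return this;
    }
}

// Hypothetical writer that tracks validators without delegates.
public sealed class SketchWriter
{
    private readonly Stack<INameValidator> _saved = new Stack<INameValidator>();
    private INameValidator _validator;
    private string _name; // null until the first name is written

    public SketchWriter(INameValidator validator)
    {
        _validator = validator;
    }

    public void WriteName(string name)
    {
        if (!_validator.IsValid(name))
        {
            throw new ArgumentException("Invalid element name: " + name);
        }
        _name = name;
    }

    // Delegate-based variant (allocates a delegate capturing 'this' each time):
    //   _childValidatorFactory = () => _validator.ForChild(_name);
    // Direct variant: compute the child validator right here instead.
    public void WriteStartDocument()
    {
        _saved.Push(_validator);
        _validator = _validator.ForChild(_name);
    }

    public void WriteEndDocument()
    {
        _validator = _saved.Pop();
    }
}
```

The design choice is simply to keep the state the closure would have captured in fields, so no delegate has to be created when that state changes.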

The benchmark code is available at https://github.com/onyxmaster/mongobench.



 Comments   
Comment by Boris Dogadov [ 08/Jan/21 ]

Thanks onyxmaster for this contribution.
A similar optimization in BsonWriter.cs has been applied here.

The optimization in BsonSerializerRegistry.cs looks good. Could you please rebase and keep only the BsonSerializerRegistry.cs changes?

Comment by Aristarkh Zagorodnikov [ 07/Dec/20 ]

I wonder why you would close an older issue (from 2017) as a duplicate of a newer one (from 2019) instead of vice versa. The newer one contains neither a better description of the issue nor a better solution (the PR in the second issue does not deal with the root of the problem; the delegate could be removed altogether), and it does not cover one of the issues (see BsonSerializerRegistry).

Comment by Aristarkh Zagorodnikov [ 13/Mar/17 ]

BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4810MQ CPU 2.80GHz, ProcessorCount=8
Frequency=2728058 Hz, Resolution=366.5611 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0
  Job-MXQTWF : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0
 
Jit=RyuJit  Platform=X64  Force=False  
Server=True  

Before:

Method         Mean         StdDev
WriteMessage   440.3268 ns  2.5639 ns
GetSerializer  50.0609 ns   0.3526 ns

After:

Method         Mean         StdDev
WriteMessage   397.6803 ns  3.8869 ns
GetSerializer  42.8741 ns   0.4082 ns

Comment by Aristarkh Zagorodnikov [ 13/Mar/17 ]

https://github.com/mongodb/mongo-csharp-driver/pull/273

Generated at Wed Feb 07 21:41:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.