[CSHARP-904] C# driver memory leak Created: 02/Feb/14  Updated: 05/Apr/16  Resolved: 20/Jun/14

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: 1.8.3
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Vincent Assignee: Unassigned
Resolution: Done Votes: 0
Labels: c#, driver, leak, memory, memory-leak
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows Server 2012 R2 x64 / 8.1 x64


Attachments: PNG File Real app - InstanceCategorizer.png     PNG File Real app - Summary.png     PNG File Test - InstanceCategorizer.PNG     PNG File Test - Summary.PNG     Zip Archive TestMemoryLeak.zip     PNG File ants_results_in_progress.png     Zip Archive memory_leak_test_results.zip    
Issue Links:
Duplicate
is duplicated by CSHARP-1021 Memory usages on Client grows without... Closed

 Description   

Memory leak in the MongoDB .NET driver.
My setup is 6 shards with 2 servers per replica set. The servers are overloaded and throw a lot of transport exceptions (which might be related to the leak).
Eventually this results in an "Out of memory" exception being thrown (VERY fast in the real application, where objects are up to 16MB).

I was able to reproduce the issue using the attached test solution (although it occurs much more slowly than in the real app, because the objects are much smaller).

I'm not sure if the unreliable MongoDB servers are the cause of the memory leak or not.

I can't post the Ants Profiler results, because they could contain sensitive information (connection strings, etc.).



 Comments   
Comment by Vincent [ 20/Jun/14 ]

This problem has just reappeared for me since I upgraded to 2.6.2 (and now 2.6.3, same issue). Everything is fine for a few days, then there is a very sudden burst in memory consumption, to the point where I get an OutOfMemoryException (on a server with 32 GB of RAM!).
I'll try to gather more data about this.

Comment by Craig Wilson [ 31/Mar/14 ]

Hi Vincent/Peter,

After much trying and 3 different memory profilers, I simply can't find any leaks. I do see a steady rise in the amount of memory required, but I believe this is related to the test program. It uses more threads than I have CPUs, and then attempts to parallelize upserts of 1000 documents. Ultimately, this leads to a lot of documents being in memory at any given time. In addition, in order to serialize classes, we end up with 2x the size of a document in memory (one copy for the object and one for the serialized form). I also increased the size of the documents to about 10MB, which makes this much more pronounced. After a long while, things basically leveled off for me. I also added finalizers to all the classes that have lots of allocations and were taking a lot of memory, and none of those finalizers ever got called (because the Dispose methods were functioning correctly).
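As a rough illustration of the finalizer check described above, a minimal sketch (hypothetical class name, not one of the driver's actual types): if Dispose() is called correctly the finalizer is suppressed and never runs; if the finalizer does run, the object was collected without ever being disposed.

using System;

// Hypothetical example: a disposable type instrumented to report when it is
// collected without Dispose() ever having been called.
public class TrackedResource : IDisposable
{
    public void Dispose()
    {
        // Normal cleanup would happen here.
        GC.SuppressFinalize(this); // the finalizer will not run after a proper Dispose()
    }

    ~TrackedResource()
    {
        // Reaching this point means the object was never disposed.
        Console.WriteLine("TrackedResource finalized without being disposed");
    }
}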

I'm out of ideas at this point, so I'd love to hear your thoughts. I'm not saying there isn't a problem, I'm just not finding it.
Craig

Comment by Craig Wilson [ 13/Mar/14 ]

I don't think it was ever assigned to me, I just happened to be triaging it with your help.

Update: not much in the way of good news. This kind of leak is extremely difficult to find and I haven't yet located it. Still looking. I'm trying different things. For instance, with your sample program, I've changed the size of the documents that get inserted/updated to 1MB. That should make the leak more pronounced. I'm running it now...

Comment by Peter Aberline [ 13/Mar/14 ]

Hi
I've been running it for 24 hours and haven't seen any exceptions so far.

Any updates on this, Craig? I see this issue is no longer assigned to you and is currently unassigned.

Thanks,
Peter

Comment by Vincent [ 11/Mar/14 ]

No, it catches the exceptions thrown on the upsert. In fact I noticed that the leak was happening all the time when my servers were overloaded (and a lot of exceptions were thrown), but I'm not sure it still happens on my system since I upgraded to much more powerful servers. I'm NOT SURE – which means it can still happen, but I didn't check. That's why I'm asking; maybe it's a good lead to follow.
However, I'm very pleased that another user has confirmed the issue I'm (was?) experiencing.

try
{
    // Upsert: replace the document matching obj.Id, inserting it if it does not exist.
    collection.Update(
        Query.EQ("_id", obj.Id),
        Update.Replace(obj),
        new MongoUpdateOptions
        {
            WriteConcern = WriteConcern.Acknowledged,
            Flags = UpdateFlags.Upsert
        });
}
catch (Exception e)
{
    Console.WriteLine(e.Message);
}

Comment by Peter Aberline [ 11/Mar/14 ]

Hi Vincent,

All I did was run the code you supplied. When it ran out of resources it did throw exceptions, yes. But during execution it didn't appear to throw any, as your code stops the worker thread in that case. To be sure, I've added some explicit logging and I'm running it again.

Thanks
Peter.

Comment by Vincent [ 11/Mar/14 ]

Hi Peter,
Did you experience Mongo exceptions during the leak?

Comment by Peter Aberline [ 11/Mar/14 ]

Hi
Thanks for looking further into this. Since this is now confirmed by at least two different users, with a supporting Ants profile showing a leak, can this case be escalated to a higher priority? I need to eliminate it as the cause of errors we have been seeing.

Thanks,
Peter

Comment by Craig Wilson [ 11/Mar/14 ]

Hi Peter,

Thanks for this run. What your profile results seem to show is that our BsonChunkPool is holding on to memory. That is what it is supposed to do, since it is a pool of BsonChunks. I'm going to start investigating whether we are leaking chunks that aren't getting returned. If that were happening, I'd expect it to manifest a little differently, but you never know.
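A simplified sketch of the distinction described here, using a generic pool rather than the driver's actual BsonChunkPool implementation: a pool holding on to pooled chunks is expected behavior, whereas chunks that are acquired but never returned cannot be reused and force a fresh allocation on every Acquire().

using System.Collections.Concurrent;

// Generic illustration only (not the driver's BsonChunkPool): a pool of byte[] chunks.
public class SimpleChunkPool
{
    private readonly ConcurrentBag<byte[]> _chunks = new ConcurrentBag<byte[]>();
    private readonly int _chunkSize;

    public SimpleChunkPool(int chunkSize)
    {
        _chunkSize = chunkSize;
    }

    public byte[] Acquire()
    {
        byte[] chunk;
        // Reuse a pooled chunk if one is available; otherwise allocate a new one.
        return _chunks.TryTake(out chunk) ? chunk : new byte[_chunkSize];
    }

    public void Return(byte[] chunk)
    {
        // A code path that acquires a chunk but never calls Return() defeats the
        // pool: every such acquisition costs a fresh allocation.
        _chunks.Add(chunk);
    }
}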

Craig

Comment by Peter Aberline [ 09/Mar/14 ]

Hi,
It's now failed while running within Ants Profiler.
In summary:

  • The working set started at around 27 MB.
  • It was reasonably stable at ~30 MB for around 24 hours.
  • After that it started increasing gradually until the 31-hour mark, plateauing at around 55 MB.
  • It was again reasonably stable until 62 hours, when it started increasing again; it crashed at 63 hours with a working set of 65 MB.

I've attached the Ants Profile results as "memory_leak_test_results.zip".

Comment by Craig Wilson [ 07/Mar/14 ]

I've been running it since yesterday and am not seeing much of anything. It slowly increases for an initial period, then levels off and stays consistent. I'm watching the Private Bytes and # Bytes in all Heaps performance counters and they aren't really showing much. I'll look forward to your report... Thanks for helping with this.

Comment by Peter Aberline [ 07/Mar/14 ]

I've been running this overnight on a test VM using Ants Profiler and I'm getting some interesting results. For the first 12 hours or so the memory usage was stable, but after that it seems to have sprung a small leak, and the memory usage of the test program is now slowly increasing. I've posted a screenshot as "ants_results_in_progress.png".

I'll leave it running over the weekend and post the Ants results when it finally runs out of resources.

Thanks
Peter

Comment by Craig Wilson [ 06/Mar/14 ]

Ok. I'll let it run overnight and see what happens.

Comment by Peter Aberline [ 06/Mar/14 ]

I reproduced it with a single node running on my dev box. No shards, no replica sets.
The only change I made to the test program was to change the connection string to "mongodb://127.0.0.1:27017/TestDev".

Comment by Craig Wilson [ 06/Mar/14 ]

I didn't let it run overnight, so I probably just didn't let it run long enough... Can you tell me about your shards? Is one up and one down? Are they all up? Are they all down? It breaks early if none are up, and doesn't break if they are all available. I was also using a server on my local box, so what exactly is the setup you're using when running this?

Comment by Peter Aberline [ 06/Mar/14 ]

And this was the output of the test program:

12428 - Started
12429 - Started
12430 - Started
Too many threads are already waiting for a connection.

Unhandled Exception: Too many threads are already waiting for a connection.
Too many threads are already waiting for a connection.
Too many threads are already waiting for a connection.
Too many threads are already waiting for a connection.
Too many threads are already waiting for a connection.
Too many threads are already waiting for a connection.
Too many threads are already waiting for a connection.

Unhandled Exception: Too many threads are already waiting for a connection.
Too many threads are already waiting for a connection.
Too many threads are already waiting for a connection.
System.AggregateException: One or more errors occurred. ---> MongoDB.Driver.MongoConnectionException: Too many threads are already waiting for a connection.
   at MongoDB.Driver.Internal.MongoConnectionPool.AcquireConnection(AcquireConnectionOptions options)
   at MongoDB.Driver.MongoServerInstance.AcquireConnection()
   at MongoDB.Driver.MongoServer.AcquireConnection(ReadPreference readPreference)
   at MongoDB.Driver.MongoCursor`1.MongoCursorConnectionProvider.AcquireConnection()
   at MongoDB.Driver.Operations.QueryOperation`1.GetFirstBatch(IConnectionProvider connectionProvider)
   at MongoDB.Driver.Operations.QueryOperation`1.<Execute>d__0.MoveNext()
   at System.Collections.Concurrent.Partitioner.DynamicPartitionerForIEnumerable`1.InternalPartitionEnumerable.GrabChunk_Buffered(KeyValuePair`2[] destArray, Int32 requestedChunkSize, Int32& actualNumElementsGrabbed)
   at System.Collections.Concurrent.Partitioner.DynamicPartitionerForIEnumerable`1.InternalPartitionEnumerator.GrabNextChunk(Int32 requestedChunkSize)
   at System.Collections.Concurrent.Partitioner.DynamicPartitionEnumerator_Abstract`2.MoveNext()
   at System.Threading.Tasks.Parallel.<>c__DisplayClass32`2.<PartitionerForEachWorker>b__30()
   at System.Threading.Tasks.Task.InnerInvoke()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass11.<ExecuteSelfReplicating>b__10(Object param0)
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at System.Threading.Tasks.Parallel.PartitionerForEachWorker[TSource,TLocal](Partitioner`1 source, ParallelOptions parallelOptions, Action`1 simpleBody, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
   at System.Threading.Tasks.Parallel.ForEachWorker[TSource,TLocal](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
   at System.Threading.Tasks.Parallel.ForEach[TSource](IEnumerable`1 source, Action`1 body)
   at TestMemoryLeak.Program.runTest(String threadName) in c:\Users\paberline\Downloads\TestMemoryLeak\TestMemoryLeak\TestMemoryLeak\Program.cs:line 87
   at TestMemoryLeak.Program.<>c__DisplayClass7.<Main>b__2() in c:\Users\paberline\Downloads\TestMemoryLeak\TestMemoryLeak\TestMemoryLeak\Program.cs:line 67
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
Too many threads are already waiting for a connection.

[A second, identical System.AggregateException stack trace followed here, again ending with "Too many threads are already waiting for a connection."]

Too many threads are already waiting for a connection. [this message repeated 20 times]
12430 - Upserting new stuff
12429 - Upserting new stuff
11568 - 7,011,192 Bytes
Too many threads are already waiting for a connection. [this message repeated 116 times]
11569 - 6,838,144 Bytes

Comment by Peter Aberline [ 06/Mar/14 ]

Hi Craig,
Thanks for the update. It's strange that you can't reproduce it. My desktop dev environment is pretty standard. All I did was run the test program overnight. I'm running Mongo 2.4.8 on my desktop machine.

Comment by Craig Wilson [ 06/Mar/14 ]

Hi Peter. I'm looking into it. Nothing jumped out as the cause, and the problem doesn't seem to occur when the servers are stable and only occasionally when they aren't (and even then, it seems to self-correct), so I'm really still just trying to reproduce it.

Comment by Peter Aberline [ 06/Mar/14 ]

Any updates on this? I've looked through the GitHub commits and haven't seen any check-ins for this.
I've been able to reproduce this on my local dev machine using the example program connected to a local single-node Mongo, and I'm currently running it under Ants Memory Profiler. I'll post the results when it fails.

We have seen "Timeout waiting for a connection" and "Too many threads are already waiting for a connection" exceptions during our testing. I increased the connectionTimeout, maxPoolSize, waitQueueSize and waitQueueTimeout parameters in the client connection string to address this, but it would be good to be able to eliminate this leak as the cause.
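For reference, a connection string along these lines illustrates that kind of tuning; the values below are made up, and the exact option spellings should be verified against the documentation for your driver version.

using MongoDB.Driver;

public static class ConnectionExample
{
    public static MongoClient Create()
    {
        // Hypothetical tuning values; option names follow the ones mentioned above
        // and may be spelled slightly differently depending on the driver version.
        var url = "mongodb://127.0.0.1:27017/TestDev"
                  + "?maxPoolSize=200"
                  + "&waitQueueSize=500"
                  + "&waitQueueTimeoutMS=30000"
                  + "&connectTimeoutMS=30000";
        return new MongoClient(url);
    }
}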

Thanks
Peter

Comment by Craig Wilson [ 03/Feb/14 ]

Thanks for the report, Vincent. We'll begin looking into this and try to repro.

Craig

Comment by Vincent [ 03/Feb/14 ]

This seems really related to exceptions thrown by the driver, but I don't know which ones exactly. I think some things aren't correctly disposed when an exception is thrown.
Exceptions I remember:

  • No master found
  • Can't connect
  • Timeout waiting for a connection
  • Too many threads are already waiting for a connection

When the servers are OK, I don't have the memory leaks.
This will be pretty hard to reproduce, I guess...
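As a generic illustration of the failure class suspected here (not the driver's actual code): if a pooled resource is acquired and an exception occurs before it is released, the release never happens unless it sits in a finally block.

using System;

public static class DisposalPatterns
{
    // Leaky shape: if work() throws, Dispose() is never reached and the
    // resource is never returned to its pool.
    public static void Leaky(Func<IDisposable> acquire, Action work)
    {
        IDisposable resource = acquire();
        work();
        resource.Dispose();
    }

    // Safe shape: the resource is released even when work() throws.
    public static void Safe(Func<IDisposable> acquire, Action work)
    {
        IDisposable resource = acquire();
        try
        {
            work();
        }
        finally
        {
            resource.Dispose();
        }
    }
}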
