Type: Task
Resolution: Done
Priority: Major - P3
Affects Version/s: None
Component/s: None
Hi,
I have run out of ideas about what to look into, so I am asking here. I hope that is okay.
I am running a batch process to import 1.4 million hotels into our system. I use the Task Parallel Library for it, and there is no .Wait() or .Result anywhere in the code.
My code looks like this:
public sealed class BatchWriter : IBatchWriter<Hotel>
{
    private readonly IMongoCollection<MongoTransformedHotel> collection;
    private readonly List<MongoTransformedHotel> transformedHotels = new List<MongoTransformedHotel>();

    public BatchWriter(IMongoCollection<MongoTransformedHotel> collection)
    {
        this.collection = collection;
    }

    public void Dispose()
    {
    }

    public Task WriteBatch(IEnumerable<Hotel> items)
    {
        foreach (var hotel in items)
        {
            ....
            this.transformedHotels.Add(new MongoTransformedHotel { Id = factsheetId.FactsheetId, HotelCode = hotelCode.Code, Hotel = hotel });
        }

        return TaskHelper.Done;
    }

    public Task Complete()
    {
        var models = this.transformedHotels.Select(x =>
            new ReplaceOneModel<MongoTransformedHotel>(
                Builders<MongoTransformedHotel>.Filter.Eq(y => y.Id, x.Id), x)
            {
                IsUpsert = true
            });

        return this.collection.BulkWriteAsync(models);
    }
}
My batch size is 1000 hotels, and the documents are complex.
The process usually hangs after around 400,000 hotels. When I use Break All in the debugger, I see no threads waiting on a lock or anything similar, but there is one task, and the debugger points to "await batch.Complete()" in the code below. The status of that task is "awaiting". I have absolutely no idea what is going on there. Could the driver be running out of connections, or something like that?
new ActionBlock<T[]>(async b =>
{
    using (var batch = writer.StartWriteBatch())
    {
        await batch.WriteBatch(b);
        await batch.Complete();
    }
},
new ExecutionDataflowBlockOptions { BoundedCapacity = 2 });
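
One way to tell a bulk write that never completes apart from one that is merely slow is to put a deadline on the awaited task. The following is only a sketch: `WithTimeout` and `TaskTimeoutExtensions` are hypothetical helper names that are not part of the MongoDB driver or of my code above, and the timeout value is an arbitrary assumption. It does not cancel the underlying operation; it only surfaces the hang as an exception.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the MongoDB driver.
public static class TaskTimeoutExtensions
{
    // Awaits the given task, but throws TimeoutException if it has not
    // finished within the timeout. The original task keeps running.
    public static async Task WithTimeout(this Task task, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource())
        {
            var delay = Task.Delay(timeout, cts.Token);
            var finished = await Task.WhenAny(task, delay).ConfigureAwait(false);

            if (finished != task)
            {
                throw new TimeoutException(
                    "Task did not complete within " + timeout + ".");
            }

            // Stop the timer, then await the task so that any exception
            // it faulted with is rethrown to the caller.
            cts.Cancel();
            await task.ConfigureAwait(false);
        }
    }
}
```

Inside the ActionBlock delegate this would be used as `await batch.Complete().WithTimeout(TimeSpan.FromMinutes(5));`. If the exception fires, the bulk write itself is stuck rather than the dataflow block being starved of input.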