Details
Task
Resolution: Done
Major - P3
Description
Hi,
I have run out of ideas about what to look for, so I am asking here. I hope that is okay.
I am running a batch process to import 1.4 million hotels into our system. I use the Task Parallel Library for it, and there is no .Wait() or .Result call.
My code looks like this:
public sealed class BatchWriter : IBatchWriter<Hotel>
{
    private readonly IMongoCollection<MongoTransformedHotel> collection;
    private readonly List<MongoTransformedHotel> transformedHotels = new List<MongoTransformedHotel>();

    public BatchWriter(IMongoCollection<MongoTransformedHotel> collection)
    {
        this.collection = collection;
    }

    public void Dispose()
    {
    }

    public Task WriteBatch(IEnumerable<Hotel> items)
    {
        foreach (var hotel in items)
        {
            ....

            this.transformedHotels.Add(new MongoTransformedHotel { Id = factsheetId.FactsheetId, HotelCode = hotelCode.Code, Hotel = hotel });
        }

        return TaskHelper.Done;
    }

    public Task Complete()
    {
        var models =
            this.transformedHotels.Select(x =>
                new ReplaceOneModel<MongoTransformedHotel>(Builders<MongoTransformedHotel>.Filter.Eq(y => y.Id, x.Id), x)
                {
                    IsUpsert = true
                });

        return this.collection.BulkWriteAsync(models);
    }
}
My batch size is 1,000 hotels, and the documents are complex.
After roughly 400,000 hotels the process usually hangs. When I Break All in the debugger, I see no threads waiting on a lock or anything similar, but there is one task, and the debugger points to "await batch.Complete()" in the following code. The status of the task is awaiting. I have absolutely no idea what is going on there. Could it be running out of connections or something like that?
new ActionBlock<T[]>(async b =>
{
    using (var batch = writer.StartWriteBatch())
    {
        await batch.WriteBatch(b);
        await batch.Complete();
    }
}, new ExecutionDataflowBlockOptions { BoundedCapacity = 2 });
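A minimal diagnostic sketch, not a fix: one way to confirm that the bulk write itself never completes would be to race the awaited task against a timeout, so the hang surfaces as an exception instead of a silent stall. The helper name WithTimeout, the class HangDiagnostics, and the label parameter are hypothetical and not part of the code above; only Task.WhenAny and Task.Delay from the base class library are used.

```csharp
using System;
using System.Threading.Tasks;

public static class HangDiagnostics
{
    // Hypothetical helper: awaits `work`, but throws a TimeoutException
    // if it has not finished within `timeout`.
    // Note: the underlying operation is NOT cancelled, only reported.
    public static async Task WithTimeout(this Task work, TimeSpan timeout, string label)
    {
        var delay = Task.Delay(timeout);
        var finished = await Task.WhenAny(work, delay).ConfigureAwait(false);

        if (finished != work)
        {
            throw new TimeoutException("'" + label + "' did not complete within " + timeout + ".");
        }

        // The work finished first; await it so that any exception it threw is propagated.
        await work.ConfigureAwait(false);
    }
}
```

Under these assumptions, the call inside the ActionBlock could be written as await batch.Complete().WithTimeout(TimeSpan.FromMinutes(5), "Complete"); the five-minute value is an arbitrary placeholder. If the exception fires, the stall is inside the awaited write rather than in the dataflow block.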