  1. C# Driver
  2. CSHARP-1308

Difference in bulk() insert performance between JavaScript and C#

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Performance
    • Environment:
      Windows 2012 R2, 64bit. MongoDB 3.0.3, WT, zlib, 1GB Cache size

      Hi,

      Issue summary.

      Originally, the customer reported insert throughput degrading over time when a single thread inserts random values into fields covered by indexes. This is expected and was easily reproduced. However, when running JavaScript in the shell with db.collection.initializeUnorderedBulkOp(), no degradation over time was observed. This result is also easily reproduced and repeatable.

      But after the customer implemented bulk inserts in their C# code (InsertManyAsync with IsOrdered = false), the same degradation over time was observed.

      Expected behaviour.

      The C# program should run faster than (or at least no slower than) the JavaScript script running in the shell.

      Details.

      Indexes

      db.collection.createIndex( { String: 1} )
      db.collection.createIndex( { Number: 1} )
      db.collection.createIndex( { Number: 1, String: -1}  )
      

      Documents

      {
        "_id" : ObjectId("548f2631ea2de960b50f3fcb"),
        "String" : "Hello world 1587206698",
        "Number" : 1653570026
      }
      

      JavaScript

      function insert(count) {
          every = 1000
          var t = new Date()
          var randomNum = Math.random();
          var intField = Math.floor(randomNum * 10000000000);
          var stringField_1 = randomNum.toString(36).substring(2, 12);
          var stringField_2 = "Hello World ";
          for (var i = 0; i < count;) {
              var bulk = db.data.initializeUnorderedBulkOp();
              for (var j = 0; j < every; j++, i++)
                  bulk.insert({
                      ranInt: Math.floor(randomNum * 10000000000),
                      ranString: stringField_2.concat(stringField_1)
                  })
              bulk.execute();
          }
      }
      insert(10000000000)
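
Two details of the script above are worth checking before comparing it with the C# program. First, `randomNum` is computed once, before the loops, so every inserted document carries the same `ranInt` and `ranString` values. Second, the script writes to `db.data` with fields `ranInt` and `ranString`, while the indexes listed earlier are on `String` and `Number`. Below is a minimal sketch of a variant that draws fresh values for each document and uses the indexed field names. The helper names `makeDocument` and `insertRandom` are hypothetical, and the collection is passed in as a parameter rather than hard-coded.

```javascript
// Sketch only: makeDocument and insertRandom are illustrative names,
// not part of the original report.

// Build one document with fresh random values for the indexed fields.
function makeDocument() {
    var randomNum = Math.random();
    return {
        Number: Math.floor(randomNum * 10000000000),
        String: "Hello world " + randomNum.toString(36).substring(2, 12)
    };
}

// Insert `count` documents in unordered bulk batches of `batchSize`.
// `collection` must expose initializeUnorderedBulkOp(), as shell collections do.
function insertRandom(collection, count, batchSize) {
    var inserted = 0;
    while (inserted < count) {
        var bulk = collection.initializeUnorderedBulkOp();
        var n = Math.min(batchSize, count - inserted);
        for (var j = 0; j < n; j++) {
            bulk.insert(makeDocument());
        }
        bulk.execute();
        inserted += n;
    }
    return inserted;
}
```

In the shell this could be invoked as, for example, `insertRandom(db.collection, 1000000, 1000)`, assuming the indexes were created on that same collection.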
      

      C#

      using System;
      using System.Collections.Generic;
      using System.Diagnostics;
      using System.IO;
      using System.Linq;
      using System.Threading;
      using System.Threading.Tasks;
      using MongoDB.Bson;
      using MongoDB.Driver;
      
      namespace MongoDB
      {
          internal sealed class Entity
          {
              public string A { get; set; }
              public int B { get; set; }
      
          }
      
          static class Program
          {
              private static bool shouldStop = false;
              private static long counter = 0;
      
              public static void Main()
              {
                  var client = new MongoClient("mongodb://localhost");
                  var database = client.GetDatabase("test");
                  var collection = database.GetCollection<Entity>("entities");
                  var tasks = new List<Task>();
                  for (var i = 0; i < 8; i++)
                  {
                      tasks.Add(Task.Run(() => WriteEntitiesAsync(collection)));    
                  }
                  tasks.Add(WriteStatisticsAsync());
                  Console.ReadLine();
                  shouldStop = true;
                  Task.WaitAll(tasks.ToArray());
              }
      
              private static async Task WriteEntitiesAsync(IMongoCollection<Entity> collection)
              {
                  var random = new Random();
                  var task = Task.Delay(0);
                  while (!shouldStop)
                  {
                      var entities = Enumerable.Range(0, 1000).Select(_ => new Entity()).ToArray();
                      foreach (var entity in entities)
                      {
                          entity.A = random.Next() + random.Next().ToString() + random.Next();
                          entity.B = random.Next();
                      }
      
                      await task;
      
                      var length = entities.Length;
                      task = collection.InsertManyAsync(
                          entities,
                          new InsertManyOptions {IsOrdered = false}).
                          ContinueWith(_ => Interlocked.Add(ref counter, length));
                  }
              }
      
              private static async Task WriteStatisticsAsync()
              {
                  using (var fileStream = new FileStream("Data.csv", FileMode.Append, FileAccess.Write))
                  using (var streamWriter = new StreamWriter(fileStream))
                  {
                      while (!shouldStop)
                      {
                          counter = 0;
                          await Task.Delay(10000);
                          var line = string.Format("{0},{1}", DateTime.UtcNow, Interlocked.Read(ref counter) / 10);
                          await streamWriter.WriteLineAsync(line);
                          streamWriter.Flush();
                          Console.WriteLine(line);
                      }
                  }
              }
          }
      }
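
Note that the C# program reports throughput as documents per second averaged over 10-second windows and runs eight writer tasks concurrently, whereas the shell script is a single loop with no timing output. To put the two on the same footing, the shell side could record the same metric. Below is a minimal sketch, assuming a hypothetical helper `makeThroughputMeter`. The clock is injectable only so the helper can be exercised without waiting.

```javascript
// Sketch only: makeThroughputMeter is an illustrative helper,
// not part of the original report.
// It accumulates inserted-document counts and, once `windowMs` has elapsed,
// returns the average documents per second for that window.
function makeThroughputMeter(windowMs, now) {
    now = now || Date.now;      // default to the wall clock
    var windowStart = now();
    var count = 0;
    return {
        // Call after each bulk.execute() with the batch size.
        // Returns docs/sec when a window closes, otherwise null.
        record: function (n) {
            count += n;
            var elapsed = now() - windowStart;
            if (elapsed < windowMs) {
                return null;
            }
            var rate = count / (elapsed / 1000);
            windowStart = now();
            count = 0;
            return rate;
        }
    };
}
```

Inside the shell loop, one might call `var rate = meter.record(every);` after each `bulk.execute()` and print a line whenever `rate` is not null, which yields a series comparable to the rows the C# program appends to Data.csv.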
      

      Environment
      Windows 2012 R2, 64bit.
      MongoDB 3.0.3 (with this build), WT, zlib, indexCompression, 1GB Cache size.

      If needed, I can provide access to the server I am running these tests on; it is hosted on AWS.

      Thank you,
      Dima

            Assignee:
            Unassigned
            Reporter:
            Dmitry Agranat (dmitry.agranat@mongodb.com)
            Votes:
            0
            Watchers:
            2
