  1. Java Driver
  2. JAVA-3810

Performance bottleneck in bulk insertion

Details

    • Type: Improvement
    • Resolution: Gone away
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 3.12.6
    • Component/s: None
    • Labels: None

    Description

      Sample code:

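      // Imports needed by the sample (3.12.x driver API); the method itself belongs inside a class.
      import java.util.ArrayList;
      import java.util.List;

      import com.mongodb.MongoClient;
      import com.mongodb.MongoClientURI;
      import com.mongodb.WriteConcern;
      import com.mongodb.client.MongoCollection;
      import com.mongodb.client.MongoDatabase;
      import com.mongodb.client.model.InsertOneModel;
      import org.bson.Document;

      public static void bulkInsert() {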
      		MongoClient mongoClient = new MongoClient(new MongoClientURI("mongodb://192.168.140.129:27017/"));
      		WriteConcern wc = new WriteConcern(0).withJournal(false);
       
      		String databaseName = "test";
      		String collectionName = "testCollection";
       
      		System.out.println("Database: " + databaseName);
      		System.out.println("Collection: " + collectionName);
      		System.out.println("Write concern: " + wc);
       
      		MongoDatabase database = mongoClient.getDatabase(databaseName);
       
      		MongoCollection<Document> collection = database.getCollection(collectionName).withWriteConcern(wc);
       
      		int rows = 1000000;
      		int iterations = 5;
       
      		double accTime = 0;
       
      		for (int it = 0; it < iterations; it++) {
      			database.drop();
       
      			List<InsertOneModel<Document>> docs = new ArrayList<>();
       
      			int batchSize = 1000;
      			int batch = 0;
       
      			long start = System.currentTimeMillis();
       
      			for (int i = 0; i < rows; ++i) {
      				String key1 = "7";
      				String key2 = "8395829";
      				String key3 = "928749";
      				String key4 = "9";
      				String key5 = "28";
      				String key6 = "44923.59";
      				String key7 = "0.094";
      				String key8 = "0.29";
      				String key9 = "e";
      				String key10 = "r";
      				String key11 = "2020-03-16";
      				String key12 = "2020-03-16";
      				String key13 = "2020-03-16";
      				String key14 = "klajdlfaijdliffna";
      				String key15 = "933490";
      				String key17 = "paorgpaomrgpoapmgmmpagm";
       
      				Document doc = new Document("key17", key17).append("key12", key12).append("key7", key7)
      						.append("key6", key6).append("key4", key4).append("key10", key10).append("key1", key1)
      						.append("key2", key2).append("key5", key5).append("key13", key13).append("key9", key9)
      						.append("key11", key11).append("key14", key14).append("key15", key15).append("key3", key3)
      						.append("key8", key8);
       
      				docs.add(new InsertOneModel<>(doc));
       
      				batch++;
       
      				if (batch >= batchSize) {
      					collection.bulkWrite(docs);
      					docs.clear();
      					batch = 0;
      				}
      			}
       
      			if (batch > 0) {
      				collection.bulkWrite(docs);
      				docs.clear();
      			}
       
      			long end = System.currentTimeMillis();
      			double elapsedSecs = (end - start) / 1000.0;
       
      			accTime += elapsedSecs;
       
      			System.out.println("Iteration " + it + " - Elapsed: " + elapsedSecs + " seconds.");
      		}
       
      		System.out.println("Avg: " + (accTime / iterations) + " seconds.");
      		
      		mongoClient.close();
      	}
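
The batching logic in the sample (buffer documents, flush when the buffer reaches the batch size, flush the remainder at the end) can be isolated from the driver so it can be checked on its own. The following is a minimal sketch, not part of the MongoDB driver: `BatchFlusher`, its `sink` callback and `flushCount()` are hypothetical names introduced here for illustration. In the sample, the sink would be the call to `collection.bulkWrite(...)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (an assumption for illustration, not a driver class). */
final class BatchFlusher<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer;
    private int flushes = 0;

    BatchFlusher(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Buffers one item and flushes as soon as the buffer is full. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends any buffered items to the sink; does nothing when the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Hand over a copy so the sink may keep the list after the buffer is cleared.
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
        flushes++;
    }

    int flushCount() {
        return flushes;
    }

    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        BatchFlusher<Integer> flusher =
                new BatchFlusher<>(1000, batch -> sizes.add(batch.size()));
        for (int i = 0; i < 2500; i++) {
            flusher.add(i);
        }
        flusher.flush(); // send the final partial batch
        System.out.println(sizes); // prints [1000, 1000, 500]
    }
}
```

Using `buffer.size()` instead of a separate counter removes the need to keep `batch` and `docs` in sync, which is the only behavioural difference from the loop in the sample.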
      

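      Bulk insertion does not get faster when the batch size is increased to 5000 or above; those runs are in fact slower on average than the runs with a batch size of 1000. The execution times below are for batch sizes 100, 1000, 5000 and 10000, each iteration inserting 1,000,000 documents.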

      batch size 100
      Iteration 0 - Elapsed: 10.418 seconds.
      Iteration 1 - Elapsed: 10.09 seconds.
      Iteration 2 - Elapsed: 10.385 seconds.
      Iteration 3 - Elapsed: 9.806 seconds.
      Iteration 4 - Elapsed: 9.979 seconds.
      Avg: 10.1356 seconds.

      batch size 1000
      Iteration 0 - Elapsed: 6.99 seconds.
      Iteration 1 - Elapsed: 6.41 seconds.
      Iteration 2 - Elapsed: 6.654 seconds.
      Iteration 3 - Elapsed: 6.845 seconds.
      Iteration 4 - Elapsed: 6.736 seconds.
      Avg: 6.726999999999999 seconds.

      batch size 5000
      Iteration 0 - Elapsed: 7.536 seconds.
      Iteration 1 - Elapsed: 7.891 seconds.
      Iteration 2 - Elapsed: 7.83 seconds.
      Iteration 3 - Elapsed: 7.951 seconds.
      Iteration 4 - Elapsed: 7.939 seconds.
      Avg: 7.8294 seconds.

      batch size 10000
      Iteration 0 - Elapsed: 7.198 seconds.
      Iteration 1 - Elapsed: 6.967 seconds.
      Iteration 2 - Elapsed: 8.134 seconds.
      Iteration 3 - Elapsed: 8.092 seconds.
      Iteration 4 - Elapsed: 8.083 seconds.
      Avg: 7.694799999999999 seconds.
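
To compare the runs more directly, the reported averages can be converted into documents per second and into a percentage change relative to the batch-size-1000 run. The sketch below only does arithmetic on the figures listed above (1,000,000 rows per iteration); the class and method names are hypothetical, and the averages are hard-coded from the reported output rather than measured again.

```java
/** Hypothetical summary of the reported timings; arithmetic only, no driver calls. */
final class ThroughputSummary {
    static final int ROWS = 1_000_000;

    /** Documents inserted per second for one run. */
    static double docsPerSecond(int rows, double seconds) {
        return rows / seconds;
    }

    /** Percentage change of elapsed time relative to a baseline (positive means slower). */
    static double percentSlower(double baselineSeconds, double otherSeconds) {
        return (otherSeconds - baselineSeconds) / baselineSeconds * 100.0;
    }

    public static void main(String[] args) {
        int[] batchSizes = {100, 1000, 5000, 10000};
        // Averages copied from the reported output, rounded to four decimals.
        double[] avgSeconds = {10.1356, 6.7270, 7.8294, 7.6948};
        double baseline = avgSeconds[1]; // batch size 1000, the fastest reported run

        for (int i = 0; i < batchSizes.length; i++) {
            System.out.printf("batch %5d: %.0f docs/s, %+.1f%% time vs batch 1000%n",
                    batchSizes[i],
                    docsPerSecond(ROWS, avgSeconds[i]),
                    percentSlower(baseline, avgSeconds[i]));
        }
    }
}
```

On these numbers, batch sizes 5000 and 10000 take roughly 16% and 14% longer than batch size 1000, while batch size 100 takes about 51% longer.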

          People

            Assignee: Ross Lawley (ross@mongodb.com)
            Reporter: Aziz Zitouni (azitouni@magnitude.com)
            Votes: 0
            Watchers: 2
