MongoDB Database Tools / TOOLS-2010

mongoimport may report incorrect number of imported documents

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor - P4
    • None
    • Affects Version/s: None
    • Component/s: mongoimport
    • Labels: None
    • 3

      I think this needs to be revisited.

      The following applies to both MongoDB 3.4.14 and 3.6.2.

      Given a simple collection to import like the following in a file called test.json:

      { "_id": "1", "a": 1 }
      { "_id": "1", "a": 2 }
      { "_id": "1", "a": 3 }
      

      We get the proper output of:

      mongoimport --drop -d test -c test test.json
      2018-04-06T10:35:19.008-0400	connected to: localhost
      2018-04-06T10:35:19.008-0400	dropping: test.test
      2018-04-06T10:35:19.087-0400	num failures: 2
      2018-04-06T10:35:19.087-0400	error inserting documents: E11000 duplicate key error collection: test.test index: _id_ dup key: { : "11" }
      2018-04-06T10:35:19.087-0400	imported 1 document
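
The expected numbers for a run like this can be worked out independently of mongoimport by counting distinct `_id` values in the input file. The following is a minimal sketch, not part of the tool: `summarize_ids` is a hypothetical helper, and it assumes newline-delimited JSON where every document has an explicit, hashable `_id` (string or number).

```python
import json
from collections import Counter


def summarize_ids(lines):
    """Return (distinct _id count, duplicate count) for newline-delimited JSON."""
    ids = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        # Assumption: every document carries an explicit, hashable _id.
        ids[doc["_id"]] += 1
    distinct = len(ids)
    duplicates = sum(ids.values()) - distinct
    return distinct, duplicates


# The three documents from test.json above.
sample = [
    '{ "_id": "1", "a": 1 }',
    '{ "_id": "1", "a": 2 }',
    '{ "_id": "1", "a": 3 }',
]
distinct, duplicates = summarize_ids(sample)
print(f"expected imported: {distinct}, expected failures: {duplicates}")
# prints: expected imported: 1, expected failures: 2
```

For a real file, the same function can be fed an open file handle, for example `summarize_ids(open("test.json"))`.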
      

      However, importing a much larger dataset with only 3 duplicates (zips.json, attached to this ticket) reports 29470 imported documents, while the collection actually contains 29467:

      mongoimport --drop -d test -c zips zips.json
      2018-04-06T10:37:26.801-0400	connected to: localhost
      2018-04-06T10:37:26.801-0400	dropping: test.zips
      2018-04-06T10:37:26.929-0400	error inserting documents: E11000 duplicate key error collection: test.zips index: _id_ dup key: { : "32350" }
      2018-04-06T10:37:27.004-0400	error inserting documents: E11000 duplicate key error collection: test.zips index: _id_ dup key: { : "63673" }
      2018-04-06T10:37:27.086-0400	error inserting documents: E11000 duplicate key error collection: test.zips index: _id_ dup key: { : "42223" }
      2018-04-06T10:37:27.125-0400	imported 29470 documents
      
      mongo test --eval 'db.zips.count()'
      MongoDB shell version v3.6.2
      connecting to: mongodb://127.0.0.1:27017/test
      MongoDB server version: 3.6.2
      29467
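
The mismatch can also be checked mechanically by pulling the reported count out of the mongoimport output and comparing it with the count returned by the shell. This is a rough sketch under stated assumptions: `reported_count` and `discrepancy` are hypothetical helpers, and the regular expression matches the `imported N document(s)` wording seen in the logs in this ticket, which other mongoimport versions may phrase differently.

```python
import re

# Matches "imported 1 document" and "imported 29470 documents".
IMPORTED_RE = re.compile(r"imported (\d+) documents?")


def reported_count(log_text):
    """Return the count from the 'imported N documents' line, or None."""
    match = IMPORTED_RE.search(log_text)
    return int(match.group(1)) if match else None


def discrepancy(log_text, actual_count):
    """Reported count minus the count actually present in the collection."""
    reported = reported_count(log_text)
    if reported is None:
        raise ValueError("no 'imported N documents' line found")
    return reported - actual_count


# Final log line and db.zips.count() result from the run above.
log = "2018-04-06T10:37:27.125-0400\timported 29470 documents"
print(discrepancy(log, 29467))  # prints: 3
```

A non-zero result equal to the number of duplicate key errors, as here, suggests the rejected documents are being included in the reported total.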
      

      Additional testing with data generated by mgeneratejs shows a much larger discrepancy:

      mgeneratejs '{"_id": "$age", "someKey": 3}' -n 1000 > ages.1000.json 
      mgeneratejs '{"_id": "$age", "someKey": 3}' -n 2000 > ages.2000.json 
      

      Importing ages.1000.json imports only 48 documents.
      Importing ages.2000.json reports 1000 imported documents.

      In both cases the resulting collection contains only 48 documents.

            Assignee: Unassigned
            Reporter: Nathan Leniz (nathan.leniz@mongodb.com)
            Votes: 0
            Watchers: 1
