MongoDB Database Tools / TOOLS-2971

mongoimport 100.5.0 on macOS fails to skip 'invalid' documents on import; stop-on-error behaviour appears to be the default and cannot be unset?

    • Type: Investigation
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: 100.5.0
    • Component/s: mongoimport

      Problem Statement/Rationale

      Using MongoDB 4.4.9 Community with mongoimport 100.5.0; the same behaviour occurs with MongoDB 5.0.3 Community.

      I am importing GeoJSON data into a collection that has a 2dsphere index defined upfront.

      Defining the index before the import prevents 'invalid' GeoJSON documents from being ingested into the collection.
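
      As a rough illustration of "defining the index upfront", the sketch below builds the shell expression that creates a 2dsphere index. The helper name geo_index_js and the field name geometry are assumptions for illustration; the report does not say which field holds the GeoJSON geometry.

```shell
#!/bin/sh
# Hypothetical helper: print the expression that creates a 2dsphere index.
# $1: collection name
# $2: field holding the GeoJSON geometry (assumed; not stated in the report)
geo_index_js() {
    printf 'db.getCollection("%s").createIndex({ "%s": "2dsphere" })\n' "$1" "$2"
}

# To apply it before importing, pass the expression to mongosh, for example:
#   mongosh "mongodb://127.0.0.1:27017/test" --eval "$(geo_index_js polygons geometry)"
geo_index_js polygons geometry
```

      With the index in place, the server rejects inserts whose indexed field cannot be parsed as GeoJSON, which is the filtering effect described above.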

      This works with earlier versions of mongoimport but fails with version 100.5.0.

      With 100.5.0 the import runs only up to the first batch that contains errors (valid JSON but invalid GeoJSON according to MongoDB) and then stops.

      Previous versions reported the 'invalid' GeoJSON documents but continued.

      It looks like stop-on-error is the default behaviour and cannot be unset?

      (I am not using the stopOnError parameter, by the way; it would be nice to be able to pass it with a false value here.)

      Steps to Reproduce

      Run the command below with mongoimport 100.5.0 and, for comparison, with an earlier version (r4.0.20 in my case):

      mongoimport --uri mongodb://127.0.0.1:27017/test --collection polygons --type json --file antarctica-latest-polygons.seq.osm.json

      Expected Results

      All 'valid' documents are imported, with the 2dsphere index keeping documents that cannot be indexed out of the collection.

      Actual Results

      With 100.5.0 the import fails after the first batch that contains 'invalid' documents:

      2021-10-18T14:15:38.234+0200 942 document(s) imported successfully. 58 document(s) failed to import.

      mongoimport --version
      mongoimport version: 100.5.0
      git version: 460c7e26f65c4ce86a0b99c46a559dccaba3a07d
      Go version: go1.16.3
      os: darwin
      arch: amd64
      compiler: gc
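
      The two counts in the 100.5.0 summary add up to exactly 1000, which would be consistent with the import stopping after a single batch, although the report does not state what batch size was in effect. A minimal sketch for extracting the counts from such a summary message (timestamp omitted here) follows; parse_summary is a hypothetical helper, and the message format is taken only from the log line above.

```shell
#!/bin/sh
# Hypothetical helper: extract "<imported> <failed>" from a mongoimport 100.x summary line.
# Lines that do not contain both counts produce no output.
parse_summary() {
    printf '%s\n' "$1" | awk '{
        ok = ""; bad = ""
        for (i = 2; i < NF; i++) {
            if ($i == "document(s)" && $(i + 1) == "imported") ok = $(i - 1)
            if ($i == "document(s)" && $(i + 1) == "failed") bad = $(i - 1)
        }
        if (ok != "" && bad != "") print ok, bad
    }'
}

line='942 document(s) imported successfully. 58 document(s) failed to import.'
set -- $(parse_summary "$line")
imported=$1
failed=$2
attempted=$((imported + failed))
echo "imported=$imported failed=$failed attempted=$attempted"
# prints: imported=942 failed=58 attempted=1000
```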

      Succeeds with earlier versions:

      2021-10-18T14:29:43.945+0200 imported 67616 documents

      mongoimport version: r4.0.20
      git version: e2416422da84a0b63cde2397d60b521758b56d1b
      Go version: go1.11.13
      os: darwin
      arch: amd64
      compiler: gc

      Additional Notes

      Note that the number reported as imported does not match the number of documents in the database: the input file has 67617 documents, mongoimport (the earlier version) reports 67616 imported, and the collection actually contains 67182 documents.
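
      To make the size of that mismatch explicit, the arithmetic can be written out as below. The three input numbers are the ones quoted in this report; the variable names are only illustrative.

```shell
#!/bin/sh
input_docs=67617      # documents in the input file
reported=67616        # "imported 67616 documents" printed by the earlier mongoimport
in_collection=67182   # documents actually present in the collection afterwards

not_reported=$((input_docs - reported))
reported_but_absent=$((reported - in_collection))
missing_total=$((input_docs - in_collection))

echo "in file but not reported as imported: $not_reported"
echo "reported as imported but not in collection: $reported_but_absent"
echo "in file but not in collection: $missing_total"
# prints 1, 434 and 435 respectively
```

      In other words, 434 documents were counted as imported by the earlier version even though they never reached the collection.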

      Another note: repeating the failing import with 100.5.0 reports a different number of imported documents on different runs, sometimes 942, sometimes 943 or 941.

       

            Assignee: Tim Fogarty (tim.fogarty@mongodb.com)
            Reporter: Emil Zegers (emil.zegers@mongodb.com, inactive)