  1. MongoDB Database Tools
  2. TOOLS-1368

mongoimport of TSV/CSV file with Byte Order Mark causes incorrect field name


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.2.8
    • Fix Version/s: None
    • Component/s: mongoimport
    • Labels: None
    • Environment: Customer found this on Windows. I reproduced it on OS X.

    Description

      Create a file in CSV or TSV format with a header line and a UTF-8 Byte Order Mark (0xEF 0xBB 0xBF) at the beginning; an example file is attached. Then run mongoimport --type tsv --headerline on the file to import it into a collection.
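
      For readers without the attachment, the reproduction input can be sketched in Python. This is only an illustration: the sample rows are taken from the session in this report, the helper names make_bom_tsv and header_fields are hypothetical, and the header parsing here is a naive stand-in that does not model mongoimport's actual parser.

```python
import codecs
import csv
import io

# Sample content modeled on the record shown in this report.
SAMPLE_ROWS = "ID\tAGENCY_TEXT\nC\tUS AIR FORCE\n"


def make_bom_tsv():
    """Return TSV bytes prefixed with the UTF-8 BOM (0xEF 0xBB 0xBF)."""
    return codecs.BOM_UTF8 + SAMPLE_ROWS.encode("utf-8")


def header_fields(data):
    """Decode as plain UTF-8 (which keeps the BOM) and return the header row."""
    text = data.decode("utf-8")  # the BOM survives as the character U+FEFF
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return next(reader)


if __name__ == "__main__":
    fields = header_fields(make_bom_tsv())
    # Escape non-ASCII characters so the otherwise invisible BOM shows up.
    print([f.encode("unicode_escape").decode("ascii") for f in fields])
```

      Writing the bytes from make_bom_tsv() to a file gives an input that should exercise the same code path as the attached file, assuming the attachment has the same layout.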

      The first field's name in the database now has the BOM character (U+FEFF) in front of it. The character is invisible during normal shell work, but it is there and prevents queries on that field from working. Here's a terminal session showing the issue.

      mongoimport --db test --collection foo --file jets_agency_systems.html --type tsv --headerline
      mongo
      MongoDB shell version: 3.2.8
      connecting to: test
      > db.foo.findOne()
      {
      	"_id" : ObjectId("57ab94a12486607096162845"),
      	"ID" : "C",
      	"AGENCY_TEXT" : "US AIR FORCE"
      }
      > db.foo.find({ID:'C'})
      > db.foo.find({"\ufeffID":'C'})
      { "_id" : ObjectId("57ab94a12486607096162845"), "ID" : "C", "AGENCY_TEXT" : "US AIR FORCE" }
      >
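
      Until the importer handles the BOM itself, one possible workaround is to strip it from the file before running mongoimport. The following is a minimal sketch under that assumption, not the fix applied in the tool; strip_utf8_bom and strip_bom_from_file are hypothetical helper names.

```python
import codecs


def strip_utf8_bom(data):
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_utf8_bom(data))
```

      The cleaned copy can then be passed to mongoimport --type tsv --headerline in place of the original file, so the first header field is stored as plain ID.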
      

      Attachments

        1. jets_agency_systems.html
          0.9 kB
        2. working-issue.csv
          0.0 kB

            People

              gabriel.russell@mongodb.com Gabriel Russell (Inactive)
              spencer.brown@mongodb.com Spencer Brown
              Votes: 0
              Watchers: 2
