  1. MongoDB Database Tools
  2. TOOLS-1368

mongoimport of TSV/CSV file with Byte Order Mark causes incorrect field name

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 3.2.8
    • Component/s: mongoimport
    • Environment:
      Customer found this on Windows. I reproduced it on OS X.

      Create a file in CSV or TSV format with a header line and a UTF-8 Byte Order Mark (bytes 0xEF 0xBB 0xBF) at the beginning. See the attached file for an example. Then run mongoimport --type tsv --headerline to import the file into a collection.
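For anyone without the attachment, a file of this shape can be produced with a few lines of Python. This is only a sketch of the reproduction step: the function name `write_bom_tsv` and the output file name `bom_repro.tsv` are made up for illustration, and the single data row is copied from the terminal session in this report.

```python
import codecs
import os
import tempfile


def write_bom_tsv(path, header, rows):
    """Write a TSV file whose first three bytes are the UTF-8 BOM (EF BB BF)."""
    lines = ["\t".join(header)] + ["\t".join(row) for row in rows]
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8)  # b"\xef\xbb\xbf"
        f.write(("\n".join(lines) + "\n").encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical output location; any writable path works.
    path = os.path.join(tempfile.gettempdir(), "bom_repro.tsv")
    write_bom_tsv(path, ["ID", "AGENCY_TEXT"], [["C", "US AIR FORCE"]])
    print(path)
```

Importing the resulting file with the mongoimport options given above should be enough to trigger the behavior described in this report.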

      Note that the first field's name in the database has the BOM character (U+FEFF) in front of it. It is invisible during normal shell work, but it is there and prevents queries on the plain field name from matching. Here's a terminal session showing the issue.

      mongoimport --db test --collection foo --file jets_agency_systems.html --type tsv --headerline
      mongo
      MongoDB shell version: 3.2.8
      connecting to: test
      > db.foo.findOne()
      {
      	"_id" : ObjectId("57ab94a12486607096162845"),
      	"ID" : "C",
      	"AGENCY_TEXT" : "US AIR FORCE"
      }
      > db.foo.find({ID:'C'})
      > db.foo.find({"\ufeffID":'C'})
      { "_id" : ObjectId("57ab94a12486607096162845"), "ID" : "C", "AGENCY_TEXT" : "US AIR FORCE" }
      >
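Until the importer handles this itself, one possible workaround is to strip the BOM from the file before running mongoimport. The following is a minimal sketch, not part of the tools: `strip_utf8_bom` is a hypothetical helper name, and it assumes the file is small enough to read into memory.

```python
import codecs


def strip_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM if present.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        # Remove only the three leading bytes EF BB BF.
        data = data[len(codecs.BOM_UTF8):]
    with open(dst_path, "wb") as dst:
        dst.write(data)
    return had_bom
```

Running mongoimport on the cleaned copy should then produce a first field named plain ID, so that a query such as db.foo.find({ID:'C'}) matches.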
      

        1. working-issue.csv
          0.0 kB
        2. jets_agency_systems.html
          0.9 kB

            Assignee:
            gabriel.russell@mongodb.com Gabriel Russell (Inactive)
            Reporter:
            spencer.brown@mongodb.com Spencer Brown
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: