[SERVER-682] mongoimport in csv mode/type seems to have problems with whitespaces in column values Created: 26/Feb/10  Updated: 12/Jul/16  Resolved: 27/Feb/10

Status: Closed
Project: Core Server
Component/s: Tools
Affects Version/s: 1.2.1
Fix Version/s: 1.3.3

Type: Bug Priority: Major - P3
Reporter: Raphael Stolt Assignee: Eliot Horowitz (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Mac OS X 10.6.2, MongoDB installed via MacPorts


Attachments: File recordshelf-genres-jazzed.csv     File recordshelf-genres.csv    
Participants:

 Description   

When a CSV file (att: recordshelf-genres.csv) is imported via 'mongoimport -d recordshelf -c genres --file recordshelf-genres.csv --drop -f name --type csv --headerline --ignoreBlanks', the imported collection is incorrect, possibly due to whitespace in the column value 'Drum and Bass'. The following snippet from the mongo shell shows the corrupted data.

db.genres.find()

{ "_id" : ObjectId("4b886702982b57076c82e282"), "name" : "Drum and Bass" }
{ "_id" : ObjectId("4b886702982b57076c82e283"), "name" : "Rap", "field1" : "and Bass\"" }
{ "_id" : ObjectId("4b886702982b57076c82e284"), "name" : "House", "field1" : "d Bass\"" }
{ "_id" : ObjectId("4b886702982b57076c82e285"), "name" : "Reggae", "field1" : "Bass\"" }
{ "_id" : ObjectId("4b886702982b57076c82e286"), "name" : "Dubstep", "field1" : "Bass\"" }

When the column value 'Drum and Bass' is replaced with e.g. 'Jazz' (att: recordshelf-genres-jazzed.csv), the CSV mongoimport (same call as above, with the modified import file) works as expected, as shown in the next mongo shell excerpt.

db.genres.find()

{ "_id" : ObjectId("4b886860982b57076c82e287"), "name" : "Jazz" }
{ "_id" : ObjectId("4b886860982b57076c82e288"), "name" : "Rap" }
{ "_id" : ObjectId("4b886860982b57076c82e289"), "name" : "House" }
{ "_id" : ObjectId("4b886860982b57076c82e28a"), "name" : "Reggae" }
{ "_id" : ObjectId("4b886860982b57076c82e28b"), "name" : "Dubstep" }

Looks like a bug to me, but maybe I'm just too tired to see a trivial solution for this issue, or I'm invoking mongoimport incorrectly. If that's the case, sorry for opening this issue ;D



 Comments   
Comment by Eliot Horowitz (Inactive) [ 27/Feb/10 ]

This was a problem with parsing the double quote (") when the quoted value was the last (or only) field on the line.
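
To illustrate the failure mode described above, here is a minimal Python sketch of a CSV line splitter that handles a double-quoted value even when it is the last (or only) field. It is not the mongoimport code (the actual fix is in the linked commit); the function name split_csv_line is hypothetical, and the input line "Drum and Bass" with surrounding quotes is an assumption about the attached file, inferred from the trailing \" in the corrupted output.

```python
def split_csv_line(line):
    """Split one CSV line into fields, honoring double-quoted values.

    Illustrative sketch only; this is not how mongoimport is implemented.
    """
    line = line.rstrip("\r\n")
    fields = []
    current = []          # characters of the field being built
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                # A doubled quote inside a quoted field is a literal quote.
                if i + 1 < len(line) and line[i + 1] == '"':
                    current.append('"')
                    i += 1
                else:
                    in_quotes = False   # closing quote
            else:
                current.append(ch)      # commas and spaces are kept as data
        elif ch == '"':
            in_quotes = True            # opening quote
        elif ch == ",":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    # Always flush the final field, so a quoted value at the end of the
    # line is closed and emitted instead of leaking into later rows.
    fields.append("".join(current))
    return fields


print(split_csv_line('"Drum and Bass"'))   # ['Drum and Bass']
print(split_csv_line('Rap'))               # ['Rap']
```

The point of the sketch is the final flush: the field is completed per line, so no leftover characters such as and Bass" can carry over into the following records.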

Comment by auto [ 27/Feb/10 ]

Author: {'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: fix for csv import where last field has " SERVER-682
http://github.com/mongodb/mongo/commit/00ca5d90fe11b3e3fa6c3651542113736adddcf1

Generated at Thu Feb 08 02:54:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.