[SERVER-8133] mongoimport TSV,CSV and JSON always generate duplicate entries for all but the first record when importing Created: 10/Jan/13  Updated: 15/Feb/13  Resolved: 10/Jan/13

Status: Closed
Project: Core Server
Component/s: Tools
Affects Version/s: 2.2.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jeff Rule Assignee: Unassigned
Resolution: Done Votes: 0
Labels: commands, insert
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Verified on OS X 10.8.2, CentOS 5.5, and the Amazon image from 10gen upgraded to MongoDB 2.2.2.


Attachments: File huh.csv     File huh.json     File huh.tsv     Text File mongo-error.log    
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

1) Create a file huh.tsv (see attachment).

2) Execute the following load:

$ mongoimport -d local --type tsv --headerline \
--stopOnError --drop -c huh -file ./huh.tsv -v
Thu Jan 10 01:16:25 creating new connection to:127.0.0.1:27017
Thu Jan 10 01:16:25 BackgroundJob starting: ConnectBG
Thu Jan 10 01:16:25 connected connection!
connected to: 127.0.0.1
Thu Jan 10 01:16:25 ns: local.huh
Thu Jan 10 01:16:25 dropping: local.huh
Thu Jan 10 01:16:25 filesize: 13
Thu Jan 10 01:16:25 got line:myId
Thu Jan 10 01:16:25 got line:1
Thu Jan 10 01:16:25 got line:2
Thu Jan 10 01:16:25 got line:3
Thu Jan 10 01:16:25 got line:4
Thu Jan 10 01:16:25 imported 4 objects

3) Open the database with mongo local and query the collection. It should return 5 rows, but we only get the following:

$ mongo local
MongoDB shell version: 2.2.2
connecting to: local
> db.huh.find()

{ "myId" : 1 }

>

I have attached the log file captured with verbose logging turned on. The duplicate messages only appear when verbose logging is enabled. I suspect you might also see them if you used --dbpath to access the data files directly; other bug reports mention that running the tool that way produces more output than in client/server mode.

Participants:

 Description   

When importing a simple file with 5 unique rows into a new collection using mongoimport (see Steps To Reproduce for examples), the import tool reports that all went well, but only the first row is saved. I was able to reproduce this behavior for TSV, CSV and JSON files.

Starting up the MongoDB Amazon instance from the Amazon Marketplace and upgrading to MongoDB 2.2.2 via yum would be the surest way to reproduce what I was seeing. To be specific, I was using the public Amazon Instance ID ami-62da7c0b as a starting point.



 Comments   
Comment by Eliot Horowitz (Inactive) [ 10/Jan/13 ]

The "local" database is special and is not intended for normal user use.
This will work against any other database.
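
Following the comment above, here is a minimal sketch of the same import pointed at a database other than "local". The database name "test" is an assumption chosen for illustration, and the file contents are reconstructed from the verbose log in the reproduction steps (a "myId" header followed by the values 1 to 4, 13 bytes in total), not copied from the attachment itself. The import only runs if mongoimport is on the PATH, and it needs a mongod listening on the default port.

```shell
#!/bin/sh
# Recreate the input as inferred from the verbose log:
# one header line ("myId") plus four values, 13 bytes in total.
printf 'myId\n1\n2\n3\n4\n' > huh.tsv
wc -c < huh.tsv

# Assumption: "test" stands in for any database other than "local".
# Same options as in the report, with the long form --file.
if command -v mongoimport >/dev/null 2>&1; then
  mongoimport -d test --type tsv --headerline \
    --stopOnError --drop -c huh --file ./huh.tsv \
    || echo "import failed (is mongod running on 127.0.0.1:27017?)"
else
  echo "mongoimport not found; skipping import"
fi
```

Going by the comment, running db.huh.find() against that database afterwards should list every imported document rather than only the first.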

Generated at Thu Feb 08 03:16:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.