  1. MongoDB Tools
  2. TOOLS-67

Give mongoimport of CSV/TSVs a way to specify import types

    Details

      Description

      When using mongoimport to import a CSV, the field "0123" is always imported as the number 123; there is no way to get mongoimport to treat it as a String.

      It could be nice to have an option to specify what type each field should be interpreted as. This could also be useful if we start supporting importing types beyond just strings and numbers (dates, for example).

          Activity

          scotthernandez Scott Hernandez added a comment -

          This sounds like a great place to write your own import script (like with python/ruby/perl/etc) instead of the tool doing it.

          alangrus Alan Gruskoff added a comment -

          Same for me.
          Picking up the leading zero in Zip Codes is often a problem. In trying to get mongoimport to ingest a CSV file of Zip Codes, I found that the -type csv mode does not respect double quotes around a string like "01001", instead posting 1001 as a number.

          CSV Import of: "01001","Agawam","MA"
          results in this:

          { "_id" : 1001, "city" : "Agawam", "state" : "MA" }

          We need a way to specify an imported field is String or Decimal with number of Decimal places. Perhaps a Field : Type map the user could specify.
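
One way to picture such a Field : Type map is the sketch below. It is only an illustration in Python, not something mongoimport provides: the `name:type` spec format, the type names (`string`, `int`, `decimal(N)`), and the helpers `make_converter`, `parse_type_map`, and `convert_row` are all assumptions invented for this example.

```python
from decimal import Decimal


def make_converter(type_name):
    """Return a function that converts a CSV string to the requested type.

    The type names here are hypothetical, chosen only for this sketch.
    """
    if type_name == "string":
        return str
    if type_name == "int":
        return int
    if type_name.startswith("decimal(") and type_name.endswith(")"):
        places = int(type_name[len("decimal("):-1])
        quantum = Decimal(10) ** -places  # places=2 gives Decimal("0.01")
        return lambda s: Decimal(s).quantize(quantum)
    raise ValueError(f"unknown type: {type_name}")


def parse_type_map(spec):
    """Parse a spec such as '_id:string,pop:int' into {field: converter}."""
    converters = {}
    for item in spec.split(","):
        name, type_name = item.split(":", 1)
        converters[name.strip()] = make_converter(type_name.strip())
    return converters


def convert_row(row, converters):
    """Apply the converters to one row; unmapped fields stay strings."""
    return {key: converters.get(key, str)(value) for key, value in row.items()}
```

With a map of `_id:string`, a value of "01001" would stay the string "01001" instead of becoming the number 1001.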

          grahamhar Graham Hargreaves added a comment -

          Same here. We have seen an issue where, if the field in the CSV is null, the type gets set to 6, which is not even defined in the docs, e.g.:

          > db.messagingProfile.findOne({inDate: {$type: 18}, inDate: null}, {inDate: 1})
          { "_id" : ObjectId("4edc1d3ac3e40219507a5ec2"), "inDate" : null }

          > db.messagingProfile.findOne({inDate: {$type: 6}, inDate: null}, {inDate: 1})
          { "_id" : ObjectId("4edc1d3ac3e40219507a5ec2"), "inDate" : null }

          As far as I can see, type 6 isn't defined, so I am going to raise a separate case.

          pawel.krupinski Pawel Krupinski added a comment -

          Same here.
          I needed to update some rows with a new property, but since my ids were strings Mongo created new records instead of updating them.
          CSV is a format common enough so that mongo should allow easy import from it.

          dcegielski Deyna Cegielski added a comment -

          Is there any work around for this?

          I've tried wrapping numerical string values in quotes and escaping them but they just end up as a string containing the quotes e.g. ""010"".

          spencer Spencer Brody added a comment -

          If you need different behavior around the handling of types than mongoimport provides, the best workaround is to write a script to do the import yourself. CSV is pretty straightforward to parse and there are many CSV parsing implementations available online. mongoimport is only designed to be used in the very simple, straightforward cases where no special handling of types is required.
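
A minimal sketch of that workaround, assuming Python and its standard `csv` module: the parser returns every value as a string, so leading zeros survive unless a field is converted explicitly. The function name `read_documents`, the `int_fields` parameter, and the field names are assumptions for illustration; the sample row is the one quoted earlier in this thread. Sending the resulting dicts to the server (for instance with a driver call such as pymongo's `insert_many`) is left out so the sketch runs without a database.

```python
import csv
import io


def read_documents(csv_file, fieldnames, int_fields=()):
    """Yield one dict per CSV row.

    Every value stays a string unless its field is listed in int_fields.
    """
    reader = csv.DictReader(csv_file, fieldnames=fieldnames)
    for row in reader:
        doc = dict(row)
        for name in int_fields:
            doc[name] = int(doc[name])  # explicit, opt-in conversion
        yield doc


# An in-memory stand-in for a headerless CSV file.
sample = io.StringIO('"01001","Agawam","MA"\n')
docs = list(read_documents(sample, ["_id", "city", "state"]))
print(docs[0])  # {'_id': '01001', 'city': 'Agawam', 'state': 'MA'}
```

Keeping conversion opt-in means a Zip Code column is never silently turned into a number.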

          rmarscher Rob Marscher added a comment -

          Sorry to comment on an old issue, but I agree that a CSV value of "001" should be imported as "001" and not converted to an integer with a value of 1. Feels like a bug to me and not a case of needing "special handling" of the type. Thanks.

          mark@igeno.com Mark Clancy added a comment -

          +1. Agree that the import doesn't need to be all-singing/dancing; however, the basics should be there — especially this change and proper mapping of date fields on import.

          freemarmoset Joseph E Banks added a comment - - edited

          +1 Seems like a pretty common use case for folks exporting from relational stores and importing to mongo. Surrounding something in quotes should indicate it's a string in a csv. Don't see that as a special case.


            People

            • Assignee: Unassigned
            • Reporter: Spencer Brody
            • Votes: 18
            • Watchers: 14