Core Server / SERVER-3731

Give mongoimport of CSV/TSVs a way to specify import types

    Details

    • Backport: No
    • # Replies: 7
    • Last comment by Customer: true

      Description

      When using mongoimport to import a CSV, the field "0123" will get imported as the number 123 no matter what. There is no way to get mongoimport to treat it as a String.

      It could be nice to have an option to specify what type each field should be interpreted as. This could also be useful if we start supporting importing types beyond just strings and numbers (dates, for example).

          Activity

          Scott Hernandez
          added a comment -

          This sounds like a great place to write your own import script (like with python/ruby/perl/etc) instead of the tool doing it.

          Alan Gruskoff
          added a comment -

          Same for me.
          Picking up the leading zero in Zip Codes is often a problem. In
          trying to get mongoimport to ingest a CSV file of Zip Codes, I found
          that the --type csv mode does not respect double quotes around a string
          like "01001", instead storing 1001 as a number.

          CSV Import of: "01001","Agawam","MA"
          results in this:

          { "_id" : 1001, "city" : "Agawam", "state" : "MA" }

          We need a way to specify that an imported field is a String, or a Decimal with a given number of decimal places. Perhaps a Field : Type map the user could specify.
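
The Field : Type map idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything mongoimport provides: `FIELD_TYPES` and `rows_to_documents` are hypothetical names, the field list is taken from the Zip Code example in this comment, and actually inserting the resulting documents (for example through a driver) is left out.

```python
import csv
import io

# Hypothetical field -> converter map; every name here is illustrative.
# A converter such as decimal.Decimal or int could be used for numeric fields.
FIELD_TYPES = {"_id": str, "city": str, "state": str}


def rows_to_documents(csv_text, field_names, field_types):
    """Parse header-less CSV text and convert each value per the type map."""
    reader = csv.reader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        doc = {}
        for name, raw in zip(field_names, row):
            # Fall back to str so unknown fields are never coerced to numbers.
            convert = field_types.get(name, str)
            doc[name] = convert(raw)
        docs.append(doc)
    return docs


docs = rows_to_documents(
    '"01001","Agawam","MA"\n', ["_id", "city", "state"], FIELD_TYPES
)
print(docs)
```

With the map above, the Zip Code keeps its leading zero because the `_id` field is explicitly converted with `str`.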

          Graham Hargreaves
          added a comment -

          Same here. We have seen an issue where, if the field in the CSV is null, the type gets set to 6, which is not even defined in the docs, e.g.:

          > db.messagingProfile.findOne({inDate: {$type: 18}, inDate: null}, {inDate: 1})

          { "_id" : ObjectId("4edc1d3ac3e40219507a5ec2"), "inDate" : null }

          > db.messagingProfile.findOne({inDate: {$type: 6}, inDate: null}, {inDate: 1})

          { "_id" : ObjectId("4edc1d3ac3e40219507a5ec2"), "inDate" : null }

          As type 6 isn't defined, from what I can see, I am going to raise a separate case.

          Pawel Krupinski
          added a comment -

          Same here.
          I needed to update some rows with a new property, but since my ids were strings, Mongo created new records instead of updating them.
          CSV is a common enough format that mongo should allow easy import from it.

          Deyna Cegielski
          added a comment -

          Is there any workaround for this?

          I've tried wrapping numerical string values in quotes and escaping them but they just end up as a string containing the quotes e.g. ""010"".

          Spencer Brody
          added a comment -

          If you need different behavior around the handling of types than mongoimport provides, the best workaround is to write a script to do the import yourself. CSV is pretty straightforward to parse and there are many CSV parsing implementations available online. mongoimport is only designed to be used in the very simple, straightforward cases where no special handling of types is required.
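
One possible shape for such a script, as a rough sketch only: parse the CSV yourself and emit one JSON document per line, so that string values stay quoted strings. The helper name `csv_to_json_lines`, the `numeric_fields` parameter, and the sample data are all made up for illustration. Feeding the output to mongoimport's JSON mode, or inserting through a driver instead, is an assumption about how you would use it.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text, numeric_fields=()):
    """Turn CSV text with a header row into a list of JSON document strings.

    Every value stays a string unless its column is named in numeric_fields.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        doc = {}
        for key, value in row.items():
            # Only explicitly listed columns are converted to integers.
            doc[key] = int(value) if key in numeric_fields else value
        lines.append(json.dumps(doc))
    return lines


# Made-up sample data: "code" must keep its leading zero, "count" is numeric.
sample = 'code,name,count\n"010",example,3\n'
lines = csv_to_json_lines(sample, numeric_fields={"count"})
print("\n".join(lines))
```

Because type conversion is opt-in per column, a value such as "010" is written out as the JSON string "010" rather than the number 10.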

          Rob Marscher
          added a comment -

          Sorry to comment on an old issue, but I agree that a CSV value of "001" should be imported as "001" and not converted to an integer with a value of 1. Feels like a bug to me and not a case of needing "special handling" of the type. Thanks.


            People

            • Votes: 13
            • Watchers: 11

              Dates

              • Created:
                Updated:
                Days since reply:
                26 weeks, 3 days ago
                Date of 1st Reply: