Type: Task
Resolution: Done
Priority: Major - P3
Affects Version/s: None
Component/s: None
Labels:
Re. TOOLS-67: mongoimport csv/tsv with specified types
For 3.4.0 release.
Most of the documentation that needs to be added concerns the last section of the following changelog (the typed field input format):
overview of changes:
- two new command line options:
  - 'columnsHaveTypes', boolean whether the field types are to be specified;
  - 'parseGrace', how to handle type coercion failures (one of skipField,
    skipRow, stop, autoCast). The skipRow option sends failed rows to
    standard output.
- changed behavior:
  - if 'columnsHaveTypes' is set, then wherever column names are specified,
    the types will also be specified. If 'headerLine' is set, this refers
    to the first line of the input source. If 'fieldFile' is set, this
    refers to the file with the field names. If neither of these two
    options is set, this refers to the string given in the 'fields'
    option.
  - (behavior is unchanged if 'columnsHaveTypes' is not set)
- new typed field input format (from wherever column names are specified):
  '<COLNAME>.<TYPE>(<ARG>)'
  - <TYPE> is one of: auto, binary, bool, date, date_go, date_ms,
    date_oracle, double, int32, int64, string. The 'date' type is an alias
    for 'date_go'.
  - <ARG> depends on the type. It can only be non-empty for the binary,
    date, date_go, date_ms, and date_oracle types.
    - Three characters, if used within the argument, must be
      backslash-escaped: '(', ')', and '\'.
    - For the 'binary' type, the argument is the encoding and must be one of
      base64, base32, or hex. The base64 and base32 variants correspond to
      the standard defined in RFC 4648.
    - For any of the 'date' types, the argument is a datetime layout string
      corresponding to the following:
      - 'date' aliases to 'date_go'
      - 'date_go' layouts correspond to the Golang time.Parse function
      - 'date_ms' layouts correspond to the Microsoft SQL Server FORMAT function
      - 'date_oracle' layouts correspond to the Oracle Database TO_DATE function
  - examples:
    - 'zipcode.string()'
    - 'followerCount.int32()'
    - 'user thumbnail.binary(base64)'
    - 'verified.boolean()'
    - 'misc.auto()'
    - 'created.date_ms(MM/dd/yyyy hh:mm:ss)'
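For the docs write-up, the typed field format described above can be illustrated with a small parser. The following is only a minimal Python sketch of the rules as stated in the changelog, not the mongoimport implementation (which is written in Go). The names `parse_typed_field`, `KNOWN_TYPES`, `TYPES_WITH_ARG`, and `BINARY_ENCODINGS` are hypothetical. The sketch accepts both 'bool' (from the type list) and 'boolean' (from the examples), which is an assumption, and it does not validate date layout strings.

```python
import re

# Type names from the changelog. "boolean" is accepted too because the
# changelog's own example uses 'verified.boolean()'; this is an assumption.
KNOWN_TYPES = {
    "auto", "binary", "bool", "boolean", "date", "date_go", "date_ms",
    "date_oracle", "double", "int32", "int64", "string",
}
# Only these types may carry a non-empty argument.
TYPES_WITH_ARG = {"binary", "date", "date_go", "date_ms", "date_oracle"}
BINARY_ENCODINGS = {"base64", "base32", "hex"}

# <COLNAME>.<TYPE>(<ARG>): inside <ARG>, the characters '(', ')' and '\'
# are only allowed when preceded by a backslash.
SPEC_RE = re.compile(
    r"^(?P<name>.+)\.(?P<type>\w+)\((?P<arg>(?:[^()\\]|\\[()\\])*)\)$"
)


def parse_typed_field(spec):
    """Split a typed field spec into (column name, type, argument)."""
    match = SPEC_RE.match(spec)
    if match is None:
        raise ValueError(f"not a typed field spec: {spec!r}")

    name = match.group("name")
    type_name = match.group("type")
    # Remove the backslash from each escaped '(', ')' or '\'.
    arg = re.sub(r"\\([()\\])", r"\1", match.group("arg"))

    if type_name not in KNOWN_TYPES:
        raise ValueError(f"unknown type {type_name!r} in {spec!r}")
    if arg and type_name not in TYPES_WITH_ARG:
        raise ValueError(f"type {type_name!r} does not take an argument")
    if type_name == "binary" and arg not in BINARY_ENCODINGS:
        raise ValueError(f"binary encoding must be one of {sorted(BINARY_ENCODINGS)}")
    if type_name == "date":
        type_name = "date_go"  # 'date' is an alias for 'date_go'

    return name, type_name, arg
```

With this sketch, `parse_typed_field("user thumbnail.binary(base64)")` returns `("user thumbnail", "binary", "base64")`, and `parse_typed_field("zipcode.string(x)")` raises `ValueError` because the string type takes no argument.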
documents
- TOOLS-67 Give mongoimport of CSV/TSVs a way to specify import types (Closed)