MongoDB ETL Tools / TOOLS-848

Can't handle some regexes

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.1.0
    • Fix Version/s: 3.0.6, 3.1.7
    • Component/s: bsondump, mongoexport
    • Labels:
      None
    • Environment:
      Linux x86_64
    • Backport Requested:
      v3.0
    • Sprint:
      Kernel Tools 8 08/28/15

      Description

      bsondump and mongoexport can't handle BSON Regexps (type 0x0B) with patterns that json.RegExp doesn't understand. Since the BSON spec states that Regexp patterns can be any cstring (null-terminated string of non-null bytes), these are valid BSON (and might be valid regexes for other regex engines).

      $ cat tools-regex-fail.json
      { "_id" : { "$oid" : "51193479ef23779f3d000cbd" }, "keywords_regex" : { "$regex" : "\\.", "$options" : "" } }
      $ /m/3.0.4/bin/mongoimport --port 6834 -d test -c test3 --drop tools-regex-fail.json
      2015-07-25T22:04:26.488+1000    connected to: localhost:6834
      2015-07-25T22:04:26.488+1000    dropping: test.test3
      2015-07-25T22:04:26.490+1000    imported 1 document
      $ /m/3.0.4/bin/mongodump --port 6834 -d test -c test3 -o tools-regex-fail-dump
      2015-07-25T22:06:36.283+1000    writing test.test3 to tools-regex-fail-dump/test/test3.bson
      2015-07-25T22:06:36.284+1000    writing test.test3 metadata to tools-regex-fail-dump/test/test3.metadata.json
      2015-07-25T22:06:36.285+1000    done dumping test.test3
      $ /m/3.0.4/bin/bsondump tools-regex-fail-dump/test/test3.bson
      2015-07-25T22:09:49.651+1000    unable to dump document 1: error converting doc to JSON: json: error calling MarshalJSON for type json.RegExp: invalid character '.' in string escape code
      2015-07-25T22:09:49.652+1000    1 objects found
      $ /m/3.0.4/bin/mongoexport --port 6834 -d test -c test3
      2015-07-25T22:12:04.947+1000    connected to: localhost:6834
      2015-07-25T22:12:04.948+1000    Failed: json: error calling MarshalJSON for type json.RegExp: invalid character '.' in string escape code
      $ /m/3.1.6/bin/bsondump tools-regex-fail-dump/test/test3.bson
      2015-07-25T22:41:18.989+1000    unable to dump document 1: error converting doc to JSON: json: error calling MarshalJSON for type json.RegExp: invalid character '.' in string escape code
      2015-07-25T22:41:18.989+1000    1 objects found
      $ /m/3.1.6/bin/mongoexport --port 6834 -d test -c test3
      2015-07-25T22:41:27.804+1000    connected to: localhost:6834
      2015-07-25T22:41:27.805+1000    Failed: json: error calling MarshalJSON for type json.RegExp: invalid character '.' in string escape code
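
The wording of the error suggests that the regex pattern reaches a JSON string literal without its backslash being escaped. The following is a minimal sketch of that failure mode using Go's standard encoding/json package, not the tools' own json package. The types naiveRegExp and safeRegExp and the helpers marshalNaive and marshalSafe are hypothetical names made up for illustration, and the assumption that the tools fail in this way is not confirmed by this report.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// naiveRegExp is a hypothetical stand-in for a regex wrapper type.
// It is NOT the tools' json.RegExp; it only illustrates the assumed bug.
type naiveRegExp struct {
	Pattern string
	Options string
}

// MarshalJSON pastes the pattern into a JSON string literal without
// escaping it, so a pattern such as \. produces an invalid JSON escape.
func (r naiveRegExp) MarshalJSON() ([]byte, error) {
	return []byte(`{"$regex":"` + r.Pattern + `","$options":"` + r.Options + `"}`), nil
}

// safeRegExp lets encoding/json escape the pattern itself.
type safeRegExp struct {
	Pattern string `json:"$regex"`
	Options string `json:"$options"`
}

// marshalNaive returns the error produced when the unescaped output of
// naiveRegExp.MarshalJSON is validated by json.Marshal.
func marshalNaive(pattern string) error {
	_, err := json.Marshal(naiveRegExp{Pattern: pattern})
	return err
}

// marshalSafe returns the extended-JSON form with the pattern escaped.
func marshalSafe(pattern string) (string, error) {
	out, err := json.Marshal(safeRegExp{Pattern: pattern})
	return string(out), err
}

func main() {
	pattern := `\.` // two bytes: a backslash followed by a dot

	fmt.Println("naive:", marshalNaive(pattern))

	out, err := marshalSafe(pattern)
	fmt.Println("safe: ", out, err)
}
```

In this sketch the naive path fails validation inside json.Marshal, while the safe path emits the pattern with the backslash doubled, which is the form that mongoexport 2.6.10 prints for the same document.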
      

      This is a regression from the 2.6 tools, which handle such regexes without problem. Using the 2.6 tools is therefore the workaround (unless a 3.0 feature is required, e.g. SCRAM-SHA-1).

      $ /m/2.6.10/bin/bsondump tools-regex-fail-dump/test/test3.bson
      { "_id" : ObjectId( "51193479ef23779f3d000cbd" ), "keywords_regex" : /\\./ }
      1 objects found
      $ /m/2.6.10/bin/mongoexport --port 6834 -d test -c test3
      connected to: 127.0.0.1:6834
      { "_id" : { "$oid" : "51193479ef23779f3d000cbd" }, "keywords_regex" : { "$regex" : "\\.", "$options" : "" } }
      exported 1 records
      

      Other BSON types may have similar problems if the native Go types used to hold them can only represent a subset of the valid BSON values. (Looking over the other types, this seems unlikely.)

      There may also be similar problems in the other tools (although I haven't found any).

            People

            • Assignee:
              Shraya Ramani (shraya.ramani)
            • Reporter:
              Kevin Pulo (kevin.pulo)
            • Votes:
              0
            • Watchers:
              2