BSON decoding compiles Java Pattern instances: this is error prone and bad for performance


    • Type: Bug
    • Resolution: Done
    • Priority: Minor - P4
    • 3.0.0
    • Affects Version/s: 2.6.1
    • Component/s: API
    • Minor Change

      In BasicBSONCallback.gotRegex, Pattern.compile is called on every regular expression stored in the BSON structure.

      There are two problems with this:

      1) JavaScript regular expressions are not equivalent to Java regular expressions, so Pattern.compile can throw a PatternSyntaxException, which is not handled here. This can be very bad: MongoDB documents that contain such regular expressions would simply be unretrievable via the Java driver.
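
      To illustrate the first problem, here is a minimal sketch that is not taken from the driver. The class and method names are hypothetical. The pattern "[^]" is accepted by JavaScript engines (it matches any character), while java.util.regex reads it as an unclosed character class.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, for illustration only; not part of the driver.
class RegexCompatibilityDemo {

    // Returns true if Java's regex engine accepts the given pattern source.
    static boolean compilesInJava(String source) {
        try {
            Pattern.compile(source);
            return true;
        } catch (PatternSyntaxException e) {
            // This is the exception that gotRegex would let escape during decoding.
            return false;
        }
    }

    public static void main(String[] args) {
        // Legal in JavaScript, rejected by Java as an unclosed character class.
        System.out.println("[^]   compiles in Java: " + compilesInJava("[^]"));
        // A pattern that both dialects accept.
        System.out.println("^a+b$ compiles in Java: " + compilesInJava("^a+b$"));
    }
}
```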

      2) Pattern.compile is slow, and it currently runs for every single regular expression the Java driver decodes. In many cases, however, the user never uses the result as a Pattern instance and is only interested in its string content. The compilation is then wasted work.

      My recommendation is to have a new dedicated class for BSON regular expressions: org.bson.types.RegExp. This class would not normally compile the pattern into a Java Pattern. It could, however, offer a compile() method that attempts the compilation for users who want a Pattern, with the documented caveat that it may throw a PatternSyntaxException if the regular expression is not compatible with Java. (In that case they may want to use Rhino to handle the regular expression.)
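
      A rough sketch of what such a class might look like follows. Only the name RegExp and the compile() method come from the proposal above; the accessors and field layout are assumptions, and the package declaration is omitted so the snippet stands alone.

```java
import java.util.regex.Pattern;

// Sketch of the proposed holder class (assumed shape, not an existing driver API).
final class RegExp {
    private final String pattern; // raw pattern source exactly as stored in BSON
    private final int flags;      // java.util.regex.Pattern flag bits

    RegExp(String pattern, int flags) {
        // Nothing is compiled here, so construction is cheap and cannot throw
        // a PatternSyntaxException.
        this.pattern = pattern;
        this.flags = flags;
    }

    String getPattern() {
        return pattern;
    }

    int getFlags() {
        return flags;
    }

    // Compiles on demand. May throw PatternSyntaxException if the stored
    // expression is not valid Java regex syntax.
    Pattern compile() {
        return Pattern.compile(pattern, flags);
    }

    @Override
    public String toString() {
        return "/" + pattern + "/";
    }
}
```

      With this shape, decoding a document never fails because of a regular expression; only callers that explicitly ask for a Pattern pay the compilation cost and have to deal with a possible exception.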

      I posted something similar to the MongoDB mailing list and was told that "most users would prefer to use Pattern." That is fine for them, but not for users like me, who do not want breakage or a performance cost for a feature they do not use. May I suggest controlling this with a flag? The code would look something like this:

      public void gotRegex( String name , String pattern , String flags ) {
          // RegExp does not compile the pattern; it simply stores it
          RegExp bsonRegExp = new RegExp( pattern , BSON.regexFlags( flags ) );
          // compile() returns a Java Pattern (though it may throw an exception)
          _put( name , compileRegExpFlag ? bsonRegExp.compile() : bsonRegExp );
      }

            Assignee:
            Unassigned
            Reporter:
            Tal Liron
            Votes:
            0
            Watchers:
            1
