Compass / COMPASS-8933

Invalid Python regex code generation

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • No version
    • Affects Version/s: None
    • Component/s: Export to Language
    • None
    • Environment:
      OS: ubuntu 24
      MongoDB Compass version: 1.42.5
    • Not Needed
    • Developer Tools

      "Special" thanks for your "awesome" bug tracker, much more convenient than Github issues (no, no, no).

      --------------------------------------------------------------

      (main issue)

      (1)

      Issue: The first literal `\` in the regex is always escaped.

      Example:

      shell

      [
        {
          $match:
            /**
             * query: The query in MQL.
             */
            {
              field: /\.needle\./
            }
        }
      ]

      python

      [
          {
              '$match': {
                  'field': re.compile(r"\\.needle\.")
              }
          }
      ]

       

      Attention: note the doubled backslash at the start of re.compile(r"\\.needle\.")

      You can check with different inputs: the first `\` is always escaped unnecessarily.
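
To make the effect concrete, here is a minimal sketch comparing the pattern the shell query means with the pattern quoted in this report. The sample strings are made up for illustration; only the two pattern literals come from the report.

```python
import re

# What the shell regex /\.needle\./ means: literal dot, "needle", literal dot.
expected = re.compile(r"\.needle\.")

# What the report says was generated: the first backslash is doubled, so the
# pattern now means: literal backslash, ANY character, "needle", literal dot.
generated = re.compile(r"\\.needle\.")

sample = "a.needle.b"            # illustrative input, not from the report
backslash_sample = "\\Xneedle."  # backslash, "X", "needle", "."

expected_hit = expected.search(sample) is not None
generated_hit = generated.search(sample) is not None
generated_backslash_hit = generated.search(backslash_sample) is not None

print(expected_hit, generated_hit, generated_backslash_hit)
```

So the exported pipeline would silently stop matching documents that the original shell query matches.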

      I tried to find where the problem could be, starting from the Python regex type template, but found only other issues.

      --------------------------------------------------------------

      (2)

      Issue: regex flags are appended to the end of the regex, but they must be placed at the start.

      Code:

      const str = pattern + flags;

      https://github.com/mongodb-js/compass/blob/e41df6bfb6b5ee75146cc406e00deb07361105fb/packages/bson-transpilers/symbols/python/templates.yaml#L149C17-L149C37

      Fix: add the flags to the start.

      Placing inline flags anywhere other than the start of the pattern was deprecated long ago.

      Since Python 3.11 this is official:

      https://docs.python.org/3/library/re.html#regular-expression-syntax

       

      (?aiLmsux)
      (One or more letters from the set 'a', 'i', 'L', 'm', 's', 'u', 'x'.) The group matches the empty string; the letters set the corresponding flags for the entire regular expression:
      re.A (ASCII-only matching)
      re.I (ignore case)
      re.L (locale dependent)
      re.M (multi-line)
      re.S (dot matches all)
      re.U (Unicode matching)
      re.X (verbose)
      (The flags are described in Module Contents.) This is useful if you wish to include the flags as part of the regular expression, instead of passing a flag argument to the re.compile() function. Flags should be used first in the expression string.
      Changed in version 3.11: This construction can only be used at the start of the expression.
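
A minimal Python sketch of the proposed fix, assuming the flags arrive as a string of letters that are valid Python inline flags. `to_python_pattern` and `compiles_cleanly` are hypothetical helper names used only for illustration; they are not part of the Compass code base.

```python
import re
import warnings


def to_python_pattern(pattern: str, flags: str) -> str:
    """Hypothetical helper: put inline flags at the START of the pattern."""
    return f"(?{flags}){pattern}" if flags else pattern


def compiles_cleanly(pattern: str) -> bool:
    """Return False if compiling raises an error or a DeprecationWarning.

    Misplaced global flags raise a DeprecationWarning on Python 3.6 to 3.10
    and re.error on 3.11 and later, so both cases are treated as failure.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            re.compile(pattern)
        except (re.error, DeprecationWarning):
            return False
    return True


prefixed = to_python_pattern(r"\.needle\.", "i")  # flags first
suffixed = r"\.needle\." + "(?i)"                 # pattern + flags, as in the template

print(prefixed, compiles_cleanly(prefixed))
print(suffixed, compiles_cleanly(suffixed))
```

An alternative design would be to pass the flags as the second argument of `re.compile` (for example `re.I`) instead of embedding them in the pattern string.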
      

      --------------------------------------------------------------

      (3)

      Question: why are `'...'` and `"..."` stripped from the pattern?

      https://github.com/mongodb-js/compass/blob/e41df6bfb6b5ee75146cc406e00deb07361105fb/packages/bson-transpilers/symbols/python/templates.yaml#L151-L155

      Isn't, for example,

      "needle"

      a valid regex?

      Why strip it?

      As far as I can see, point (3) applies to all languages, not only Python.
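
A short sketch of why the stripping matters, assuming (as described above) that a leading and trailing quote are removed from the pattern. The sample text is made up for illustration.

```python
import re

# A pattern that intentionally includes the double quotes is a valid regex.
quoted = re.compile(r'"needle"')

# The same pattern after the surrounding quotes are stripped (assumed behavior).
stripped = re.compile(r"needle")

text = 'a needle and a "needle"'  # illustrative input

quoted_matches = quoted.findall(text)
stripped_matches = stripped.findall(text)

print(quoted_matches)
print(stripped_matches)
```

The two patterns match different things, so stripping the quotes changes the meaning of the query rather than just its formatting.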

      --------------------------------------------------------------

       

      That is all. Thanks for your attention!

            Assignee: Unassigned
            Reporter: Max Zhenzhera (maxzhenzhera@gmail.com)
            Votes: 0
            Watchers: 3
