  1. Core Server
  2. SERVER-8951

Add $findChar or $indexOf operator for strings to find position of specific character (or substring)

    Details

    • Backwards Compatibility:
      Fully Compatible
    • Sprint:
      Query 12 (04/04/16), Query 13 (04/22/16), Query 14 (05/13/16)

      Description

      Syntax

      {$indexOfBytes: [<string>, <search value>, <start index - optional>, <end index - optional>]}
      {$indexOfCP: [<string>, <search value>, <start index - optional>, <end index - optional>]}
      {$indexOfArray: [<array>, <search value>, <start index - optional>, <end index - optional>]}
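
The examples in this ticket cover only the string operators, so here is a rough plain-JavaScript model of how $indexOfArray might behave. It assumes elements are matched by value and that -1 is returned when nothing matches, by analogy with the string operators. `indexOfArraySketch` is a hypothetical helper, not server code; it compares with `===`, so it only handles primitive values, and it does not model negative indices.

```javascript
// Hypothetical model of {$indexOfArray: [<array>, <search value>, <start>, <end>]}.
// Assumption: elements are compared with ===, so only primitive values are handled.
function indexOfArraySketch(array, searchValue, start = 0, end = array.length) {
  if (!Array.isArray(array)) {
    throw new TypeError("first argument must be an array");
  }
  const stop = Math.min(end, array.length);
  for (let i = Math.max(start, 0); i < stop; i++) {
    if (array[i] === searchValue) return i;
  }
  return -1; // no occurrence in [start, end)
}

console.log(indexOfArraySketch(["a", "b", "c", "b"], "b"));       // 1
console.log(indexOfArraySketch(["a", "b", "c", "b"], "b", 2));    // 3
console.log(indexOfArraySketch(["a", "b", "c", "b"], "b", 2, 3)); // -1
```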
      

      Examples

      // Example 1 - basic substring search. Each example assumes db.coll starts empty.
      > db.coll.insert([
        {_id: 1, string: "hello world"}
      ]);
      > db.coll.aggregate([{
        $project: {
          location: {$indexOfBytes: ["$string", "world"]}
        }
      }]);
      {_id: 1, location: 6}
       
      // Example 2 - differentiating code points vs. bytes.
      > db.coll.insert([
        {_id: 1, string: "øle"}
      ]);
      > db.coll.aggregate([{
        $project: {
          byteLocation: {$indexOfBytes: ["$string", "le"]},
          cpLocation: {$indexOfCP: ["$string", "le"]}
        }
      }]);
      {_id: 1, byteLocation: 2, cpLocation: 1}
       
      // Example 3 - using the start index.
      > db.coll.insert([
        {_id: 1, string: "PREFIX|text with word FIX"},  // Contains "FIX", Should match.
        {_id: 2, string: "PREFIX|text without target"}  // Should not match.
      ]);
      > db.coll.aggregate([{
        $project: {
          containsFix: {$indexOfCP: ["$string", "FIX", {$strLenCP: "PREFIX|"}]}
        }
      }]);
      {_id: 1, containsFix: 22}
      {_id: 2, containsFix: -1}
       
       
      // Example 4 - using the start and end indices.
      > db.coll.insert([
        {_id: 1, string: "PREFIX|text with word FIX|SUFFIX"},  // Contains "FIX", Should match.
        {_id: 2, string: "PREFIX|text without target|SUFFIX"}  // Should not match.
      ]);
      > db.coll.aggregate([{
        $project: {
          containsFix: {
            $let: {
              vars: {
                startIndex: {$strLenCP: "PREFIX|"},  // 7
                endIndex: {$subtract: [0, {$strLenCP: "|SUFFIX"}]}  // -7
              },
              in: {$indexOfCP: ["$string", "FIX", "$$startIndex", "$$endIndex"]}
            }
          }
        }
      }]);
      {_id: 1, containsFix: 22}
      {_id: 2, containsFix: -1}
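
To make the byte versus code point distinction concrete, the following is a minimal sketch in plain JavaScript (Node.js) rather than server code. It assumes strings are stored as UTF-8; `indexOfBytesSketch` and `indexOfCPSketch` are hypothetical helper names, and the optional start and end arguments are left out.

```javascript
// Model of the two string operators without start/end arguments.
// Assumption: strings are stored as UTF-8, so byte offsets are UTF-8 offsets.
function indexOfBytesSketch(str, search) {
  // Buffer.indexOf returns the byte offset of the first match, or -1.
  return Buffer.from(str, "utf8").indexOf(Buffer.from(search, "utf8"));
}

function indexOfCPSketch(str, search) {
  const haystack = Array.from(str); // one element per code point
  const needle = Array.from(search);
  for (let i = 0; i + needle.length <= haystack.length; i++) {
    if (needle.every((cp, j) => haystack[i + j] === cp)) return i;
  }
  return -1;
}

// "\u00f8" is the single code point U+00F8, which takes two bytes in UTF-8.
console.log(indexOfBytesSketch("\u00f8le", "le"));       // 2
console.log(indexOfCPSketch("\u00f8le", "le"));          // 1
console.log(indexOfBytesSketch("hello world", "world")); // 6
```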
      

      Notes

      • Same functionality as Python's str.find(): returns the index of the first occurrence, or -1 if there is none.
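
As an illustration of the find()-style behaviour, here is a sketch in plain JavaScript. It assumes negative start and end values count back from the end of the string, as they do in Python; `findCP` is a hypothetical name, and this models the description in this ticket rather than the server implementation.

```javascript
// Sketch of find()-style semantics over code points.
// Assumption: negative start/end count back from the end of the string, as in Python.
function findCP(str, search, start = 0, end = undefined) {
  const haystack = Array.from(str);
  const needle = Array.from(search);
  const len = haystack.length;
  // Normalize an index the way Python slicing does.
  const clamp = (i) => (i < 0 ? Math.max(len + i, 0) : Math.min(i, len));
  const lo = clamp(start);
  const hi = clamp(end === undefined ? len : end);
  for (let i = lo; i + needle.length <= hi; i++) {
    if (needle.every((cp, j) => haystack[i + j] === cp)) return i;
  }
  return -1;
}

console.log(findCP("PREFIX|text with word FIX", "FIX"));                // 3 (inside "PREFIX")
console.log(findCP("PREFIX|text with word FIX", "FIX", 7));             // 22
console.log(findCP("PREFIX|text with word FIX|SUFFIX", "FIX", 7, -7));  // 22
console.log(findCP("PREFIX|text without target|SUFFIX", "FIX", 7, -7)); // -1
```

The search is case-sensitive, which is why the needle has to be "FIX" rather than "fix" for these strings to match.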

      Errors

      • $indexOfBytes and $indexOfCP raise an error if either of the first two arguments is not a string. $indexOfArray raises an error if its first argument is not an array.
      • All three raise an error if either of the last two arguments is not integral.
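
The string-operator checks could be modeled roughly as follows. This is a sketch in plain JavaScript: `validateIndexOfStringArgs` and its error messages are hypothetical and do not reflect the server's actual error codes or wording.

```javascript
// Hypothetical argument check mirroring the error conditions listed for the
// string operators ($indexOfBytes and $indexOfCP).
function validateIndexOfStringArgs(str, search, start, end) {
  if (typeof str !== "string" || typeof search !== "string") {
    throw new TypeError("first two arguments must be strings");
  }
  for (const [name, value] of [["start", start], ["end", end]]) {
    // The optional indices may be omitted, but must be integral when given.
    if (value !== undefined && !Number.isInteger(value)) {
      throw new TypeError(`${name} index must be integral`);
    }
  }
  return true;
}

console.log(validateIndexOfStringArgs("hello world", "world", 0, 11)); // true
try {
  validateIndexOfStringArgs("hello world", "world", 1.5);
} catch (e) {
  console.log(e.message); // start index must be integral
}
```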

      Old Description
      We would like an operator that finds a character (or substring) in a string. Use case: normalizing strings such as the server name "server1.foo.bar" to a short name with the $substr operator, without knowing the position of the first '.' in advance.
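
The use case above might be expressed along these lines. This is an untested sketch: the field name `fullName` is made up for illustration, and the stage is built as a plain object so that it could be passed to an aggregate() call. A plain-JavaScript equivalent is included for comparison.

```javascript
// Hypothetical field name "fullName"; pass the stage as
// db.<collection>.aggregate([shortNameStage]).
const dotIndex = {$indexOfBytes: ["$fullName", "."]};
const shortNameStage = {
  $project: {
    shortName: {
      $cond: [
        {$eq: [dotIndex, -1]},
        "$fullName",                          // no dot: keep the whole name
        {$substr: ["$fullName", 0, dotIndex]} // text before the first dot
      ]
    }
  }
};

// The same logic in plain JavaScript, for comparison.
function shortName(fullName) {
  const i = fullName.indexOf(".");
  return i === -1 ? fullName : fullName.substring(0, i);
}

console.log(shortName("server1.foo.bar")); // "server1"
console.log(shortName("localhost"));       // "localhost"
```

The $cond guard is there because the index is -1 when the name contains no dot, and that value should not be passed on as a substring length without thought.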

          Activity

          ashevchuk Andrew Shevchuk added a comment (edited)

          Would be nice to have a string occurrence operator (less expensive than regex), like this:
          db.coll.find( { fullName : { $substr : "server1" } } ).

          shakir.sadikali Shakir Sadikali added a comment -

          I'd like this added to more than the aggregation framework. In the general case, two functions:

          1. returns the position of "substring" within "string"
          2. returns the "substring" within "string" starting at position A and ending at position B

          Have this work both in the aggregation framework and in "normal" queries (optionally support regex-like syntax?)

          This would be ideal, at least in the use-cases I've run into.

          asya Asya Kamsky added a comment -

          This ticket is for an expression in the aggregation framework. $regex already exists and works in queries. If new functionality is needed for queries that can't be done with $regex, please open a new SERVER ticket with a description of the use case and an example of the query that would be needed.

          xgen-internal-githook Githook User added a comment -

          Author: Benjamin Murphy (benjaminmurphy, benjamin_murphy@me.com)

          Message: SERVER-8951 Aggregation now supports the indexOfArray, indexOfBytes, and indexOfCP expressions.
          Branch: master
          https://github.com/mongodb/mongo/commit/7ae631410d8ffe71c74f96d5ab5dd408764b7858

          benjamin.murphy Benjamin Murphy (Inactive) added a comment -

          This ticket introduces $indexOfArray, $indexOfBytes, and $indexOfCP, with syntax in the description. It will need to be documented, and any drivers that support aggregation helpers will need to be updated to support it.


            People

            • Votes: 16
            • Watchers: 20