  1. Core Server
  2. SERVER-8951

Add $findChar or $indexOf operator for strings to find position of specific character (or substring)

    • Fully Compatible
    • Query 12 (04/04/16), Query 13 (04/22/16), Query 14 (05/13/16)

      Syntax

      {$indexOfBytes: [<string>, <search value>, <start index - optional>, <end index - optional>]}
      {$indexOfCP: [<string>, <search value>, <start index - optional>, <end index - optional>]}
      {$indexOfArray: [<array>, <search value>, <start index - optional>, <end index - optional>]}
      

      Examples

      > db.coll.insert([
        {_id: 1, string: "hello world"}
      ]);
      > db.coll.aggregate([{
        $project: {
          location: {$indexOfBytes: ["$string", "world"]}
        }
      }]);
      {_id: 1, location: 6}
      
      // Example 2 - differentiating code points vs. bytes.
      > db.coll.insert([
        {_id: 1, string: "øle"}
      ]);
      > db.coll.aggregate([{
        $project: {
          byteLocation: {$indexOfBytes: ["$string", "le"]},
          cpLocation: {$indexOfCP: ["$string", "le"]}
        }
      }]);
      {_id: 1, byteLocation: 2, cpLocation: 1}
      
      // Example 3 - using the start index.
      > db.coll.insert([
        {_id: 1, string: "PREFIX|text with word FIX"},  // Contains "FIX", Should match.
        {_id: 2, string: "PREFIX|text without target"}  // Should not match.
      ]);
      > db.coll.aggregate([{
        $project: {
          containsFix: {$indexOfCP: ["$string", "FIX", {$strLenCP: "PREFIX|"}]}  // The search is case-sensitive.
        }
      }]);
      {_id: 1, containsFix: 22}
      {_id: 2, containsFix: -1}
      
      
      // Example 4 - using the start and end indices.
      > db.coll.insert([
        {_id: 1, string: "PREFIX|text with word FIX|SUFFIX"},  // Contains "FIX", Should match.
        {_id: 2, string: "PREFIX|text without target|SUFFIX"}  // Should not match.
      ]);
      > db.coll.aggregate([{
        $project: {
          containsFix: {
            $let: {
              vars: {
                startIndex: {$strLenCP: "PREFIX|"},  // 7
                endIndex: {$subtract: [{$strLenCP: "$string"}, {$strLenCP: "|SUFFIX"}]}  // string length - 7; indices must be non-negative
              },
              in: {$indexOfCP: ["$string", "FIX", "$$startIndex", "$$endIndex"]}
            }
          }
        }
      }]);
      {_id: 1, containsFix: 22}
      {_id: 2, containsFix: -1}
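
The byte/code point difference in Example 2 can be reproduced outside the server. The following is a minimal sketch in plain JavaScript; `utf8ByteIndexOf` and `codePointIndexOf` are hypothetical helper names used only for illustration, not MongoDB APIs, and the sketch assumes the string is stored as UTF-8.

```javascript
// Hypothetical helpers (not MongoDB APIs) that mimic the two index conventions.

// Position of `search` counted in UTF-8 bytes, or -1 if absent.
function utf8ByteIndexOf(str, search) {
  const enc = new TextEncoder();          // encodes to UTF-8
  const hay = enc.encode(str);
  const needle = enc.encode(search);
  for (let i = 0; i + needle.length <= hay.length; i++) {
    if (needle.every((b, j) => hay[i + j] === b)) return i;
  }
  return -1;
}

// Position of `search` counted in Unicode code points, or -1 if absent.
function codePointIndexOf(str, search) {
  const hay = Array.from(str);            // Array.from iterates by code point
  const needle = Array.from(search);
  for (let i = 0; i + needle.length <= hay.length; i++) {
    if (needle.every((cp, j) => hay[i + j] === cp)) return i;
  }
  return -1;
}

// "\u00f8" is the letter ø: one code point, two bytes in UTF-8.
console.log(utf8ByteIndexOf("\u00f8le", "le"));   // 2
console.log(codePointIndexOf("\u00f8le", "le"));  // 1
```

For pure-ASCII input such as "hello world" the two helpers agree, which is why Example 1 does not show the distinction.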
      

      Notes

      • Behaves like Python's str.find(): returns the index of the first occurrence of the search value, or -1 if there are no occurrences.
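
As an illustration of the find()-style behavior with optional bounds, here is a rough sketch in plain JavaScript. `findInRange` is a hypothetical name, not a server function. The sketch assumes non-negative bounds and treats the end index as exclusive (as Python's find() does); both are assumptions rather than statements about the server implementation.

```javascript
// Hypothetical helper (not a MongoDB API): code-point search limited to [start, end).
// Assumes start and end are non-negative integers; end defaults to the string length.
function findInRange(str, search, start = 0, end = Infinity) {
  const hay = Array.from(str);                 // one element per code point
  const needle = Array.from(search);
  const stop = Math.min(end, hay.length);      // clamp end to the string length
  for (let i = start; i + needle.length <= stop; i++) {
    if (needle.every((cp, j) => hay[i + j] === cp)) return i;
  }
  return -1;                                   // no occurrence inside the range
}

const withFix = "PREFIX|text with word FIX|SUFFIX";
const withoutFix = "PREFIX|text without target|SUFFIX";
const skipSuffix = (s) => Array.from(s).length - "|SUFFIX".length;

console.log(findInRange(withFix, "FIX", 7, skipSuffix(withFix)));       // 22
console.log(findInRange(withoutFix, "FIX", 7, skipSuffix(withoutFix))); // -1
```

Without the bounds, the "FIX" inside "PREFIX" or "SUFFIX" would be reported, which is the situation Examples 3 and 4 are guarding against.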

      Errors

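      • $indexOfBytes/$indexOfCP: an error is raised if either of the first two arguments is not a string.
      • $indexOfArray: an error is raised if the first argument is not an array.
      • All three operators: an error is raised if either of the last two arguments (start index, end index) is not integral.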

      Old Description
      Would like to have an operator to find a character (or substring) in a string. Use case: normalizing strings like server names "server1.foo.bar" to short names via the $substr operator (without knowing the location of the first '.').
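
With the operators described above, the use case in the old description could be sketched roughly as follows. The field name `name` and the collection `servers` are hypothetical, chosen only for illustration. The `$cond` guard is an assumption about what is wanted when no '.' is present: the full name is kept.

```javascript
// Sketch only: `name` and `servers` are hypothetical names, not from the ticket.
const shortNamePipeline = [{
  $project: {
    shortName: {
      $let: {
        // Byte position of the first '.', or -1 if the name has none.
        vars: {dot: {$indexOfBytes: ["$name", "."]}},
        in: {
          $cond: [
            {$eq: ["$$dot", -1]},
            "$name",                            // no '.': keep the full name
            {$substr: ["$name", 0, "$$dot"]}    // everything before the first '.'
          ]
        }
      }
    }
  }
}];

// In the mongo shell: db.servers.aggregate(shortNamePipeline)
```

Because '.' is a single byte in UTF-8, the byte index returned for it always falls on a character boundary, so it is safe to use as the length for $substr here.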
