[SERVER-22104] $instr function to locate position of a "pattern" within a "string" Created: 08/Jan/16  Updated: 06/Dec/22  Resolved: 27/Feb/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Minor - P4
Reporter: Shakir Sadikali Assignee: Backlog - Query Team (Inactive)
Resolution: Duplicate Votes: 0
Labels: expression
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-8951 Add $findChar or $indexOf operator fo... Closed
Related
is related to SERVER-11947 Add a regex expression to the aggrega... Closed
Assigned Teams:
Query
Participants:
Case:

 Description   

This would be a corollary to the $substr operator. I have a case where a user would like to $project a portion of the document (i.e. 200 characters) surrounding a search string to present context to the client without having to do so in the application (because that would require sending the entire document result set back to the client).

This would make MongoDB a more viable product for their use-case.

https://docs.mongodb.org/manual/reference/operator/aggregation/#string-operators



 Comments   
Comment by Charlie Swanson [ 27/Feb/17 ]

Hi shakir.sadikali,

Yes, we are trying to avoid extending the find projection language. Aggregation's projection language already vastly exceeds the capabilities of the find projection language, and we have added many new expressions to the aggregation language without also adding them to the find projection language. We acknowledge that moving from the find command to the aggregate command may involve a drop in performance, but we are hoping to address this shortcoming in the coming release.

If the customer would be willing to adjust their query to use the new operator as requested, it doesn't seem overly onerous to request them to change their find into an aggregate instead. If this approach turns out to be slower/less efficient than the existing query, I unfortunately can't give any other advice than to wait for the performance of the aggregate command to catch up to that of the find command.

Thanks,
Charlie

Comment by Shakir Sadikali [ 17/Feb/17 ]

They are running find commands. Is there a reason we want to force users into using the aggregation framework for relatively trivial queries?

Comment by Charlie Swanson [ 15/Jan/16 ]

Okay, thanks for clarifying. I'm still going to close this as a duplicate, since it seems like you are requesting the same functionality as SERVER-8951, plus a bit more customization. I've added a comment to SERVER-8951 reflecting the other options you would like, so they will be taken into consideration when we design and implement SERVER-8951. Thanks!

Comment by Shakir Sadikali [ 15/Jan/16 ]

charlie.swanson Perhaps? I'm looking for the equivalent of the Oracle (gasp) INSTR function. The usecase is a simple one, i pass in STRING, SUBSTR, OCCURENCE and it returns the starting index of SUBSTR within STRING at OCCURENCE of SUBSTR. In practice, i would use INSTR and SUBSTR together to parse out portions of text from other strings. I re-opened because I wasn't sure what the protocol is. If you deem this to be a dup, please close and accept my apologies.

Comment by Charlie Swanson [ 15/Jan/16 ]

Yes, I believe this is a duplicate. shakir.sadikali, please elaborate on the differences if you there are any. I'll close this as a duplicate in the meantime.

Comment by Asya Kamsky [ 15/Jan/16 ]

This is dup of SERVER-8951 no?

Comment by Charlie Swanson [ 08/Jan/16 ]

Ah, okay. Yes, that very well may be the case. I investigated it a bit some time ago, and it looks like both python and javascript support extracting the position of the match (javascript example here). So I think it is possible.

Anyway, I think your request to have a separate expression just to get the index of an exact string match is a reasonable one, especially if the $regex syntax ends up being convoluted. I'll leave this open and report back when we decide on the syntax/features of a $regex expression.

Comment by Shakir Sadikali [ 08/Jan/16 ]

I was looking at regex documentation for linux / bash specifically. It does not support returning a specific position (as far as could find). If we were to implement such functionality it would deviate from the general behavior out in the wild (again, I'll be the first to admit I'm a regex novice and could have easily missed something).

Comment by Charlie Swanson [ 08/Jan/16 ]

shakir.sadikali, can you point me to the reading you've referenced? We haven't fully specified what a $regex expression would do, were we to make one, but I'm surprised by your assertion that it can't return the index of any matches, and curious why this is the case.

Comment by Shakir Sadikali [ 08/Jan/16 ]

It could be a viable solution if we could combine $substr and $regex (assuming regex can return the position, but all the reading I've done suggests that it can't) within the aggregation framework. However, for consistency and simplicity, having $instr as a companion function to $substr seems to make sense. Is there any additional information I can provide?

Comment by Charlie Swanson [ 08/Jan/16 ]

This is related to SERVER-11947. Depending on how that gets resolved, it could potentially address this use case. For example, if there was a $regex-like expression that could return the position in the string which it found the match, it could be used (in combination with $substr) to achieve this. I won't mark as an explicit duplicate though, since SERVER-11947 could be resolved without addressing this use case, or vice versa.

Generated at Thu Feb 08 03:59:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.