[SERVER-25173] $substrBytes does not handle negative parameters correctly. Created: 20/Jul/16 Updated: 10/May/18 Resolved: 22/Mar/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 3.1.4 |
| Fix Version/s: | 3.7.4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Benjamin Murphy | Assignee: | Ian Boros |
| Resolution: | Done | Votes: | 2 |
| Labels: | asya |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Minor Change |
| Operating System: | ALL |
| Sprint: | Query 2018-03-26 |
| Description |
|
$substrBytes requires two numeric arguments besides the string: a starting index and a length. Parsing and validation occur here. Both are required to be integral; however, we static_cast the values to size_t without checking that they are non-negative. At the very least, we should give an informative error message if they are negative. It is also conceivable that we want a negative value to mean "return the rest of the string", as in some other operators, in which case we'll need to add special handling. |
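A minimal sketch of the "informative error" option, assuming the arguments arrive as signed 64-bit integers. The names `SubstrResult` and `substrBytesChecked`, and the error strings, are hypothetical and are not taken from the server source. The sketch also ignores UTF-8 boundary checks. The point is only that the sign is tested before the conversion to `size_t`:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical result type: a substring on success, an error message otherwise.
struct SubstrResult {
    std::optional<std::string> value;
    std::string error;
};

// Hypothetical helper (not the server's code): checks the sign of both
// arguments while they are still signed, and only then converts to size_t.
SubstrResult substrBytesChecked(const std::string& str,
                                std::int64_t start,
                                std::int64_t length) {
    if (start < 0) {
        return {std::nullopt, "starting index must be non-negative"};
    }
    if (length < 0) {
        return {std::nullopt, "length must be non-negative"};
    }
    const auto lower = static_cast<std::size_t>(start);
    const auto count = static_cast<std::size_t>(length);
    // Assumption: a start at or past the end yields an empty string.
    if (lower >= str.size()) {
        return {std::string{}, ""};
    }
    // std::string::substr clamps the count to the end of the string.
    return {str.substr(lower, count), ""};
}
```

With this shape, `substrBytesChecked("hello", 1, -1)` reports an error instead of silently treating the length as a very large unsigned value.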
| Comments |
| Comment by Githook User [ 22/Mar/18 ] | |||||||||
|
Author: {'email': 'ian.boros@10gen.com', 'name': 'Ian Boros'} Message: | |||||||||
| Comment by Asya Kamsky [ 01/Feb/18 ] | |||||||||
|
Note for docs: currently $substr (an alias) is documented at https://docs.mongodb.com/manual/reference/operator/aggregation/substr/#behavior as follows: "If <start> is a negative number, $substr returns an empty string "". If <length> is a negative number, $substr returns a substring that starts at the specified index and includes the rest of the string." Once this bug is fixed, the documentation should be updated to match. Treating all negative numbers as errors would be a backwards-breaking change for $substrBytes, but it is already an error for $substrCP, so that would be consistent. Treating a negative position as a position counted from the end would be consistent with other substring functions, but for length it does not seem consistent (would it be a length measured from the end?). | |||||||||
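For reference while updating the docs, the two documented rules in the note above can be modeled as follows. This is a hypothetical sketch of the documented behavior only; `substrAsDocumented` is an invented name, the handling of a start past the end of the string is an assumption, and UTF-8 boundaries are ignored:

```cpp
#include <cstdint>
#include <string>

// Hypothetical model of the documented $substr rules; not the server code.
std::string substrAsDocumented(const std::string& str,
                               std::int64_t start,
                               std::int64_t length) {
    // Documented: a negative <start> returns an empty string.
    if (start < 0) {
        return "";
    }
    const auto lower = static_cast<std::string::size_type>(start);
    // Assumption: a start at or past the end also yields an empty string.
    if (lower >= str.size()) {
        return "";
    }
    // Documented: a negative <length> returns the rest of the string.
    if (length < 0) {
        return str.substr(lower);
    }
    return str.substr(lower, static_cast<std::string::size_type>(length));
}
```

Under this model, a start of 1 and a length of -1 on "hello" gives "ello", which is the behavior that would change if negative values became errors.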
| Comment by Max Hirschhorn [ 31/Aug/16 ] | |||||||||
|
The "Invalid range, ending index is in the middle of a UTF-8 character" message can be returned erroneously for certain negative lengths and non-ASCII strings. This has been possible since the message was introduced in bc45142 as part of the changes from
From the C99 standard:
Since -1 is congruent to 2^64 - 1 modulo 2^64, a length of -1 becomes 2^64 - 1 once converted to a 64-bit size_t, and we have the following behavior when checking the value of str[lower + length].
| |||||||||
| Comment by Benjamin Murphy [ 20/Jul/16 ] | |||||||||
|
kay.kim mentioned this to me--her example:
|