[SERVER-37575] BinData compare uses the Base64 string not the raw bytes Created: 11/Oct/18 Updated: 07/Apr/23 Resolved: 22/Oct/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.4.16 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Glen Miner | Assignee: | Asya Kamsky |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Linux x64 |
| Operating System: | ALL |
| Steps To Reproduce: | I can make three binary-encoded IPv6 addresses like this:
You can confirm these sort properly a < b < c in hex like this:
And of course as you would expect the shell agrees:
However if I compare the objects themselves I get a peculiar answer:
After some digging this peculiar result seemed to match the string-compare of the base-64 encoded data:
Tragically, this is inconsistent with the "byte-wise lexicographic sort" implied by the documentation. Anecdotal evidence suggests that server-side sorting has the same behavior. |
| Participants: |
| Description |
|
https://docs.mongodb.com/manual/reference/bson-type-comparison-order/#bindata says MongoDB sorts BinData in the following order: .... Finally, by the data, performing a byte-by-byte comparison. This is actually a serious bug, because it means you can't properly sequence binary data (like IPv6 addresses) in a way that supports range comparisons. I have no expectation that this will actually be fixed, though; I suspect it will instead turn into a documentation bug that warns others off from using this datatype.
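
To illustrate the kind of mismatch described here, the following is a minimal local sketch in Python, not a MongoDB query. The three addresses and the helper names `packed` and `base64_text` are assumptions chosen for illustration; they are not the reporter's original test values. It sorts the same addresses once by their raw 16 bytes and once by the Base64 text of those bytes.

```python
import base64
import ipaddress

# Illustrative addresses (assumed for this sketch, not taken from the ticket).
ADDRESSES = ["::1", "2001:db8::1", "fe80::1"]


def packed(addr: str) -> bytes:
    """Return the 16 raw bytes of an IPv6 address."""
    return ipaddress.IPv6Address(addr).packed


def base64_text(raw: bytes) -> str:
    """Return the standard Base64 encoding of raw bytes as ASCII text."""
    return base64.b64encode(raw).decode("ascii")


# Byte-by-byte ordering, as the documentation describes for BinData.
bytes_order = sorted(ADDRESSES, key=packed)

# Ordering by the Base64 text, as a plain string comparison would do.
text_order = sorted(ADDRESSES, key=lambda a: base64_text(packed(a)))

print(bytes_order)
print(text_order)
```

The two orders differ because the Base64 alphabet is not laid out in ASCII order: `/` encodes the value 63 but sorts before `A`, which encodes 0. Here `fe80::1` encodes to text beginning with `/`, so it moves to the front under string comparison even though its bytes are the largest.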
|
| Comments |
| Comment by Glen Miner [ 31/Oct/18 ] |
|
Thank you for investigating! I'm not sure if it's worth making a new bug or migrating this one for the shell compare operators. I don't care about that personally, but it might be a trap people fall into like I did. I retraced my steps: I was seeing this with aggregate commands matching on $gte/$lte bounds, but I cannot reproduce it with the test data above. It's possible that a TTL index was removing items, which made the aggregates come back empty, but I kind of doubt it. I will re-open if I see it again. Thanks again! |
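
For anyone retracing a similar $gte/$lte range match, the sketch below models the two comparison rules locally in Python. It does not contact a server, and the network `2001:db8::/32`, the candidate address, and the helper `b64` are hypothetical values chosen for illustration. It checks whether one address falls between the lowest and highest address of a prefix, first by raw bytes and then by Base64 text.

```python
import base64
import ipaddress

# Hypothetical prefix and candidate address, chosen only for illustration.
net = ipaddress.IPv6Network("2001:db8::/32")
lower = net[0].packed    # lowest address in the prefix, as 16 raw bytes
upper = net[-1].packed   # highest address in the prefix, as 16 raw bytes
candidate = ipaddress.IPv6Address("2001:db8:fc00::1").packed


def b64(raw: bytes) -> str:
    """Return the standard Base64 encoding of raw bytes as ASCII text."""
    return base64.b64encode(raw).decode("ascii")


# Byte-wise range check: the behaviour a byte-by-byte comparison would give.
in_range_bytes = lower <= candidate <= upper

# The same check on Base64 text: the behaviour a string comparison would give.
in_range_text = b64(lower) <= b64(candidate) <= b64(upper)

print(in_range_bytes, in_range_text)
```

With these values the byte-wise check includes the candidate while the text check excludes it, so a range match that came back unexpectedly empty would be one symptom of a string-based comparison.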
| Comment by Asya Kamsky [ 22/Oct/18 ] |
|
> Note that this doesn't just apply to the shell;
This only seems to apply to the shell. When these values are stored in the server, I only get correct results; I cannot reproduce any query or aggregation where the comparison happens as a string or returns an unexpected/incorrect result:
I'm closing this ticket as "Cannot Reproduce", but if you can find an example of data and a query that returns the wrong results from the server, please reopen and include any details you can about the exact query so we can reproduce and fix it. |