[CDRIVER-2453] Invalid bson returned in bulk operation reply in some cases Created: 05/Jan/18 Updated: 15/Dec/21 Resolved: 05/Jan/18 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | libmongoc |
| Affects Version/s: | 1.8.0, 1.9.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Thijs Cadier | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Seen on Ubuntu 14.04 64 bit and Mac OS Sierra. |
||
| Issue Links: |
|
||||||||||||||||
| Description |
|
When a document inserted in a bulk operation is invalid BSON and also contains a certain UTF-8 string, the bulk operation reply contains incorrectly truncated UTF-8 data. It looks like the string can be truncated in the middle of a code point, in which case the returned BSON is invalid. I have a BSON file that reliably reproduces this. It contains some customer data, so I don't want to post it publicly, but the data is not sensitive, so I'm happy to email it to you. |
| Comments |
| Comment by Thijs Cadier [ 05/Jan/18 ] | ||||||||||
|
Looks like that's it indeed, thanks. | ||||||||||
| Comment by A. Jesse Jiryu Davis [ 05/Jan/18 ] | ||||||||||
|
This is SERVER-24007. The BSON file you shared with me appears valid to me. It includes a very long UTF-8 _id value. When I insert the BSON twice, the server returns an error message like:

> dup key: "first 100 characters of long _id..."

The point where the server trims the _id value in order to include it in the error message is the 128th byte (or something like that), but the server doesn't notice that it has trimmed the _id in the middle of a multibyte UTF-8 character.

I'm going to close this ticket as a duplicate; feel free to reopen if I've made a mistake. | ||||||||||
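The truncation described in the comment above can be illustrated with a minimal Python sketch. This is not server or driver code: the helper names, the sample `_id`, and the 128-byte limit (which the comment itself gives only approximately) are assumptions chosen for illustration. It shows that cutting a UTF-8 string at a fixed byte offset can split a multibyte character, and that backing up over continuation bytes avoids this.

```python
def truncate_bytes(text: str, limit: int) -> bytes:
    """Naive truncation: cut the UTF-8 encoding at a fixed byte offset."""
    return text.encode("utf-8")[:limit]


def truncate_utf8_safe(text: str, limit: int) -> bytes:
    """Cut at or before `limit` bytes, never inside a code point."""
    data = text.encode("utf-8")
    if len(data) <= limit:
        return data
    end = limit
    # UTF-8 continuation bytes look like 10xxxxxx. If the first dropped
    # byte is a continuation byte, the cut falls inside a character, so
    # back up until the cut sits on a character boundary.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical _id: 127 ASCII bytes, then a 2-byte character that
# straddles the assumed 128-byte limit, then some trailing text.
long_id = "a" * 127 + "\u00e9" + "tail"

naive = truncate_bytes(long_id, 128)     # ends with half of the 2-byte character
safe = truncate_utf8_safe(long_id, 128)  # drops the whole character instead

print(is_valid_utf8(naive), is_valid_utf8(safe))
```

A reply that embeds the naively truncated bytes as a BSON string would fail UTF-8 validation, which matches the symptom reported in this ticket.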
| Comment by A. Jesse Jiryu Davis [ 05/Jan/18 ] | ||||||||||
|
Received. (I thought I could share this attachment as a private attachment to the Jira ticket, but Jira doesn't have this feature so I deleted the attachment again.) | ||||||||||
| Comment by Thijs Cadier [ 05/Jan/18 ] | ||||||||||
|
By the way, you have to insert it into the same collection twice: the second insert produces a duplicate key error, and that is what triggers this bug. | ||||||||||
| Comment by Thijs Cadier [ 05/Jan/18 ] | ||||||||||
|
Done | ||||||||||
| Comment by A. Jesse Jiryu Davis [ 05/Jan/18 ] | ||||||||||
|
Oh, interesting. Sure, email me the file. | ||||||||||
| Comment by Thijs Cadier [ 05/Jan/18 ] | ||||||||||
|
This is example code using the Rust wrapper around libmongoc to trigger this bug:
| ||||||||||
| Comment by Thijs Cadier [ 05/Jan/18 ] | ||||||||||
|
I double-checked this and the situation is slightly different: the BSON in the bulk operation is actually valid, but the reply returned is invalid BSON anyway. Can I send you the BSON file that triggers this? | ||||||||||
| Comment by Thijs Cadier [ 05/Jan/18 ] | ||||||||||
|
That's a good point, I actually don't know if it's a server or a driver bug. The driver or server correctly rejects the bson. The bug is that the driver then returns a reply that contains invalid bson. In my mind the reply document should always be valid bson. Whether the source of the invalid bson is the driver or the server I cannot tell. | ||||||||||
| Comment by A. Jesse Jiryu Davis [ 05/Jan/18 ] | ||||||||||
|
Thanks. Do you think this is either expected behavior (inserting invalid BSON results in undefined behavior) or a server bug (it accepts invalid BSON and replies with invalid BSON) or a driver bug (we don't reject the invalid BSON)? |