[JAVA-4725] BulkWriteResult#getUpserts won't return updated documents. Only inserted
Created: 10/Sep/22 Updated: 30/Sep/22 Resolved: 30/Sep/22 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | Write Operations |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question | Priority: | Unknown |
| Reporter: | Almog Tavor | Assignee: | Jeffrey Yemin |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Description |
| Comments |
| Comment by Jeffrey Yemin [ 30/Sep/22 ] |
|
I think your understanding is correct.
| Comment by Almog Tavor [ 30/Sep/22 ] |
|
I see. But I wonder about the default behavior. When I run a bulk write that fails for some of the records, I will definitely get an exception, and I will be able to see which of the records got errors. If that's indeed the case (it looks like it is from your code), then I think this behavior is fair enough. Is my understanding correct?
| Comment by Jeffrey Yemin [ 28/Sep/22 ] |
|
It's not transactional by default, but you can wrap it in a transaction to make it so. You can start at https://www.mongodb.com/docs/manual/core/transactions/ to learn more about that.
| Comment by Almog Tavor [ 28/Sep/22 ] |
|
Hey. Thanks for the detailed example. I did just learn about getWriteErrors and getWriteResult().getUpserts().
| Comment by Jeffrey Yemin [ 28/Sep/22 ] |
|
I thought a little bit more about your original question, and before proceeding with a server enhancement request, I want to make sure there is a more complete understanding of the capabilities that the server and driver already provide. Have a look at the test program, which I've annotated with comments explaining what's happening at every stage, and let me know if that changes your understanding at all.
| Comment by Almog Tavor [ 28/Sep/22 ] |
|
Ok. I'll open a corresponding issue in the SERVER project. I think a flag could solve the case of a very large update, wouldn't it? And wouldn't the same problem occur for a very large insert? Why does the server return this information there despite the risk? In general, I think a flag like “getOperationInformation” would solve this (and might enable sending even more data, like the shard key and not just the _id), since in that case the risk would be the user's responsibility. What do you think?
| Comment by Jeffrey Yemin [ 27/Sep/22 ] |
|
almogtavor@gmail.com the short answer is that it's not included in BulkWriteResult because it's not included in the reply from the server. There is nothing the driver can do here without a corresponding change to the server. We could consider adding it to the server reply and then exposing it in the driver API, but we'd have to be careful in how we do it. Consider, for example, an updateMany operation whose filter matches every document in a billion-document sharded collection. A reply containing the _id of every modified document is probably not what anyone wants, and it would also blow out our 16MB reply limit.
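
To put rough numbers on that limit, here is a back-of-the-envelope sketch. It assumes every _id is a 12-byte ObjectId, takes the 16MB limit as 16 * 1024 * 1024 bytes, and ignores any framing overhead in the reply, so it overstates how many ids would fit. The class and method names are made up for illustration and are not part of the driver.

```java
// Back-of-the-envelope sizing only; this is not driver or server code.
final class ReplySizeEstimate {
    // Assumption: the 16MB reply limit is 16 * 1024 * 1024 bytes.
    static final long REPLY_LIMIT_BYTES = 16L * 1024 * 1024;
    // Assumption: every _id is an ObjectId, which holds 12 bytes of raw data.
    static final long OBJECT_ID_BYTES = 12L;

    // Raw bytes needed to list the _id of every modified document.
    static long idPayloadBytes(long modifiedDocuments) {
        return modifiedDocuments * OBJECT_ID_BYTES;
    }

    // Upper bound on how many raw ObjectIds fit under the limit.
    static long maxIdsPerReply() {
        return REPLY_LIMIT_BYTES / OBJECT_ID_BYTES;
    }

    public static void main(String[] args) {
        long docs = 1_000_000_000L; // the billion-document collection from the comment
        System.out.println("id payload bytes: " + idPayloadBytes(docs));
        System.out.println("max ids under the limit: " + maxIdsPerReply());
    }
}
```

Under those assumptions, a billion modified documents would need about 12 GB of ids, roughly 700 times the limit, while a single reply could hold at most about 1.4 million ids.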
| Comment by Almog Tavor [ 16/Sep/22 ] |
|
The issue you've pointed out is relevant here, but I don't understand the rationale behind it. Why doesn't BulkWriteResult include updated ids as well as inserted ids? And if it doesn't, where can I find the updated ids?
| Comment by Almog Tavor [ 16/Sep/22 ] |
|
You've said that BulkWriteResult.getUpserts will return the IDs of documents inserted as a result of an upsert. But that's true only when the document's ID didn't exist in the collection before the upsert operation. For example, when the upsert results in an update (because the document's ID already exists in the collection), the ID won't appear in the result of BulkWriteResult.getUpserts. That is exactly what the test that I've added shows. The getModifiedCount and getMatchedCount functions aren't useful to me, since I need the exact documents that have been upserted.
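
The behavior described above can be illustrated with a toy in-memory model. This is a sketch, not the driver's or server's implementation: `UpsertModel`, `Result`, `bulkUpsert` and `matchedIds` are hypothetical names, and the model assumes every upsert in the bulk filters on a single _id. It also shows one possible client-side workaround under that assumption: the caller already knows which ids it sent, so the ids that were matched rather than inserted are the requested ids minus the ids reported as upserts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of upsert reporting; not the driver's implementation.
final class UpsertModel {
    // Hypothetical stand-in for a bulk write result.
    static final class Result {
        final List<String> upsertedIds; // ids reported because the upsert inserted a document
        final int matchedCount;         // upserts that found an existing document

        Result(List<String> upsertedIds, int matchedCount) {
            this.upsertedIds = upsertedIds;
            this.matchedCount = matchedCount;
        }
    }

    // Applies one upsert per entry, keyed by _id, to an in-memory "collection".
    static Result bulkUpsert(Map<String, String> collection, Map<String, String> docsById) {
        List<String> upserted = new ArrayList<>();
        int matched = 0;
        for (Map.Entry<String, String> e : docsById.entrySet()) {
            if (collection.containsKey(e.getKey())) {
                matched++;                 // existing _id: only counted, id is not reported
            } else {
                upserted.add(e.getKey());  // new _id: reported as an upsert
            }
            collection.put(e.getKey(), e.getValue());
        }
        return new Result(upserted, matched);
    }

    // Workaround sketch: matched ids = requested ids minus ids reported as upserts.
    // "Matched" does not imply the stored document actually changed.
    static List<String> matchedIds(Map<String, String> docsById, Result result) {
        List<String> ids = new ArrayList<>(docsById.keySet());
        ids.removeAll(result.upsertedIds);
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> collection = new LinkedHashMap<>();
        collection.put("a", "old");
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("a", "new");
        docs.put("b", "new");
        Result r = bulkUpsert(collection, docs);
        System.out.println("reported as upserts: " + r.upsertedIds);
        System.out.println("matched (derived client-side): " + matchedIds(docs, r));
    }
}
```

The derivation only works when the filter pins down a single known _id per operation; for filters that can match arbitrary documents, the client has no way to reconstruct the ids from the result alone.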
| Comment by Ashni Mehta [ 13/Sep/22 ] |
|
Hey Almog, thanks for reaching out. BulkWriteResult.getUpserts will return the IDs of documents inserted as a result of an upsert. It seems like getModifiedCount and getMatchedCount could be useful for you. You can read more about those here: https://mongodb.github.io/mongo-java-driver/4.7/apidocs/mongodb-driver-core/com/mongodb/client/result/UpdateResult.html#getModifiedCount(). Additionally, we have a ticket related to this that'll involve updating this method's javadoc to clarify. Feel free to vote on it!
| Comment by Almog Tavor [ 13/Sep/22 ] |
|
Have you found anything? |