[JAVA-2828] ChangeStreamIterable withDocumentClass throws exception because convertResultsCallback looks for resumeToken Created: 12/Apr/18 Updated: 27/May/22 Resolved: 04/Dec/18 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | Query Operations |
| Affects Version/s: | 3.6.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Benjamin Berman | Assignee: | Ross Lawley |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
macOS 10.13.4 |
||
| Issue Links: |
|
||||||||
| Server Compat: | 4.1 | ||||||||
| Description |
The core issue is in AsyncChangeStreamBatchCursor convertResultsCallback. It should not throw an exception if a resume token cannot be found (very surprising and undocumented). |
| Comments |
| Comment by Benjamin Berman [ 27/May/22 ] | ||||||||||||||||||||||||||||||||||||||
|
As a follow up on this issue, I ended up migrating the application to Postgresql and AWS RDS. I don't recommend MongoDB anymore and don't use it for new projects. | ||||||||||||||||||||||||||||||||||||||
| Comment by Ross Lawley [ 04/Dec/18 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi all, Thanks for the initial feedback and further updates / workarounds. As a result of this feedback and discussions between the driver and server teams Because of the planned changes to the server I'm marking this ticket as "Won't Fix". Ross | ||||||||||||||||||||||||||||||||||||||
| Comment by Gil Abrantes [ 24/Aug/18 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi, I just bumped into this issue while using the Python driver. A (very) dirty workaround is to use the $addFields to add the '_id' field.
Hope this helps someone.
| ||||||||||||||||||||||||||||||||||||||
| Comment by Benjamin Berman [ 18/Apr/18 ] | ||||||||||||||||||||||||||||||||||||||
|
Awesome, I appreciate the prompt responses. The watch functionality is working great all around and was a breeze to integrate in any case. | ||||||||||||||||||||||||||||||||||||||
| Comment by Ross Lawley [ 17/Apr/18 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi doctorpangloss, Thanks for the excellent analysis, the analysis across drivers shows the differences between static and dynamically typed languages. I'll correct the Java side to ensure the resume token is at least the correct type. I discussed this ticket yesterday with the JVM team and we agreed the Spec and documentation needs some improvements here. Regarding, the watch helper itself, I agree that there are known limitations and the presence of an _id field doesn't validate there being a resume token no matter the type. The feature is there to support resumability, which would fail if the _id isn't valid. I like the idea of allowing an option to disable the automatic resumabilty should users need it. I'll put forward this ticket and your comments and start the discussion with the drivers team about changing the Spec. Many thanks, Ross | ||||||||||||||||||||||||||||||||||||||
| Comment by Benjamin Berman [ 16/Apr/18 ] | ||||||||||||||||||||||||||||||||||||||
|
To be precise, it's not that it requires an _id, it requires an _id as an object:
You'll see that rawDocument.getDocument("_id") pretty much guarantees that unless your _id field is a document (i.e., an object, as opposed to object ID or string), a surprising exception will be thrown. This is in contrast to say, the node-mongodb-native driver, which will silently succeed as long as an _id is present (because it does not actually check if _id is a resume token): https://github.com/mongodb/node-mongodb-native/blob/0e42efb0c1a235cca7c00711e6f4c707267729bc/lib/change_stream.js#L336
Same with the python driver: https://github.com/mongodb/mongo-python-driver/blob/9d7b4c44cef702d830be38462a2ee9fd55cc1579/pymongo/change_stream.py#L88
This ticket was in response to an issue I was experiencing as a library author exposing the watch functionality to end users. It sounds like the appropriate behaviour given these limitations is to throw an exception if the user attempts to use $replaceRoot, $project or $redact in their pipelines for watch. So now, no matter what, at least two drivers need to change because they weren't tested correctly anyway. Now you have to schedule a review of all the drivers for this issue. Given that there's a real bug in these drivers (not actually checking that the _id field is a valid resume token), and you'll need to do the work anyway, consider taking the time to change the specification. I would suggest letting the user pass a path spec for the resume token into the options for a watch command, if it's super important that the driver be able to find a resume token. Or, allow users to disable the driver's management of resume tokens using watch (a pretty trivial feature to implement, considering almost everyone will be adapting the provided change cursors into their own implementation-specific stream or event handlers anyway). Or, the watch interface should accept a single argument for the $match pipeline operator, since a user could use $project to mess up the token / put in invalid data no matter what. Or put another way, embedding a resume token and mentioning the $replaceRoot aggregation pipeline step in your docs wasn't ever viable in the first place. | ||||||||||||||||||||||||||||||||||||||
| Comment by Ross Lawley [ 16/Apr/18 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi doctorpangloss, Thanks for the ticket, I understand your frustration and surprise. Currently it is a restriction of the watch helper to require a _id. The reason the resumeToken is required for both sync & async changeStream operations is to help provide a good user experience in the event of a network error. The Java driver adheres to the change-streams specification. This scenario is discussed in the notes and restrictions section:
In short removal of the resumeToken is possible but not supported through the watch API. As $changeStream is currently an aggregation pipeline stage, it is possible to bypass this limitation by using MongoCollection#aggregate directly, although it is not recommended. I would recommend using the changeStreamIterable#map to change the results once returned from the server, thus allowing for resumability and shaping the documents as required. I hope that helps, Ross | ||||||||||||||||||||||||||||||||||||||
| Comment by Benjamin Berman [ 12/Apr/18 ] | ||||||||||||||||||||||||||||||||||||||
|
I should clarify this means that `$replaceRoot` in the Java driver's `watch` pipeline will basically never work. |