[JAVA-2293] Collection caching in DBApiLayer.doGetCollection(…) is broken Created: 31/Aug/16  Updated: 19/Sep/16  Resolved: 13/Sep/16

Status: Closed
Project: Java Driver
Component/s: API
Affects Version/s: 3.3.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Oliver Gierke Assignee: Ross Lawley
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

DBApiLayer.doGetCollection(…) looks up DBCollection items from a cache based on name. However, DBCollection exposes methods to mutate the state of the returned instance which makes those state changes propagate to other (potentially concurrent) clients requesting a collection.

In practice that means that a client e.g. setting a read preference on the collection, is setting it globally for all other clients that want to work with the same collection, although they might want to use a different read preference.

The client side workaround is effectively synchronizing the access to a collection and making sure all mutable properties are always set.

The original issue popped up with Spring Data MongoDB users that wanted to use different read preferences with different MongoTemplate instance.



 Comments   
Comment by Jeffrey Yemin [ 19/Sep/16 ]

Since findAndModify writes as well as reads, it always has to run on the primary.

Comment by Oliver Gierke [ 19/Sep/16 ]

Any suggestion what to do for e.g. findAndModify(…) there doesn't seem to be an overload taking a read preference.

Comment by Jeffrey Yemin [ 13/Sep/16 ]

Yes, that's correct. The DBCursor's properties don't get "frozen" until the application starts treating it as an Iterator by calling next() or hasNext() on it.

Comment by Oliver Gierke [ 13/Sep/16 ]

Thanks for the feedback, Jeff. Just to make sure I get you right: you're basically suggesting to set the read preference on the DBCursor that e.g. a find(…) operation returns? I so far didn't consider that an option as I assumed, that as soon as I get hold of a DBCursor the query has actually been already executed and it's basically too late to tweak any preferences that might influence the execution of it. Am I right that you're suggesting that would be a reasonable workaround?

Comment by Jeffrey Yemin [ 13/Sep/16 ]

Closing as Won't Fix, but I will re-open if necessary based on subsequent comments.

Comment by Jeffrey Yemin [ 31/Aug/16 ]

Users may be relying on the fact that DBCollection instances are cached and the cached instances retain the property settings. Changing this will break anyone relying on that.

// One part of the app
client.getDB("test").getCollection("test").setReadPreference(secondary());
 
// some other part of the app with access to the client
client.getDB("test").getCollection("test").findOne();  // assumes that read preference has been set appropriately somewhere else

I also don't think we should add an alternative getCollection method. It would make reasoning about the behavior difficult, as one would not know, given a DBCollection instance of unknown provenance, whether it was shared or not. And if we did do that, then for consistency we'd have to do the same for DB instances.

If MongoTemplate fully wraps the DBCollection, a possible workaround would be to set the read preference on the MongoTemplate and then specify it for every operation instead of as a default on the DBCollection.

Comment by Oliver Gierke [ 31/Aug/16 ]

There are, but for backwards compatibility reasons we can't move away from the existing APIs until a "Spring Data 2.0". We already have bits of the migration in place but ultimately we're not going to release that before a Spring 5.0 in March 2017.

I am kind of wondering what compatibility issues you imagine if the call returned a fresh DBCollection for each invocation. Assuming equals(…) and hashCode() are implemented correctly, it shouldn't make any difference. It's of course up to you what to do with the issue but a decision not to do anything about it would basically render concurrent collection access with different read preferences unusable on that driver generation. I'd be forced to close the upstream bug as works as designed then.

Comment by Ross Lawley [ 31/Aug/16 ]

Hi oliver.gierke,

Thanks for the ticket, you're correct the DBCollection is mutable and cached which I appreciate can cause complications. The reasoning for the design decision is from before my time on the Java driver. However, changing it would mean that it would break the existing behaviour that some users rely on.

As the DB and DBCollection api's are deprecated (via the access point MongoClient#getDB), I'm leaning towards closing this ticket as "Won't Fix". Especially, as the newer MongoDatabase and MongoCollection API's (accessible via MongoClient#getDatabase) are immutable and don't suffer from this issue. Are there any plans to migrate to the new API's? A good motivation might be access to new features coming in forthcoming MongoDB releases will be accessible via the new API.

Ross

Generated at Thu Feb 08 08:56:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.