[SERVER-50092] [FLE] with encryption on collection and $lookup with two non-encrypted collections fails Created: 04/Aug/20 Updated: 31/Jan/24 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Field Level Encryption |
| Affects Version/s: | 4.2.8 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Asya Kamsky | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 23 |
| Labels: | FLE, qopt-team | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||
| Participants: | |||||||||||||
| Case: | (copied to CRM) | ||||||||||||
| Description |
|
if you perform an aggregation pipeline $lookup from a MongoClient with encryption, where both of collections involved are not encrypted you'll get Pipeline over an encrypted collection cannot reference additional collections For example, encrypted collection A, non-encrypted collection B and C and you have pipeline from B : [\{$lookup:{from:"C", localField:"Foo", foreignField:"Bar", as:"Baz"}}] you still get the error above. |
| Comments |
| Comment by Sujan Maharjan [ 31/Jan/24 ] |
|
Any updates so far or any alternative solutions? This is really a BIG issue. |
| Comment by Angus Ryer [ 15/Nov/23 ] |
|
This has been an issue for such a long time. Is everyone that uses CSFLE simply tolerating managing two connections? Have they all moved to some other database provider or encryption mechanism? Or is everyone just encrypting fields explicitly? To me this seems like a bug, not an enhancement. Why should querying non-encrypted collections result in an error when CSFLE is enabled? No encryption mechanisms should be engaged after the query is parsed, since no encrypted fields would be found. |
| Comment by Ben Luk [ 25/Oct/22 ] |
|
Hello, any update on this? |
| Comment by Anthony LaMartina [ 09/Mar/22 ] |
|
I figured out a work around. Essentially I created a collection that is dedicated to encrypting single string via CSFLE that I want to use as a matching comparison in a Lookup. I insert the string with an auto-encryption client into the collection and then query it with a non-encryption client to get it in its encrypted state (Bson Binary). I then can join a collection in a Lookup that has encrypted strings with another collection that does not, by using a non-encrypted client for both. Then if I want to query the Lookup by an encrypted string I just need to match the Bson Binary data I created in the first collection mentioned above to the encrypted field in the collection that is in the lookup. This way I don't need the framework to compare the decrypted string with a search string. I just compare an encrypted string field with another temporarily encrypted string. |
| Comment by Anthony LaMartina [ 07/Mar/22 ] |
|
Are there any plans to release a fix for this in the near future, or is this backlogged indefinitely? I needed to convert over various fields in my database to be encrypted in order to be compliant with Facebook's Platform data policies. I went the CSFLE route only to find out in the end that I get this bug on my existing queries that are using lookups/joins because it appears that collections being accessed through an encryption client can not do this. So know I need to perform a refactor on my code to perform any encrypted collections separately from lookups. I am pretty disappointed to see that this was brought up two years ago and hasn't been fixed. Has anyone figured out a better alternative to MongoDB's CSFLE that plays nice with server side mongo queries (including joins)? |
| Comment by Tyler Elkink [ 16/Dec/20 ] |
|
I can understand that from the perspective of a successful NOSQL development team, joins being a "convenience" seems a reasonable position. I can understand that smoothing out gaps between joins and encryption is a high-effort, low-reward option. The problem from the user perspective is, I think, that MongoDB wrote $lookup, made it available in early 2018, and is trying to establish itself in an environment thoroughly dominated by SQL systems that use joins- and developers that are used to joins. And use joins, as they presume they are able to do so from the MongoDB documentation. If $lookup is a "convenience" that's not part of the core MongoDB offering, then so- as 3.2 releases- are basic math operations on aggregations. My company just ran head-first into this problem while trying to protect PII on tables we reference in aggregation. We now get to decide whether to rip $lookup out of all of our code and refactor, or to substantially increase the number of connections to a production environment (which, due to MongoDB's architectural decisions, are more limited than other DBs), or to hope this bug comes out of backlog and into active development soon. Like Tom says, handling connection logic based on field encryption is a huge pain, and like Tal says there's no good solution with this bug in place. I'm going to recommend we refactor, since developer time is more under our control and less firmly limited than the other options, but I hope this description illuminates to the devs why MongoDB users may have a perspective that doesn't include joins (and joins working with encryption) as a "convenience." |
| Comment by Tal Bar [ 24/Nov/20 ] |
|
Hello, any update on this? If this issue isn't going to be resolved in the near future, we are going to have to stop using CSFLE altogether. I have to say I'm surprised this issue isn't given a much higher priority.. it is a very serious bug IMHO. |
| Comment by Anu Madan [ 20/Oct/20 ] |
|
Hi @Asya, Yes, I understand it is complex if both collections require encryption. Thanks, |
| Comment by Tom Gabay [ 16/Oct/20 ] |
|
Hey @asya, |
| Comment by Asya Kamsky [ 16/Oct/20 ] |
|
anu.madan@idemia.com this is not a simple problem to address as a second collection requires knowing a different schema for encryption. $lookup is not really part of "basic" functionality in normal MongoDB querying - it's more a convenience added to allow you to avoid a second query in certain cases. I would say that if this is a blocker you should consider a different approach to querying the two collections (for instance performing two queries rather than doing a single aggregation with $lookup). |
| Comment by Anu Madan [ 15/Oct/20 ] |
|
Hi, Facing the same issue. Is there an expected date for the resolution of this issue? Thanks. |
| Comment by Tom Gabay [ 05/Aug/20 ] |
|
Hey, Also handling this issue, Just wanted to say i think its a big issue, in every real life scenario doing CSFLE require us to handle some complex logic to decide what connection to use, very un-natural way to use db connection |
| Comment by Asya Kamsky [ 04/Aug/20 ] |
|
This is a current limitation where using encrypted client assumes that it's necessary and doesn't allow $lookup into another collection. Current workaround would be to use a non-encrypted client for such aggregations. |