[SERVER-38264] Dynamic data masking at query time Created: 27/Nov/18 Updated: 03/Jun/21 Resolved: 05/Dec/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Kevin Ha | Assignee: | Eric Sedor |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: |
| Description |
|
OracleDB supports data redaction for data masking. Apache Ranger supports data masking for Hive. Is it possible to set up a filter function for a collection such that every query against this collection has its returned records passed through this function for dynamic data masking? This is an important feature when we need to prepare a data source for exploration by data scientists but cannot show them the sensitive data. Cloning the data for extra preprocessing for this purpose is too painful. |
| Comments |
| Comment by Paul Done [ 03/Jun/21 ] |
|
Also covered now in the Practical MongoDB Aggregations book in the chapter "Mask Sensitive Fields" at: https://www.practical-mongodb-aggregations.com/examples/intricate-examples/mask-sensitive-fields.html |
| Comment by Paul Done [ 13/Feb/21 ] |
|
In case it is useful, more examples of Data Masking in MongoDB are here: https://pauldone.blogspot.com/2021/02/mongdb-data-masking.html and https://github.com/pkdone/mongo-data-masking |
| Comment by George Mihailov [ 22/Jul/20 ] |
|
The new [$function|https://docs.mongodb.com/master/reference/operator/aggregation/function/] operator allows even more flexibility for this. |
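As a rough illustration of the $function approach mentioned above, here is a minimal Python sketch that only builds an aggregation pipeline document; it does not connect to a server. The field name `card_num`, the helper name `build_masking_pipeline`, and the "keep the last four characters" rule are assumptions made for this example, not anything from the ticket. $function needs MongoDB 4.4 or later with server-side JavaScript enabled.

```python
from typing import Any, Dict, List


def build_masking_pipeline(field: str, visible_suffix: int = 4) -> List[Dict[str, Any]]:
    """Build a one-stage pipeline that masks `field` with a $function expression.

    The field name and masking rule are hypothetical; adapt them to your schema.
    """
    # JavaScript executed by the server: replace all but the last `keep`
    # characters of a string with 'x'; non-string values pass through unchanged.
    js_body = (
        "function(value, keep) {"
        "  if (typeof value !== 'string') { return value; }"
        "  var cut = Math.max(value.length - keep, 0);"
        "  return 'x'.repeat(cut) + value.slice(cut);"
        "}"
    )
    return [
        {
            "$set": {
                field: {
                    "$function": {
                        "body": js_body,
                        "args": ["$" + field, visible_suffix],
                        "lang": "js",
                    }
                }
            }
        }
    ]


pipeline = build_masking_pipeline("card_num")
```

With a driver such as PyMongo, a list like this would typically be passed to `collection.aggregate(pipeline)`. Built-in aggregation operators are generally cheaper than server-side JavaScript, so $function is probably best kept for masking logic that the native operators cannot express.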
| Comment by Eric Sedor [ 28/Nov/18 ] |
|
Hi Kevin, thanks for your patience. MongoDB supports this sort of security in a few ways. The most straightforward might be creating a view that uses $project to omit sensitive fields. You can then grant your data scientist users read permissions on the view without granting read permissions on the backing collection. To explore this in more detail, we'd recommend the mongodb-user group or Stack Overflow with the mongodb tag; a question like this, which involves more discussion, would be best posted on the mongodb-user group. |
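The view-based approach described in this comment could look roughly like the following Python sketch. It only assembles the two database command documents (`create` with `viewOn`/`pipeline`, and `createRole`); the database, collection, view, role, and field names are all hypothetical, and nothing here is run against a server.

```python
from typing import Any, Dict, List


def build_masked_view_command(
    view_name: str, source_collection: str, hidden_fields: List[str]
) -> Dict[str, Any]:
    """Build a `create` command that defines a view omitting the given fields."""
    if not hidden_fields:
        raise ValueError("at least one field to hide is required")
    # An exclusion projection: every listed field is dropped from the output.
    projection = {field: 0 for field in hidden_fields}
    return {
        "create": view_name,
        "viewOn": source_collection,
        "pipeline": [{"$project": projection}],
    }


def build_view_reader_role_command(
    role_name: str, db_name: str, view_name: str
) -> Dict[str, Any]:
    """Build a `createRole` command that allows `find` on the view only."""
    return {
        "createRole": role_name,
        "privileges": [
            {
                "resource": {"db": db_name, "collection": view_name},
                "actions": ["find"],
            }
        ],
        "roles": [],  # inherits nothing, so the backing collection stays unreadable
    }


# Hypothetical names, used for illustration only.
view_cmd = build_masked_view_command("customers_masked", "customers", ["ssn", "card_num"])
role_cmd = build_view_reader_role_command("maskedReader", "sales", "customers_masked")
```

With PyMongo, each document would typically be sent with `db.command(...)` by a user who is allowed to create views and roles. The role is then granted to the data scientist accounts, which can query `customers_masked` but not `customers`.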
| Comment by Kevin Ha [ 27/Nov/18 ] |
|
Just got an idea:
MongoDB is open source. If I know which source files are responsible for touching the data right below the query layer, then maybe I can inject some code that checks the path/hierarchy of the fields a query requires and applies my data masking code, and then build my own customized version of MongoDB?
If the above idea works, I can set up this customized MongoDB instance as a member of a production MongoDB replica set and let the data scientists query only this instance, which has the data masking code applied.
|