[SERVER-50644] use non-ephemeral key to encrypt temporary files for the Sorter
Created: 31/Aug/20  Updated: 29/Oct/23  Resolved: 20/Jan/21

Status: Closed
Project: Core Server
Component/s: Security
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Improvement
Priority: Minor - P4
Reporter: Benety Goh
Assignee: Varun Ravichandran
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-50479 add resumable index build support for... Closed
Related
is related to SERVER-42345 Use internal temporary collection to ... Backlog
Backwards Compatibility: Fully Compatible
Sprint: Security 2020-09-21, Security 2020-10-05, Security 2021-01-11, Security 2021-01-25
Participants:

 Description   

The Sorter encrypts the temporary files in dbpath/_tmp using a key that is not guaranteed to remain stable across process restarts. If it is feasible to use a persistent key to encrypt these temporary files, it would allow us to re-use these temporary files during the recovery process at startup.
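
The effect described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Process` class, the in-memory `keystore` dict, the `"sorter"` key name, and the toy `keystream_xor` cipher (a stand-in for real encryption, not secure) are hypothetical and do not reflect the server's actual implementation.

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream.

    Stand-in for real encryption; NOT secure, for illustration only.
    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class Process:
    """Hypothetical model of one server process lifetime."""

    def __init__(self, keystore: dict):
        # Ephemeral key: regenerated on every start, never persisted.
        self.ephemeral_key = os.urandom(32)
        # Persistent key: created once, then reloaded from the keystore.
        self.persistent_key = keystore.setdefault("sorter", os.urandom(32))


keystore = {}  # survives simulated restarts in this sketch
spilled = b"sorted index keys spilled to dbpath/_tmp"

before = Process(keystore)
file_ephemeral = keystream_xor(before.ephemeral_key, spilled)
file_persistent = keystream_xor(before.persistent_key, spilled)

after = Process(keystore)  # simulated restart
recovered_ephemeral = keystream_xor(after.ephemeral_key, file_ephemeral)
recovered_persistent = keystream_xor(after.persistent_key, file_persistent)
```

In this sketch only the file written with the persisted key can be read back after the simulated restart, which is the property the ticket asks for.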



 Comments   
Comment by Githook User [ 20/Jan/21 ]

Author: Varun Ravichandran <varun.ravichandran@mongodb.com> (varunravi98)

Message: SERVER-50644, SERVER-50479: Add resumable index build support for ESE by using persistent key for Sorter temp file encryption
Branch: master
https://github.com/mongodb/mongo/commit/45a54bbac81ff1146f307afb2d04c94c694a1163

Comment by Githook User [ 20/Jan/21 ]

Author: Varun Ravichandran <varun.ravichandran@mongodb.com> (varunravi98)

Message: SERVER-50644, SERVER-50479: Add resumable index build support for ESE by using persistent key for Sorter temp file encryption
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/9c35b6ea220aac23614a688da7f1ffbe85d7e637

Comment by Benety Goh [ 17/Sep/20 ]

spencer.jackson, collections cannot currently move across databases without getting a new collection UUID, so using the database name should be fine.

david.percy, I wasn't aware of SERVER-42345 until you mentioned it. In this context, we're mostly working with the existing file format specific to the Sorter.

Comment by Spencer Jackson [ 17/Sep/20 ]

benety.goh, ESE's keystore predates collection UUIDs, so its persisted data structures map each database name to the AES key used for all collections in that database. Reaping keys is also hard, and currently requires an initial sync, so I'd prefer to avoid creating too many durable but briefly used keys. Would it be possible, at some point before or during the life of the SortedFileWriter, to transform the collection's UUID into the name of the database that contains it?
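
A minimal sketch of the lookup asked about above, under stated assumptions: the `Catalog` class, its methods, and `key_for_collection` are hypothetical names invented for illustration, and the per-database keys are modeled as a plain dict rather than the real keystore.

```python
import uuid


class Catalog:
    """Hypothetical collection catalog: UUID -> 'db.collection' namespace."""

    def __init__(self):
        self._by_uuid = {}

    def register(self, namespace: str) -> uuid.UUID:
        coll_uuid = uuid.uuid4()
        self._by_uuid[coll_uuid] = namespace
        return coll_uuid

    def database_for(self, coll_uuid: uuid.UUID) -> str:
        # The database name is everything before the first dot.
        namespace = self._by_uuid[coll_uuid]
        return namespace.partition(".")[0]


def key_for_collection(catalog: Catalog, db_keys: dict, coll_uuid: uuid.UUID) -> bytes:
    """Resolve the UUID to its database, then reuse that database's key."""
    return db_keys[catalog.database_for(coll_uuid)]
```

Because the key is looked up by database name, no new durable key is created per collection or per index build in this sketch.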

Comment by Benety Goh [ 17/Sep/20 ]

spencer.jackson, I would also lean towards the more desirable solution. The collection UUID should always be available at the start of the index build to plumb into the SortedFileWriter. This collection UUID would be preferable to a namespace string or database name. Non-trivial index builds, which include those eligible for restoration after a reboot, will also have access to a UUID that identifies the original createIndex request across the replica set.

Comment by Spencer Jackson [ 15/Sep/20 ]

I think there are a couple of ways to approach this.

The easy but less ideal solution would be to reserve a single non-ephemeral key for spilled index builds. No additional state would need to be plumbed, but every index build's spilled files would wind up being encrypted with the same keys. This would be less than ideal in multi-tenant environments, where customers prefer to have unique keys.

An alternative with more desirable runtime behaviour would require plumbing the namespace that the index is being built against down to the Sorter, where we encrypt and decrypt the spilled data. If we did this, we could use the per-database encryption key to protect the spilled index data. This is desirable because that key is already protecting the at-rest data, and will only be used for a single customer in multi-tenant environments.
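
The two options above can be contrasted in a short sketch. All names here (`Keystore`, `get_or_create`, `RESERVED_SORTER_KEY`, `select_shared_key`, `select_database_key`) are hypothetical and exist only for illustration; they are not the encryption subsystem's actual interface.

```python
import os


class Keystore:
    """Hypothetical keystore mapping a key name to key bytes."""

    def __init__(self):
        self._keys = {}

    def get_or_create(self, name: str) -> bytes:
        if name not in self._keys:
            self._keys[name] = os.urandom(32)
        return self._keys[name]


RESERVED_SORTER_KEY = "__sorter_spill__"  # hypothetical reserved key name


def select_shared_key(keystore: Keystore) -> bytes:
    """Option 1: one reserved key for every spilled index build."""
    return keystore.get_or_create(RESERVED_SORTER_KEY)


def select_database_key(keystore: Keystore, namespace: str) -> bytes:
    """Option 2: reuse the key of the database named in 'db.collection'."""
    db_name, _, _ = namespace.partition(".")
    return keystore.get_or_create(db_name)
```

With the second function, spilled files for `tenantA.orders` and `tenantB.orders` would be protected by different keys, whereas the first function gives every build the same key.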

benety.goh, I haven't interacted with the SortedFileWriter recently, so I'd like to check with you: Would it be objectionable to plumb a namespace string or database name down into it, so that it could provide it to the encryption subsystem for key selection? Would every SortedFileWriter be able to have access to a namespace string? Or are there some contexts, like restoration after a reboot, where the name wouldn't be available? Are there some situations where the SortedFileWriter would be expected to not know the namespace it's operating against?

Comment by Ian Whalen (Inactive) [ 31/Aug/20 ]

During investigation: see whether this is something we can easily accommodate using persisted keys in the keystore.

Comment by Benety Goh [ 31/Aug/20 ]

This ticket is motivated by an issue we found while testing the resumable index project. Reproduction steps can be found in SERVER-50479.
