Hi Taelen,
GridFS is a driver specification for storing and retrieving large files in MongoDB. While GridFS stores files in two collections (files metadata and binary chunks), the MongoDB server is generally unaware of the relationship between documents in these collections. TTL indexes only apply to a single collection and require a date field in order to find expired documents to remove. The current GridFS API only specifies an uploadDate field for the files collection; chunks do not have any date information.
To automate expiry of GridFS files within the current design, I would suggest writing an application or script that searches for expired GridFS files and removes them via the GridFS API. The script could be scheduled to run periodically via cron or similar, and would achieve the same outcome as a TTL index. An index on files.uploadDate should be added to support finding expired files.
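As a rough illustration of such an expiry script, here is a minimal sketch using PyMongo. The function names, the connection string, the database name "mydb", the default "fs" bucket and the 30-day retention period are all assumptions for the example rather than anything GridFS prescribes, so adjust them to your deployment.

```python
from datetime import datetime, timedelta, timezone


def remove_expired_files(files_coll, bucket, max_age, now=None):
    """Delete every GridFS file uploaded more than max_age ago; return the count."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    # Collect the ids first so deletions do not interfere with an open cursor.
    expired_ids = [
        doc["_id"]
        for doc in files_coll.find({"uploadDate": {"$lt": cutoff}}, {"_id": 1})
    ]
    for file_id in expired_ids:
        # GridFSBucket.delete removes the files document and all of its chunks.
        bucket.delete(file_id)
    return len(expired_ids)


def main():
    # Assumed deployment details: local mongod, database "mydb", bucket "fs".
    import gridfs
    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017")["mydb"]
    files_coll = db["fs.files"]
    files_coll.create_index("uploadDate")  # supports the expiry query
    bucket = gridfs.GridFSBucket(db, bucket_name="fs")
    removed = remove_expired_files(files_coll, bucket, timedelta(days=30))
    print(f"removed {removed} expired GridFS file(s)")
```

Calling main() from a script scheduled with cron would run one expiry pass. If several copies of the script might run at once, a file may already be gone by the time delete is called, so you may want to catch the driver's "no such file" error around that call.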
Alternatively, you could copy the uploadDate field from the files collection to the associated chunks after uploading a new file. This would allow TTL indexes to be set on both the files and chunks collections, but adds extra overhead (two new TTL indexes and an extra field on every chunk document) compared with removing files via the GridFS API. The two TTL indexes would also operate independently, so this approach may result in errors if a GridFS file is read close to its expiry and some of its chunks have already been deleted.
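For comparison, the TTL-based alternative might look roughly like the sketch below. The helper names create_ttl_indexes and stamp_chunks are hypothetical, and db is assumed to be a PyMongo Database using the default "fs" bucket; the chunks are matched through the files_id field that GridFS stores on each chunk document.

```python
def create_ttl_indexes(db, ttl_seconds, bucket_name="fs"):
    """Create one TTL index on uploadDate in each of the two GridFS collections."""
    for suffix in (".files", ".chunks"):
        db[bucket_name + suffix].create_index(
            "uploadDate", expireAfterSeconds=ttl_seconds
        )


def stamp_chunks(db, file_id, bucket_name="fs"):
    """Copy a file's uploadDate onto its chunks; return the number of chunks updated."""
    files = db[bucket_name + ".files"]
    chunks = db[bucket_name + ".chunks"]
    file_doc = files.find_one({"_id": file_id}, {"uploadDate": 1})
    if file_doc is None:
        raise LookupError(f"no GridFS file with _id {file_id!r}")
    # Chunks without a date field are never removed by the TTL index,
    # so every chunk of the file needs the copied uploadDate.
    result = chunks.update_many(
        {"files_id": file_id},
        {"$set": {"uploadDate": file_doc["uploadDate"]}},
    )
    return result.modified_count
```

In this sketch stamp_chunks would need to be called after every upload, which is part of the extra overhead mentioned above. A file whose chunks are never stamped would lose its files document on expiry while its chunks remain.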
Regards,
Stephen