Description
With --smallfiles (and other options that reduce file sizes), Mongo will still allocate 16MB per database by default. For some use cases*, this is still way too big. By simply editing some hardcoded constants in the source code, I was able to reduce the default database size to a few kilobytes with no apparent loss of functionality. See:
https://github.com/kentonv/mongo/commit/14f391a000134e5d9d65bb14a6110e5a5b0be61d
Obviously, this patch is not suitable for merging. A real patch should be gated on a command-line flag. I would be happy to work on one, but would like direction from the Mongo team. Would a patch to add a new flag (--reallysmallfiles?) be accepted? Are there any glaring problems with the way I've approached this? How do you advise I move forward?
* My use case is sandstorm.io. Sandstorm is a platform for personal cloud apps. Each app instance runs in a sandbox, and the intent is that instances should be extremely fine-grained – e.g. when you use Etherpad on Sandstorm, each document runs in an independent sandbox with its own server process and database. We only run an instance's server while it is in use, which makes this level of granularity practical – as long as the storage footprint is appropriate.
Unsurprisingly, several Sandstorm apps use MongoDB. We ran into a problem where instances of these apps were taking up lots of disk space, even though each instance stored very little actual data. People would create a TODO list with ten items and it would end up being 16MB on disk – 10,000x what was actually needed.
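For anyone who wants to check the footprint on their own setup, here is a rough sketch of how the overhead could be measured. It is not part of the patch. The helper names (dir_size_bytes, overhead_ratio) and the example path are made up for illustration, and it reports apparent file sizes, which may differ from the blocks actually allocated on some filesystems.

```python
import os


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under `path`.

    Hypothetical helper: point it at an instance's dbpath directory.
    Symlinks are skipped so nothing is counted twice.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def overhead_ratio(on_disk_bytes, payload_bytes):
    """How many times larger the on-disk footprint is than the real data."""
    return on_disk_bytes / payload_bytes


if __name__ == "__main__":
    # Assumed example path; replace with the real dbpath of one instance.
    dbpath = "/var/lib/mongodb"
    if os.path.isdir(dbpath):
        print(dir_size_bytes(dbpath), "bytes on disk")
```

As a sanity check on the figure above, overhead_ratio(16 * 1024 * 1024, 1600) comes out a little over 10,000, i.e. a 16MB allocation holding about 1.6KB of real data.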
Obviously, this isn't the use case MongoDB was intended to cover. Mongo is for huMONGOus data sets. However, in practice there are many reasons to choose Mongo other than scalability, and this can result in Mongo being used on very small data sets. E.g. any app that chooses to use Meteor (an excellent choice for a Sandstorm app) will probably use Mongo just because they integrate well. The problem was corrected when we asked developers to substitute my patched mongod, but that's a fairly high-maintenance solution for us as more developers come online.