  1. Core Server
  2. SERVER-3995

Allow online backups with no configuration

    • Type: New Feature
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.0.0
    • Component/s: Storage
    • Labels:
      None
    • Environment:
      Windows, but perhaps Linux also

      I apologize if this is a duplicate request.

      According to Kristina and Michael's book "MongoDB: The Definitive Guide" (pages 121-125), creating a backup of a live database requires taking some MongoDB-specific steps. The server could be shut down, which will allow a point-in-time backup, or the backup could be made on a slave, or fsync could be used to stop all writes while the backup is in progress.
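
      As a rough illustration of the fsync option mentioned above, here is a minimal Python sketch of "lock, copy, unlock". It assumes a client object that exposes `admin.command(...)` the way a pymongo `MongoClient` does, and it assumes the `fsync` (with `lock=True`) and `fsyncUnlock` command names found on more recent servers; older servers unlocked differently. `RecordingClient` and `RecordingAdmin` are hypothetical stand-ins so the sketch runs without a server.

```python
from contextlib import contextmanager


@contextmanager
def writes_locked(client):
    """Flush pending writes to disk and block new ones while the body runs.

    `client` is assumed to expose `admin.command(...)` like a pymongo
    MongoClient; the command names assume a reasonably recent server.
    """
    client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the backup step raised.
        client.admin.command("fsyncUnlock")


class RecordingAdmin:
    """Hypothetical stand-in for `client.admin` so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    def command(self, name, **options):
        # Record the command instead of sending it to a server.
        self.calls.append((name, options))
        return {"ok": 1}


class RecordingClient:
    """Hypothetical stand-in for a database client."""

    def __init__(self):
        self.admin = RecordingAdmin()


def demo():
    client = RecordingClient()
    with writes_locked(client):
        # A real backup would copy the data files here.
        pass
    return client.admin.calls
```

      The point of the sketch is the cost this request is trying to avoid: writes stay blocked for as long as the copy inside the `with` block takes.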

      In Windows, there is a better way to provide for online backups. Since Windows XP, Windows has included a feature called Volume Shadow Copy. This is supported by Windows Backup and by commercial third-party backup programs. SQL Server uses this feature to allow active databases to be backed up. From a little reading, I'm not sure if Linux's LVM is similar, so I'll leave that to more knowledgeable people.

      With Volume Shadow Copy, there is a bit of handshaking between participants and a file system class driver that does the heavy lifting. When a backup program wants to back up a disk, it first calls all of the programs that are registered as "writers". It asks each of these programs to flush its data to disk so that what is on disk is in a perfectly consistent state. The participating programs then temporarily stop writing to the disk. Once all registered programs have signaled that they are ready, the backup program "creates a shadow copy" of the disk. This does some checkpointing, but mostly it sets a flag in the Volume Shadow Copy driver to tell it that it is now in charge of disk writes. Once the shadow copy driver is prepared, everyone is told that they can go back to writing to the disk as if no backup were running. Every disk write that they make still goes to the disk, but a copy of the old data is made first. That performance hit occurs only once for each sector that is affected. The backup can then proceed, taking however long it takes, and the data it stores will be whatever Mongo had put on the disk at the moment it was told that the backup was beginning.

      Since MongoDB already has the fsync capability, the main work will be in adding some COM/XML code to communicate with the Volume Shadow Copy service. This isn't trivial, but it may not be more than a few weeks' work. The huge advantage of adding a feature like this is that it lowers the adoption bar for IT shops to add Mongo to their stable of databases. It meets the goal of being a self-configuring database and "does the right thing" by default with no user intervention.
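
      To make the copy-on-write behavior described above concrete, here is a toy Python model. It is only a sketch of the idea, not the Volume Shadow Copy API: `Volume`, `ShadowCopy`, and their methods are hypothetical names, and a "sector" is just a list element.

```python
class ShadowCopy:
    """Point-in-time view of a Volume, filled in lazily by copy-on-write."""

    def __init__(self, volume):
        self._volume = volume
        self._saved = {}      # sector index -> contents at snapshot time
        self.copies_made = 0  # how many sectors had to be preserved

    def preserve(self, index):
        # Copy the old data only the first time a sector is overwritten.
        if index not in self._saved:
            self._saved[index] = self._volume.sectors[index]
            self.copies_made += 1

    def read(self, index):
        # Unmodified sectors are read straight from the live volume.
        return self._saved.get(index, self._volume.sectors[index])

    def read_all(self):
        return [self.read(i) for i in range(len(self._volume.sectors))]


class Volume:
    """Toy disk: a list of sector contents plus an optional shadow copy."""

    def __init__(self, sectors):
        self.sectors = list(sectors)
        self.shadow = None

    def create_shadow_copy(self):
        # Writers are assumed to have flushed and paused before this call.
        self.shadow = ShadowCopy(self)
        return self.shadow

    def write(self, index, data):
        # Writes still go to the disk, but the old data is saved first.
        if self.shadow is not None:
            self.shadow.preserve(index)
        self.sectors[index] = data


def demo():
    volume = Volume(["a0", "b0", "c0"])
    shadow = volume.create_shadow_copy()
    volume.write(1, "b1")
    volume.write(1, "b2")  # same sector again: no second copy is made
    volume.write(2, "c1")
    return volume.sectors, shadow.read_all(), shadow.copies_made
```

      In `demo()`, the live volume ends up holding the new data while the shadow copy still reads back the original three sectors, and only two sectors were ever copied even though three writes occurred.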

      Adding features to the Windows version without adding the same features to the Linux/Mac version may go against the grain; I don't know what the thinking is. Perhaps LVM or another Linux feature or project has similar functionality. But I know that there is an opportunity to be a bit more IT-friendly in the Windows version, so it seems worth suggesting.

            Assignee: Unassigned
            Reporter: Tad Marshall
            Votes: 0
            Watchers: 2

              Created:
              Updated:
              Resolved: