[~michaelcahill] raised a point about incremental backups.
By telling the application it can:
- get a list of log files in the home directory
- copy that list to the backup directory
- do a checkpoint in the home directory
- remove all but the last log file in the list from the home directory
we're creating a backup strategy that a future feature might break.
For example, if we were to write uncommitted changes to the log so changes can be larger than memory, or if information about a single checkpoint could physically appear in more than one log file, we might not be able to remove all of those log files.
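To make the fragile assumption concrete, here is a minimal sketch of the four application-side steps listed above. It is plain file handling, not WiredTiger code: the `checkpoint` callable is a stand-in supplied by the caller, and the `WiredTigerLog.` prefix with zero-padded numbering (so a name sort gives oldest-first order) is an assumption of the sketch.

```python
import os
import shutil


def manual_incremental_backup(home, backup, checkpoint, log_prefix="WiredTigerLog."):
    """Copy the current log files to backup, checkpoint, then prune old logs from home.

    `checkpoint` is a caller-supplied stand-in for taking a checkpoint;
    `log_prefix` is an assumed naming convention for log files.
    """
    # Step 1: get the list of log files in the home directory, oldest first.
    logs = sorted(f for f in os.listdir(home) if f.startswith(log_prefix))

    # Step 2: copy that list to the backup directory.
    os.makedirs(backup, exist_ok=True)
    for name in logs:
        shutil.copy2(os.path.join(home, name), os.path.join(backup, name))

    # Step 3: do a checkpoint in the home directory.
    checkpoint()

    # Step 4: remove all but the last log file in the list from home.
    # This is the fragile step: the application assumes that nothing
    # before the last file is still needed after the checkpoint.
    for name in logs[:-1]:
        os.remove(os.path.join(home, name))

    return logs
```

Step 4 is where the application bakes in knowledge of which files are safe to delete, and that is exactly the knowledge the scenarios above would invalidate.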
A possible solution would be to add a new log_archive method to discard log files that are no longer needed from the home directory. @sueloverso, is that an easy change to make?
Second, if the only part of an incremental backup still done outside of WiredTiger is copying the log files from the home directory to the backup directory, it might be cleaner to add a flag/config/whatever to a backup cursor so that it only returns the current list of log files. That way the backup process becomes:
1. open a backup cursor, copy data and log files for a full backup
2. do a checkpoint
3. open a backup-incremental cursor, copy the log files for an incremental backup
4. call log_archive to remove log files
and, as usual, steps 2-4 can be repeated for incrementals.
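As a rough illustration of how an application might drive that sequence, here is a sketch written against a purely hypothetical `engine` object. None of the four method names below is an existing API; they only stand in for the full backup cursor, the proposed backup-incremental cursor, a checkpoint, and the proposed log_archive call. `copy_file` is also a caller-supplied placeholder.

```python
def full_then_incrementals(engine, copy_file, incrementals=1):
    """Run one full backup followed by `incrementals` incremental backups.

    Assumed (hypothetical) engine methods:
      full_backup_files()     -> names of data and log files for a full backup
      incremental_log_files() -> names of the current log files only
      checkpoint()            -> take a checkpoint
      log_archive()           -> discard log files that are no longer needed
    """
    # Step 1: full backup copies both data and log files.
    for name in engine.full_backup_files():
        copy_file(name)

    for _ in range(incrementals):
        # Step 2: do a checkpoint.
        engine.checkpoint()

        # Step 3: the incremental backup copies only the log files.
        for name in engine.incremental_log_files():
            copy_file(name)

        # Step 4: the engine, not the application, decides what is removed.
        engine.log_archive()
```

The point of the shape is that the application never deletes files itself, so a later change to what the log contains would only need to change what log_archive does.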
We could also track the log files that have already been copied, but that would have to somehow be associated with calls to open backup cursors, which is a little confusing.
Finally, I'm not 100% sure of our story with respect to crash recovery and incrementals. Suppose we:
1. do a full backup,
2. do some number of incremental backups,
3. crash and run recovery on the home directory,
4. do some number of incremental backups.
Can someone confirm we don't get into trouble there, given that we copy the state of the log files before recovery is performed and then continue to copy them after recovery is performed?