Version 2.9.3
Backups in Java

WiredTiger cursors provide access to data from a variety of sources. One of these sources is the list of files required to perform a backup of the database. The list may be the files required by all of the objects in the database, or a subset of the objects in the database.

WiredTiger backups are "on-line" or "hot" backups, and applications may continue to read and write the databases while a snapshot is taken.

Backup from an application

  1. Open a cursor on the "backup:" data source, which begins the process of a backup.
  2. Copy each file returned by the Cursor.next method to the backup location, for example, a different directory. Do not reuse backup locations unless all files have first been removed from them, in other words, remove any previous backup information before using a backup location.
  3. Close the cursor; the cursor must not be closed until all of the files have been copied.

The directory into which the files are copied may subsequently be specified as an directory to the wiredtiger.open function and accessed as a WiredTiger database home.

Copying the database files for a backup does not require any special alignment or block size (specifically, Linux or Windows filesystems that do not support read/write isolation can be safely read for backups).

The database file may grow in size during the copy, and the file copy should not consider that an error. Blocks appended to the file after the copy starts can be safely ignored, that is, it is correct for the copy to determine an initial size of the file and then copy that many bytes, ignoring any bytes appended after the backup cursor was opened.

The cursor must not be closed until all of the files have been copied, however, there is no requirement the files be copied in any order or in any relationship to the Cursor.next calls, only that all files have been copied before the cursor is closed. For example, applications might aggregate the file names from the cursor and then list the file names as arguments to a file archiver such as the system tar utility.

During the period the backup cursor is open, database checkpoints can be created, but no checkpoints can be deleted. This may result in significant file growth.

Additionally, if a crash occurs during the period the backup cursor is open and logging is disabled, then the system will be restored to the most recent checkpoint prior to the opening of the backup cursor, even if later database checkpoints were created.

The following is a programmatic example of creating a backup:

Cursor cursor;
String filename;
int ret = 0;
String databasedir = "/path/database";
String backdir = "/path/database.backup";
final String sep = File.separator;
try {
/* Create the backup directory. */
if (!(new File(backdir)).mkdir()) {
System.err.println(progname + ": cannot create backup dir: " +
backdir);
return false;
}
/* Open the backup data source. */
cursor = session.open_cursor("backup:", null, null);
/* Copy the list of files. */
while ((ret = cursor.next()) == 0 &&
(filename = cursor.getKeyString()) != null) {
String src = databasedir + sep + filename;
String dest = backdir + sep + filename;
java.nio.file.Files.copy(
new java.io.File(src).toPath(),
new java.io.File(dest).toPath(),
java.nio.file.StandardCopyOption.REPLACE_EXISTING,
java.nio.file.StandardCopyOption.COPY_ATTRIBUTES);
}
if (ret == wiredtiger.WT_NOTFOUND)
ret = 0;
if (ret != 0)
System.err.println(progname +
": cursor next(backup:) failed: " +
wiredtiger.wiredtiger_strerror(ret));
ret = cursor.close();
}
catch (Exception ex) {
System.err.println(progname +
": backup failed: " + ex.toString());
}

In cases where the backup is desired for a checkpoint other than the most recent, applications can discard all checkpoints subsequent to the checkpoint they want using the Session.checkpoint method. For example:

ret = session.checkpoint("drop=(from=June01),name=June01");

Backup from the command line

The wt backup command may also be used to create backups:

rm -rf /path/database.backup &&
mkdir /path/database.backup &&
wt -h /path/database.source backup /path/database.backup

Incremental backup

Once a backup has been done, it can be rolled forward incrementally by adding log files to the backup copy. Adding log files to the copy decreases potential data loss from switching to the copy, but increases the recovery time necessary to switch to the copy. To reset the recovery time necessary to switch to the copy, perform a full backup of the database. For example, an application might do a full backup of the database once a week during a quiet period, and then incrementally copy new log files into the backup directory for the rest of the week. Incremental backups may also save time when the tables are very large.

Bulk-loads are not commit-level durable, that is, the creation and bulk-load of an object will not appear in the database log files. For this reason, applications doing incremental backups after a full backup should repeat the full backup step after doing a bulk-load to make the bulk-load durable. In addition, incremental backups after a bulk-load can cause recovery to report errors because there are log records that apply to data files which don't appear in the backup.

By default, WiredTiger automatically removes log files no longer required for recovery. Applications wanting to use log files for incremental backup must first disable automatic log file removal using the log=(archive=false) configuration to wiredtiger.open.

The following is the procedure for incrementally backing up a database and removing log files from the original database home:

  1. Perform a full backup of the database (as described above).
  2. Open a cursor on the "backup:" data source, configured with the "target=(\"log:\")" target specified, which begins the process of an incremental backup.
  3. Copy each log file returned by the Cursor.next method to the backup directory. It is not an error to copy a log file which has been copied before, but care should be taken to ensure each log file is completely copied as the most recent log file may grow in size while being copied.
  4. If all log files have been successfully copied, archive the log files by calling the Session.truncate method with the URI log: and specifying the backup cursor as the start cursor to that method. (Note there is no requirement backups be coordinated with database checkpoints, however, an incremental backup will repeatedly copy the same files, and will not make additional log files available for archival, unless there was a checkpoint after the previous incremental backup.)
  5. Close the backup cursor.

Steps 2-5 can be repeated any number of times before step 1 is repeated. Full and incremental backups may be repeated as long as the backup database directory has not been opened and recovery run. Once recovery has run in a backup directory, you can no longer back up to that database directory.

An example of opening the backup data source for an incremental backup:

/* Open the backup data source for incremental backup. */
cursor = session.open_cursor("backup:", null, "target=(\"log:\")");

Backup and O_DIRECT

Many Linux systems do not support mixing O_DIRECT and memory mapping or normal I/O to the same file. If O_DIRECT is configured for data or log files on Linux systems (using the wiredtiger_open direct_io configuration), any program used to copy files during backup should also specify O_DIRECT when configuring its file access. Likewise, when O_DIRECT is not configured by the database application, programs copying files should not configure O_DIRECT.