Version 2.4.1
Commit-level durability in Java

WiredTiger supports checkpoint durability by default, and optionally commit-level durability when logging is enabled. In most applications, commit-level durability impacts performance more than checkpoint durability; checkpoints offer basic operation durability across application or system failure without impacting performance (although the creation of each checkpoint is a relatively heavy-weight operation). See Checkpoint durability in Java for information on checkpoint durability.

Commit-level durability is implemented using a write-ahead log and is enabled using the log=(enabled) configuration to wiredtiger.open. When logging is enabled, WiredTiger writes records to the log for each transaction.

Transactions define which updates are made durable together; see Transactions in Java for details. By default, log records are flushed to disk by Session.commit_transaction, ensuring that, once Session.commit_transaction returns successfully, updates performed by the transaction will be included in the database state regardless of application or system failure.

When the transactional log is enabled, calling wiredtiger.open automatically performs a recovery step when opening the database. This step applies whatever changes from the log are required to bring the database up to date with the most recent transactional state. Recovery may require that extensions be available when it runs (for example, collators and compression). Therefore, applications doing recovery must configure extensions with the extensions keyword to wiredtiger.open consistently whenever re-opening the database.

Recovery is required after the failure of any thread of control in the application, where the failed thread might have been executing inside of the WiredTiger library or open WiredTiger handles have been lost. In most applications, if any thread of control exits unexpectedly, the application will close and re-open the database.

Checkpoints

Checkpoints of the database should still be performed periodically when commit-level durability is configured, either explicitly from the application or periodically based on elapsed time or data size with the checkpoint configuration to wiredtiger.open.

Database checkpoints are necessary for two reasons: First, log files can only be archived after a checkpoint completes, and so the frequency of checkpoints determines the disk space required by log files. Second, checkpoints bound the time required for recovery to complete after application or system failure by limiting the log records that need to be processed.
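The second reason can be illustrated with a toy model of write-ahead-log replay. This is a minimal sketch, not WiredTiger's implementation: the RecoverySketch class, the LogRecord type and the use of a plain integer sequence number are all assumptions made for illustration. Recovery starts from the checkpointed state and processes only the records written after the checkpoint, so a more recent checkpoint leaves fewer records to replay.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of write-ahead-log recovery; not WiredTiger's implementation.
final class RecoverySketch {

    // Hypothetical log record: a sequence number plus one key/value update.
    static final class LogRecord {
        final long lsn;
        final String key;
        final String value;

        LogRecord(long lsn, String key, String value) {
            this.lsn = lsn;
            this.key = key;
            this.value = value;
        }
    }

    // Count the records recovery has to process: those newer than the checkpoint.
    static int recordsToReplay(long checkpointLsn, List<LogRecord> log) {
        int count = 0;
        for (LogRecord r : log) {
            if (r.lsn > checkpointLsn) {
                count++;
            }
        }
        return count;
    }

    // Start from the checkpointed state and re-apply newer records.
    // The log is assumed to be ordered by sequence number.
    static Map<String, String> recover(Map<String, String> checkpointState,
                                       long checkpointLsn,
                                       List<LogRecord> log) {
        Map<String, String> state = new HashMap<>(checkpointState);
        for (LogRecord r : log) {
            if (r.lsn > checkpointLsn) {
                state.put(r.key, r.value);
            }
        }
        return state;
    }
}
```

In this model, a checkpoint taken at sequence number 2 of a three-record log leaves one record to replay, while a database that was never checkpointed has to replay all three.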

Backups

With logging enabled, partial backups (backups where not all of the database objects are copied) may result in error messages during recovery, because data files referenced in the logs might not be found. Applications should either copy all objects and log files if commit-level durability of the copied database is required, or alternatively, copy only selected objects when backing up and not copy log files at all, then fall back to checkpoint durability when switching to the backup.

Bulk loads

Bulk-loads are not commit-level durable; that is, the creation and bulk-load of an object will not appear in the database log files. For this reason, applications doing incremental backups after a full backup should repeat the full backup step after doing a bulk-load to make the bulk-load durable. In addition, incremental backups after a bulk-load can cause recovery to report errors, because the logs contain records that apply to data files which don't appear in the backup.

Log file archival

WiredTiger log files are named "WiredTigerLog.[number]" where "[number]" is a 10-digit value, for example, "WiredTigerLog.0000000001". The log file with the largest number in its name is the most recent log file written. The log file size can be set using the log configuration to wiredtiger.open.
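The naming scheme can be handled with a few lines of Java. The following is a minimal sketch for applications that inspect the database directory themselves; the LogFileNames class and its method names are hypothetical helpers, not part of the WiredTiger API.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for the "WiredTigerLog.[number]" naming scheme.
final class LogFileNames {
    static final String PREFIX = "WiredTigerLog.";
    static final int DIGITS = 10;

    // Build the file name for a log number, zero-padded to 10 digits.
    static String nameFor(long number) {
        return String.format(Locale.ROOT, "%s%010d", PREFIX, number);
    }

    // Return the log number encoded in a file name, or -1 if it is not a log file name.
    static long numberOf(String fileName) {
        if (!fileName.startsWith(PREFIX) || fileName.length() != PREFIX.length() + DIGITS) {
            return -1;
        }
        long n = 0;
        for (int i = PREFIX.length(); i < fileName.length(); i++) {
            char c = fileName.charAt(i);
            if (c < '0' || c > '9') {
                return -1;
            }
            n = n * 10 + (c - '0');
        }
        return n;
    }

    // The most recent log file is the one with the largest number; other files are ignored.
    static Optional<String> mostRecent(List<String> fileNames) {
        return fileNames.stream()
                .filter(name -> numberOf(name) >= 0)
                .max(Comparator.comparingLong(LogFileNames::numberOf));
    }
}
```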

By default, WiredTiger automatically removes log files no longer required for recovery. Applications wanting to archive log files instead must disable log file removal using the log=(archive=false) configuration to wiredtiger.open.

Log files may be removed or archived after a checkpoint has completed, as long as there's not a backup in progress. Immediately after the checkpoint has completed, only the most recent log file is needed for recovery, and all other log files can be removed or archived. Note that there must always be at least one log file for the database.
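For applications that remove or archive log files manually, the rule above can be sketched as follows. This is an illustrative helper under stated assumptions, not WiredTiger code: the LogRemovalSketch class is hypothetical, the caller is assumed to invoke it immediately after a checkpoint has completed, and the input list is assumed to contain only log file names. Because the numeric suffix has a fixed width, sorting the names as strings also sorts them by log number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: which log files may be removed or archived right after a checkpoint.
final class LogRemovalSketch {

    static List<String> removableAfterCheckpoint(List<String> logFileNames,
                                                 boolean backupInProgress) {
        // Nothing may be removed during a backup, and at least one log file must remain.
        if (backupInProgress || logFileNames.size() <= 1) {
            return Collections.emptyList();
        }
        List<String> sorted = new ArrayList<>(logFileNames);
        Collections.sort(sorted);
        // Keep the most recent (last) file; everything before it is a candidate.
        return new ArrayList<>(sorted.subList(0, sorted.size() - 1));
    }
}
```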

Open log cursors prevent WiredTiger from automatically removing log files. Therefore, we recommend proactively closing log cursors when done with them. Applications manually removing log files should take care that no log cursors are open when removing files, or errors may occur when trying to read a log record in a file that was removed.

Tuning commit-level durability

Group commit

WiredTiger automatically groups the flush operations for threads that commit concurrently into single calls. This usually means multi-threaded workloads will achieve higher throughput than single-threaded workloads because the operating system can flush data more efficiently to the disk. No application-level configuration is required for this feature.
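The idea behind group commit can be shown with a small model. This is a sketch of the general technique under simplifying assumptions, not WiredTiger's internal implementation: the GroupCommitSketch class, its sequence numbers and its flush counter are all hypothetical. Each committing thread appends a record and then asks for it to be made durable; a single flush covers every record appended so far, so threads whose records were covered by another thread's flush do not issue their own.

```java
// Toy model of group commit; not WiredTiger's implementation.
final class GroupCommitSketch {
    private long appendedLsn = 0; // highest sequence number appended to the log
    private long flushedLsn = 0;  // highest sequence number covered by a flush
    private int flushCallCount = 0;

    // Append a commit record and return its sequence number.
    synchronized long append() {
        return ++appendedLsn;
    }

    // Make the record with the given sequence number durable.
    // If an earlier flush already covered it, no new flush is issued.
    synchronized void flushUpTo(long lsn) {
        if (lsn <= flushedLsn) {
            return;
        }
        flushCallCount++;         // stand-in for one operating system flush call
        flushedLsn = appendedLsn; // one flush covers every record appended so far
    }

    synchronized int flushCallCount() {
        return flushCallCount;
    }

    synchronized long flushedLsn() {
        return flushedLsn;
    }
}
```

If three transactions append their records before any of them flushes, the first call to flushUpTo performs the only flush and the other two return immediately.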

Flush call configuration

By default, log records are flushed to disk before Session.commit_transaction returns, ensuring durability at the commit. However, the durability guarantees can be relaxed to increase performance.

If transaction_sync=(enabled=false) is configured to wiredtiger.open, log records will be buffered in memory and only flushed to disk by checkpoints or by calls to Session.commit_transaction with sync=true. (Note that any call to Session.commit_transaction with sync=true will flush the log records for all committed transactions, not just the transaction where the configuration is set.) This provides the weakest durability guarantees of the logging configurations, but will be significantly faster than the others.

If transaction_sync=(enabled=true), transaction_sync=(method) further configures the method used to flush log records to disk. By default, the configured value is fsync, which calls the operating system's fsync call (or fdatasync if available) as each commit completes.

If the value is set to dsync instead, the O_DSYNC or O_SYNC flag to the operating system's open call will be specified when the file is opened. (The durability guarantees of the fsync and dsync configurations are the same, and in our experience the open flags are slower; this configuration is only included for systems where that may not be the case.)

Finally, if the value is set to none, commit will call the operating system's write call before returning, but will not flush the write.
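The difference between the three methods can be illustrated with rough JDK analogues. The sketch below is not what WiredTiger does internally and does not use the WiredTiger API: FlushMethodSketch and its Method enum are hypothetical, FileChannel.force(false) stands in for an fsync or fdatasync call, and StandardOpenOption.DSYNC stands in for opening the file with a synchronous-write flag.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.EnumSet;
import java.util.Set;

// JDK analogues of the three flush methods; an illustration only.
final class FlushMethodSketch {
    enum Method { NONE, FSYNC, DSYNC }

    // Append one record using the given method and return the file size afterwards.
    static long appendRecord(Path file, String record, Method method) throws IOException {
        Set<StandardOpenOption> options = EnumSet.of(
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        if (method == Method.DSYNC) {
            // Every write is synchronous because of the flag given at open time.
            options.add(StandardOpenOption.DSYNC);
        }
        try (FileChannel channel = FileChannel.open(file, options)) {
            ByteBuffer buffer = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer); // NONE stops here: written, but not flushed
            }
            if (method == Method.FSYNC) {
                channel.force(false);  // explicit flush of file content after the write
            }
            return channel.size();
        }
    }
}
```

In all three cases the data reaches the operating system, which is why a write without a flush survives application failure; only the flushing variants ask the operating system to push the data to the storage device before returning.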

Here is the expected performance of durability modes, in order from the fastest to the slowest (and from the fewest durability guarantees to the most durability guarantees).

Durability Mode                                             Notes
log=(enabled=false)                                         checkpoint-level durability
log=(enabled),transaction_sync=(enabled=false)              in-memory buffered logging configured; updates durable after checkpoint or after sync is set in Session.commit_transaction
log=(enabled),transaction_sync=(enabled=true,method=none)   logging configured; updates durable after application failure, but not after system failure
log=(enabled),transaction_sync=(enabled=true,method=fsync)  logging configured; updates durable on application or system failure
log=(enabled),transaction_sync=(enabled=true,method=dsync)  logging configured; updates durable on application or system failure
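An application that wants to switch between these modes might keep the configuration strings in one place. The enum below is a minimal sketch: DurabilityMode and its constant names are hypothetical, while the configuration strings are copied from the list above and the constants are declared in the same order.

```java
// Hypothetical enum collecting the durability configurations listed above.
enum DurabilityMode {
    CHECKPOINT_ONLY("log=(enabled=false)"),
    BUFFERED_LOG("log=(enabled),transaction_sync=(enabled=false)"),
    WRITE_NO_FLUSH("log=(enabled),transaction_sync=(enabled=true,method=none)"),
    FSYNC("log=(enabled),transaction_sync=(enabled=true,method=fsync)"),
    DSYNC("log=(enabled),transaction_sync=(enabled=true,method=dsync)");

    private final String config;

    DurabilityMode(String config) {
        this.config = config;
    }

    // The durability settings for this mode, exactly as listed above.
    String config() {
        return config;
    }

    // Prepend other wiredtiger.open settings (for example "create") to this mode's settings.
    String openConfig(String... otherSettings) {
        StringBuilder sb = new StringBuilder();
        for (String setting : otherSettings) {
            sb.append(setting).append(',');
        }
        return sb.append(config).toString();
    }
}
```

With the WiredTiger Java bindings on the classpath, the result would be passed as the configuration argument, for example wiredtiger.open(home, DurabilityMode.FSYNC.openConfig("create")), where home is a placeholder for the database directory.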