Version 11.1.0
Checkpoint-level durability

WiredTiger supports checkpoint durability by default, and optionally commit-level durability when logging is enabled. In most applications, commit-level durability impacts performance more than checkpoint durability; checkpoints offer basic operation durability across application or system failure without impacting performance (although the creation of each checkpoint is a relatively heavy-weight operation). See Commit-level durability for information on commit-level durability.

Checkpoints vs. snapshots

Here is a brief explanation of the terms "checkpoint" and "snapshot", as they are widely used in this manual. A checkpoint is an on-disk entity that captures the persistent state of some or all of the database, while a snapshot is a lightweight in-memory entity that captures the current state of pending updates in the cache. Isolation refers to snapshots, because isolation is about runtime state and which updates can be seen by other threads' transactions as they run. Durability refers to checkpoints, because durability is about on-disk persistence. The two concepts are closely connected, of course; when a checkpoint is created the code involved uses a snapshot to determine which updates should and should not appear in the checkpoint.

Checkpoints

A checkpoint is automatically created for each individual file whenever the last reference to a modified data source is closed.

Checkpoints of the entire database can be explicitly created with the WT_SESSION::checkpoint method. Automatic database-wide checkpoints can be scheduled based on elapsed time or data size with the wiredtiger_open checkpoint configuration. In this mode of operation, an internal server thread is created to perform these checkpoints.

All transactional updates committed before a checkpoint are made durable by the checkpoint, therefore the frequency of checkpoints limits the volume of data that may be lost due to application or system failure.

Data sources that are involved in an exclusive operation when the checkpoint starts, including bulk load, upgrade or salvage, will be skipped by the checkpoint.

When a data source is first opened, it appears in the same state it was in when it was most recently checkpointed. In other words, updates after the most recent checkpoint (for example, in the case of failure), will not appear in the data source at checkpoint-level durability. If no checkpoint is found when the data source is opened, the data source will appear empty.

Checkpointing specific objects

The WT_SESSION::checkpoint method supports checkpoint of a set of target objects (as opposed to a database-wide checkpoint), using the target configuration.

Warning
If objects are separately checkpointed from a database-wide checkpoint, it is obviously possible to introduce data consistency problems if the objects have related data. Additionally, because WiredTiger has a single database-wide file used to store data history and other transactional information related to the database objects, checkpointing a set of target objects without also checkpointing that database-wide store can also lead to data inconsistencies. In other words, it is difficult to safely use a targeted checkpoint in the current WiredTiger release. Future development directions for the WiredTiger library include splitting the database-wide history file into per-object files, which will then be automatically checkpointed any time a database object is checkpointed. For that reason, we anticipate checkpointing a set of objects will become useful again in the future, but extreme caution is warranted until those changes are made.

Checkpoint cursors

Cursors are normally opened in the live version of a data source. However, it is also possible to open a read-only, static view of the data source as of a checkpoint. This is done by passing the checkpoint configuration string to WT_SESSION::open_cursor. This provides a limited form of time-travel, as the static view is not changed by subsequent checkpoints and will persist until the checkpoint cursor is closed. Checkpoint cursors ignore the currently running transaction; they are (in a sense) their own transactions. When timestamps are in use, a checkpoint cursor reads at the time associated with the checkpoint, which is normally the stable timestamp as of the time the checkpoint was taken.

Checkpoint naming

Checkpoints that do not include LSM trees may optionally be given names by the application. Checkpoints named by the application persist until explicitly discarded or replaced with a new checkpoint by the same name. (If an application attempts to replace an existing checkpoint, and it cannot be removed, either because a cursor is reading from the previous checkpoint, or because backups are in progress, the new checkpoint will fail and the previous checkpoint will remain.) Because named checkpoints persist until discarded or replaced, they can be used to save the state of the data for later use.

Internal checkpoints, that is, checkpoints not named by the application, use the reserved name WiredTigerCheckpoint. (All checkpoint names beginning with this string are reserved.) Applications can open the most recent checkpoint (whether internal or named) by specifying WiredTigerCheckpoint as the checkpoint name to WT_SESSION::open_cursor.

The name "all" is also reserved as it is used when dropping checkpoints.

Warning
Applications wanting a consistent view of the data in two separate tables must either use named checkpoints or explicitly control when checkpoints are taken, as there is a race between opening the default checkpoints in two different tables and a checkpoint operation replacing the default checkpoint in one of those tables.

The -c option to the wt command line utility list command will list a data source's checkpoints, with time stamp, in a human-readable format.

Checkpoint durability and backups

Backups are done using backup cursors (see Backups for more information).

Warning
When applications are using checkpoint-level durability, checkpoints taken while a backup cursor is open are not durable. That is, if a crash occurs when a backup cursor is open, then the system will be restored to the most recent checkpoint prior to the opening of the backup cursor, even if later database checkpoints were completed. As soon as the backup cursor is closed, the system will again be restored to the most recent checkpoint taken.

Applications using commit-level durability retain durability via the write-ahead log even though checkpoints taken while a backup cursor is open are not durable. All log files are retained once the backup cursor is opened and, in the event of a crash, all operations will be replayed to provide durability.

Checkpoints and file compaction

Checkpoints share file blocks, and dropping a checkpoint may or may not make file blocks available for re-use, depending on whether the dropped checkpoint contained the last reference to those file blocks. Because named checkpoints are not discarded until explicitly discarded or replaced, they may prevent WT_SESSION::compact from reducing file size due to shared file blocks.