Version 11.3.0
Log cursors

WiredTiger cursors provide access to data from a variety of sources, one of which is the records in the transaction log files. Log files are present only in databases that have been configured for logging using the log configuration for wiredtiger_open. In databases with log files, a log cursor provides access to the log records. Although log cursors are read-only, applications may store records in the log using WT_SESSION::log_printf. While a log cursor is open, automatic log file removal, even if enabled, will not reclaim any log files.

Each physical WiredTiger log record represents one or more operations in the database. When a log record represents more than one operation, all of the operations in that log record are part of the same transaction. However, there is no corresponding guarantee that all operations in a transaction appear in the same log record.

The following examples are taken from the complete example program ex_log.c.

To open a log cursor on the database:

error_check(session->open_cursor(session, "log:", NULL, NULL, &cursor));

A log cursor's key is a unique log record identifier, plus a uint32_t operation counter within that log record. When a log record reflects something that is not a transaction, such as the start of a checkpoint, the operation counter returned for the key will be zero. When the log record maps to a transaction, the first record returned, with an operation counter of zero, will be the entire log record. Each subsequent WT_CURSOR::next call then steps into the transaction, returning the first individual operation and then each additional operation, adding one to the operation counter each time. A transaction with a single operation therefore returns two records.

The unique log record identifier maps to a WT_LSN data structure, which has two fields: WT_LSN::file, the log file identifier, and WT_LSN::offset, the offset of the log record in the log file.

Here is an example of getting the log cursor's key:

error_check(cursor->get_key(cursor, &log_file, &log_offset, &opcount));
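
Applications that save keys often need to tell which of two log records comes first. The following is a minimal sketch of such a comparison, assuming records are ordered by log file identifier and then by offset within the file. The struct lsn_pair type and the lsn_pair_cmp function are hypothetical helpers for illustration and are not part of the WiredTiger API.

```c
#include <stdint.h>

/* Hypothetical stand-in for the (file, offset) pair returned in a log cursor key. */
struct lsn_pair {
    uint32_t file;   /* log file identifier */
    uint32_t offset; /* offset of the log record within that file */
};

/*
 * lsn_pair_cmp --
 *     Compare two LSN pairs, assuming ordering by file and then by offset. Returns a negative
 *     value, zero or a positive value if a sorts before, equal to or after b.
 */
static int
lsn_pair_cmp(const struct lsn_pair *a, const struct lsn_pair *b)
{
    if (a->file != b->file)
        return (a->file < b->file ? -1 : 1);
    if (a->offset != b->offset)
        return (a->offset < b->offset ? -1 : 1);
    return (0);
}
```

The operation counter is deliberately left out of the comparison, because it identifies a position within a log record rather than the log record itself.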

The log cursor's value is composed of six fields:

  • a uint64_t transaction ID (set for commit records only, otherwise 0)
  • a uint32_t log record type
  • a uint32_t operation type (set for commit records only, otherwise 0)
  • a uint32_t file id (if applicable, otherwise 0)
  • the operation key (commit records only, otherwise empty)
  • the operation value

The transaction ID may not be unique across recovery, that is, closing and reopening the database may result in transaction IDs smaller than previously seen transaction IDs.

The log record and log operation types are taken from log_types; typically, the only record or operation type applications are concerned with is WT_LOGREC_MESSAGE, which is a log record generated by the application.

The file ID may not be unique across recovery, that is, closing and reopening the database may result in file IDs changing. Additionally, there is currently no way to map file IDs to file names or higher-level objects.

Here is an example of getting the log cursor's value:

error_check(cursor->get_value(cursor, &txnid, &rectype, &optype, &fileid, &logrec_key, &logrec_value));

For clarity, imagine a set of three log records:

  • the first recording an internal operation, say a checkpoint start,
  • the second committing a transaction with three operations,
  • the third committing a transaction with a single operation.

The log cursor's WT_CURSOR::next call will return a total of seven records. Here's an example of what it would look like:

  1. LSN=[1,1000], operation counter=0, transaction ID=0, value of checkpoint start
  2. LSN=[1,2000], operation counter=0, transaction ID=30, all operations of transaction 30
  3. LSN=[1,2000], operation counter=1, transaction ID=30, first operation of transaction 30
  4. LSN=[1,2000], operation counter=2, transaction ID=30, second operation of transaction 30
  5. LSN=[1,2000], operation counter=3, transaction ID=30, third operation of transaction 30
  6. LSN=[1,3000], operation counter=0, transaction ID=31, all operations of transaction 31
  7. LSN=[1,3000], operation counter=1, transaction ID=31, only operation of transaction 31

The first return from the log cursor will have a key with a unique log ID, no transaction ID, and an operation counter of 0. The next four returns will share a log ID and a transaction ID: the first has an operation counter of 0 and returns the whole record, and the following three have operation counters 1 through 3, one for each of the three individual operations. The next return will again have a unique log ID, a unique transaction ID, and an operation counter of 0. The final return will have the same log ID and transaction ID and an operation counter of 1.
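
The stepping rule can also be written down as a small model that runs without a database. This is only a sketch of the behavior described above, not WiredTiger's implementation: struct model_record, struct model_return and model_walk are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one physical log record; not a WiredTiger type. */
struct model_record {
    uint32_t file, offset; /* LSN of the log record */
    uint64_t txnid;        /* 0 when the record is not a commit */
    uint32_t nops;         /* operations in the transaction, 0 for other records */
};

/* Hypothetical model of the key and transaction ID seen after one WT_CURSOR::next call. */
struct model_return {
    uint32_t file, offset, opcount;
    uint64_t txnid;
};

/*
 * model_walk --
 *     Expand log records into the sequence of cursor returns: one return for the whole record
 *     (operation counter 0), then one return per individual operation. Stores at most max
 *     entries in out and returns the total number of returns.
 */
static size_t
model_walk(const struct model_record *recs, size_t nrecs, struct model_return *out, size_t max)
{
    size_t i, n;
    uint32_t op;

    n = 0;
    for (i = 0; i < nrecs; i++)
        for (op = 0; op <= recs[i].nops; op++) {
            if (n < max) {
                out[n].file = recs[i].file;
                out[n].offset = recs[i].offset;
                out[n].opcount = op;
                out[n].txnid = recs[i].txnid;
            }
            n++;
        }
    return (n);
}
```

Given the three records from the list above (no operations, three operations and one operation), model_walk reports seven returns, matching the numbered sequence.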

Here's a more complete example of walking the log and displaying the results:

static void
print_record(uint32_t log_file, uint32_t log_offset, uint32_t opcount, uint32_t rectype,
  uint32_t optype, uint64_t txnid, uint32_t fileid, WT_ITEM *key, WT_ITEM *value)
{
    printf("LSN [%" PRIu32 "][%" PRIu32 "].%" PRIu32 ": record type %" PRIu32 " optype %" PRIu32
           " txnid %" PRIu64 " fileid %" PRIu32,
      log_file, log_offset, opcount, rectype, optype, txnid, fileid);
    printf(" key size %zu value size %zu\n", key->size, value->size);
    /* Application records carry a printable message in the value. */
    if (rectype == WT_LOGREC_MESSAGE)
        printf("Application Record: %s\n", (char *)value->data);
}

/*
 * simple_walk_log --
 *     A simple walk of the log.
 */
static void
simple_walk_log(WT_SESSION *session, int count_min)
{
    WT_CURSOR *cursor;
    WT_ITEM logrec_key, logrec_value;
    uint64_t txnid;
    uint32_t fileid, log_file, log_offset, opcount, optype, rectype;
    int count, ret;

    error_check(session->open_cursor(session, "log:", NULL, NULL, &cursor));

    count = 0;
    while ((ret = cursor->next(cursor)) == 0) {
        count++;
        error_check(cursor->get_key(cursor, &log_file, &log_offset, &opcount));
        error_check(cursor->get_value(
          cursor, &txnid, &rectype, &optype, &fileid, &logrec_key, &logrec_value));

        print_record(log_file, log_offset, opcount, rectype, optype, txnid, fileid, &logrec_key,
          &logrec_value);
    }
    /* The walk ends normally when the cursor runs off the end of the log. */
    scan_end_check(ret == WT_NOTFOUND);
    error_check(cursor->close(cursor));

    if (count < count_min) {
        fprintf(stderr, "Expected minimum %d records, found %d\n", count_min, count);
        exit(1);
    }
}

The log cursor's key can be used to search for specific records in the log (assuming the record still exists and has not been removed), by setting the key and calling WT_CURSOR::search. However, it is not possible to search for a specific operation within a log record, and the key's operation counter is ignored when the key is set. The result of a search for a log record with more than one operation is always the first operation in the log record.

Here is an example of setting the log cursor's key:

cursor->set_key(cursor, save_file, save_offset, 0);
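
The search rule, matching on the log record identifier while ignoring the operation counter, can be illustrated with a standalone model. This is a sketch only and does not reflect how WiredTiger locates records internally: struct model_key, model_key_cmp and model_search are hypothetical names, and the array holds one entry per log record, sorted by file and offset.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of a log cursor key; not a WiredTiger type. */
struct model_key {
    uint32_t file, offset; /* log record identifier */
    uint32_t opcount;      /* operation counter, ignored by searches */
};

/*
 * model_key_cmp --
 *     Compare two keys by file and then offset. The operation counter is deliberately not
 *     compared.
 */
static int
model_key_cmp(const void *a, const void *b)
{
    const struct model_key *ka = a, *kb = b;

    if (ka->file != kb->file)
        return (ka->file < kb->file ? -1 : 1);
    if (ka->offset != kb->offset)
        return (ka->offset < kb->offset ? -1 : 1);
    return (0);
}

/*
 * model_search --
 *     Find the log record with the given identifier, or return NULL if it is not in the array
 *     (for example, because it has been removed).
 */
static const struct model_key *
model_search(const struct model_key *recs, size_t nrecs, uint32_t file, uint32_t offset,
  uint32_t opcount)
{
    struct model_key probe;

    probe.file = file;
    probe.offset = offset;
    probe.opcount = opcount; /* carried in the key but has no effect on the result */
    return (bsearch(&probe, recs, nrecs, sizeof(recs[0]), model_key_cmp));
}
```

In this model, searching for file 1 and offset 2000 finds the same entry whether the operation counter in the probe is 0 or 2, which is why the example above simply passes 0.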

Log cursors are read-only; however, applications can insert their own log records using WT_SESSION::log_printf. Here is an example of adding an application record to the database log:

error_check(session->log_printf(session, "Wrote %d records", record_count));