Version 3.0.0
Log cursors

WiredTiger cursors provide access to data from a variety of sources, one of which is the records in the transaction log files. Log files are present only in databases that have been configured for logging using the log configuration for wiredtiger_open. In such databases, a log cursor provides access to the log records. Although log cursors are read-only, applications may store records in the log using WT_SESSION::log_printf.

Each physical WiredTiger log record represents one or more operations in the database. When a log record represents more than one operation, all of its operations are part of the same transaction. However, there is no corresponding guarantee that all operations in a transaction appear in the same log record.

The following examples are taken from the complete example program ex_log.c.

To open a log cursor on the database:

error_check(session->open_cursor(session, "log:", NULL, NULL, &cursor));

A log cursor's key is a unique log record identifier, plus a uint32_t operation counter within that log record. When a log record maps one-to-one to a transaction (in other words, the returned log record has the only database operation the transaction made), the operation counter returned for the key will be zero.

The unique log record identifier maps to a WT_LSN data structure, which has two fields: WT_LSN::id, the log file identifier, and WT_LSN::offset, the offset of the log record in the log file.

Here is an example of getting the log cursor's key:

error_check(cursor->get_key(
    cursor, &log_file, &log_offset, &opcount));
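
Because the key identifies a record by log file and offset, two keys can be ordered by comparing the file identifier first and the offset second. The sketch below assumes that log order follows that pairing. The names log_pos and log_pos_cmp are hypothetical and used only for illustration; they are not part of the WiredTiger API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the values returned by WT_CURSOR::get_key. */
struct log_pos {
    uint32_t file;    /* log file identifier */
    uint32_t offset;  /* offset of the record within that file */
    uint32_t opcount; /* operation counter within the record */
};

/*
 * log_pos_cmp --
 *     Order two positions by file, then offset, then operation counter.
 *     Returns a negative, zero, or positive value, like strcmp.
 */
int
log_pos_cmp(const struct log_pos *a, const struct log_pos *b)
{
    assert(a != NULL && b != NULL);

    if (a->file != b->file)
        return (a->file < b->file ? -1 : 1);
    if (a->offset != b->offset)
        return (a->offset < b->offset ? -1 : 1);
    if (a->opcount != b->opcount)
        return (a->opcount < b->opcount ? -1 : 1);
    return (0);
}
```

A comparison like this can help an application remember the most recent position it has processed, subject to the assumption about ordering stated above.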

The log cursor's value consists of six fields:

  • a uint64_t transaction ID (set for commit records only, otherwise 0)
  • a uint32_t record type
  • a uint32_t operation type (set for commit records only, otherwise 0)
  • a uint32_t file ID (if applicable, otherwise 0)
  • the operation key (commit records only, otherwise empty)
  • the operation value

The transaction ID may not be unique across recovery; that is, closing and reopening the database may result in transaction IDs smaller than previously seen transaction IDs.

The record and operation types are taken from log_types; typically, the only record or operation type applications are concerned with is WT_LOGREC_MESSAGE, which is a log record generated by the application.

The file ID may not be unique across recovery; that is, closing and reopening the database may result in file IDs changing. Additionally, there is currently no way to map file IDs to file names or higher-level objects.

Here is an example of getting the log cursor's value:

error_check(cursor->get_value(cursor, &txnid,
    &rectype, &optype, &fileid, &logrec_key, &logrec_value));

For clarity, imagine a set of three log records:

  • the first with a single operation,
  • the second with five operations,
  • the third with a single operation.

Repeated calls to the log cursor's WT_CURSOR::next will return a total of seven records. The first call returns a record with a unique log ID, a unique transaction ID, and an operation counter of 0. The next five calls return records with a common log ID, a common transaction ID, and operation counters starting at 1 and ending at 5. The final call again returns a record with a unique log ID, a unique transaction ID, and an operation counter of 0.
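
The sequence described above can be reproduced with a small model that makes no WiredTiger calls. Everything in it is hypothetical scaffolding: walk_model takes the number of operations in each log record and records the operation counter a log cursor would be expected to return at each step, following the numbering given in the text (0 for a single-operation record, 1 through N for a record with N operations).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * walk_model --
 *     Given the number of operations in each log record, fill "opcounts"
 *     with the operation counter expected at each cursor step and return
 *     the number of steps. Models the description in the text only.
 */
size_t
walk_model(const uint32_t *ops_per_record, size_t nrecords,
    uint32_t *opcounts, size_t max_steps)
{
    size_t i, steps;
    uint32_t op;

    assert(ops_per_record != NULL && opcounts != NULL);

    steps = 0;
    for (i = 0; i < nrecords; i++) {
        if (ops_per_record[i] <= 1) {
            /* At most one operation: a single step with counter 0. */
            if (steps < max_steps)
                opcounts[steps] = 0;
            steps++;
            continue;
        }
        /* Multiple operations: counters run from 1 to N. */
        for (op = 1; op <= ops_per_record[i]; op++) {
            if (steps < max_steps)
                opcounts[steps] = op;
            steps++;
        }
    }
    return (steps);
}
```

Called with the operation counts {1, 5, 1} from the text, the model produces seven steps, matching the walk described above.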

Here's a more complete example of walking the log file and displaying the results:

static void
print_record(uint32_t log_file, uint32_t log_offset, uint32_t opcount,
    uint32_t rectype, uint32_t optype, uint64_t txnid, uint32_t fileid,
    WT_ITEM *key, WT_ITEM *value)
{
    printf(
        "LSN [%" PRIu32 "][%" PRIu32 "].%" PRIu32
        ": record type %" PRIu32 " optype %" PRIu32
        " txnid %" PRIu64 " fileid %" PRIu32,
        log_file, log_offset, opcount,
        rectype, optype, txnid, fileid);
    printf(" key size %zu value size %zu\n", key->size, value->size);
    if (rectype == WT_LOGREC_MESSAGE)
        printf("Application Record: %s\n", (char *)value->data);
}

/*
 * simple_walk_log --
 *     A simple walk of the log.
 */
static void
simple_walk_log(WT_SESSION *session, int count_min)
{
    WT_CURSOR *cursor;
    WT_ITEM logrec_key, logrec_value;
    uint64_t txnid;
    uint32_t fileid, log_file, log_offset, opcount, optype, rectype;
    int count, ret;

    error_check(session->open_cursor(session, "log:", NULL, NULL, &cursor));

    count = 0;
    while ((ret = cursor->next(cursor)) == 0) {
        count++;
        error_check(cursor->get_key(
            cursor, &log_file, &log_offset, &opcount));
        error_check(cursor->get_value(cursor, &txnid,
            &rectype, &optype, &fileid, &logrec_key, &logrec_value));
        print_record(log_file, log_offset, opcount,
            rectype, optype, txnid, fileid, &logrec_key, &logrec_value);
    }
    scan_end_check(ret == WT_NOTFOUND);
    error_check(cursor->close(cursor));

    if (count < count_min) {
        fprintf(stderr,
            "Expected minimum %d records, found %d\n",
            count_min, count);
        exit (1);
    }
}
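
The walk above prints an application record with %s, which relies on the value being NUL-terminated. A more defensive variant, sketched below, bounds the output by the item's size instead. The item_view struct is a hypothetical stand-in that mirrors only the data and size fields of WT_ITEM, so the sketch compiles without the WiredTiger headers; format_message is likewise an illustrative name, not a WiredTiger function.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the data and size fields of a WT_ITEM. */
struct item_view {
    const void *data;
    size_t size;
};

/*
 * format_message --
 *     Format an application record into "buf", reading at most
 *     value->size bytes so a missing NUL terminator is harmless.
 *     Returns the snprintf result.
 */
int
format_message(char *buf, size_t buflen, const struct item_view *value)
{
    int len;

    assert(buf != NULL && value != NULL);

    /* The precision argument of %.*s must be an int; clamp the size. */
    len = value->size > (size_t)INT_MAX ? INT_MAX : (int)value->size;
    return (snprintf(buf, buflen, "Application Record: %.*s",
        len, (const char *)value->data));
}
```

The same %.*s pattern could be used directly in print_record by passing value->size (cast to int) and value->data.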

The log cursor's key can be used to search for specific records in the log (assuming the record still exists and has not been archived), by setting the key and calling WT_CURSOR::search. However, it is not possible to search for a specific operation within a log record, and the key's operation counter is ignored when the key is set. The result of a search for a log record with more than one operation is always the first operation in the log record.

Here is an example of setting the log cursor's key:

cursor->set_key(cursor, save_file, save_offset, 0);
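
To make the search semantics concrete, the following model (plain C, no WiredTiger calls) looks up an entry by log file and offset only and returns the first operation of the matching record, which is the behavior described above. The log_entry struct and model_search function are hypothetical and exist only for illustration; a real application would call WT_CURSOR::search after setting the key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one cursor position. */
struct log_entry {
    uint32_t file;
    uint32_t offset;
    uint32_t opcount;
};

/*
 * model_search --
 *     Return the index of the first entry whose file and offset match,
 *     or -1 if there is none. The opcount argument is accepted but
 *     ignored, mirroring how the key's operation counter is treated.
 */
long
model_search(const struct log_entry *entries, size_t n,
    uint32_t file, uint32_t offset, uint32_t opcount)
{
    size_t i;

    assert(entries != NULL || n == 0);
    (void)opcount; /* Deliberately unused. */

    for (i = 0; i < n; i++)
        if (entries[i].file == file && entries[i].offset == offset)
            return ((long)i);
    return (-1);
}
```

In this model, searching for a record with several operations lands on its first entry regardless of the operation counter supplied, and a position that is no longer present yields -1, standing in for a failed search.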

Log cursors are read-only; however, applications can insert their own log records using WT_SESSION::log_printf. Here is an example of adding an application record into the database log:

error_check(
    session->log_printf(session, "Wrote %d records", record_count));