Version 11.1.0
Log File Format
Data Structures: WT_LOGSLOT, WT_LOG_RECORD, WT_LSN
Source Location: src/include/log.h, src/log/

Caution: the Architecture Guide is not updated in lockstep with the code base and is not necessarily correct or complete for any specific release.

This page assumes familiarity with the logging subsystem. Please see Logging for a description of that subsystem.

Each log file begins with a fixed-length header, followed by a set of variable-length log records. The header contains the information WiredTiger needs to determine that a file is a valid log file: a magic number in the first 32 bits, the log version number, and the configured maximum log file size.
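The header check described above can be sketched in C as follows. This is a minimal illustration, not WiredTiger's implementation: the struct name, the magic value, the maximum version, and every field width other than the 32-bit magic number are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not WiredTiger's constants. */
#define EXAMPLE_LOG_MAGIC 0x4C4F4731u
#define EXAMPLE_LOG_VERSION_MAX 5u

/*
 * Hypothetical header layout. Only the 32-bit magic number at the start of
 * the file comes from the text; the other field widths are assumed.
 */
struct example_log_header {
    uint32_t magic;    /* Magic number, first 32 bits of the file. */
    uint16_t version;  /* Log version number. */
    uint16_t unused;   /* Padding (assumed). */
    uint64_t file_max; /* Configured maximum log file size. */
};

/* Return true if the header looks like a log file this reader understands. */
static bool
example_log_header_valid(const struct example_log_header *h)
{
    assert(h != NULL);
    if (h->magic != EXAMPLE_LOG_MAGIC)
        return (false);
    if (h->version == 0 || h->version > EXAMPLE_LOG_VERSION_MAX)
        return (false);
    return (h->file_max != 0);
}
```

A reader would run a check of this kind on the first bytes of a file before trusting any of the records that follow.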

When a new log file is created, WiredTiger must synchronously write the header before allowing any user records to be written to the file. The content of the file header is identical for every log file. This allows log files to be pre-allocated, so that user threads whose log records require a log file switch are usually not penalized with expensive I/O operations; the typical log file switch is a rename instead. In addition to writing the header, pre-allocation also allocates space for the maximum size of the log file, which minimizes the risk of out-of-space errors. Pre-allocation is done by an internal thread, as described in Internal Threads.

Log files from log version 1 are generated by releases before WiredTiger version 3.0.0. In log version 1, user records start immediately after the header. All log files from log version 2 onward write a system log record after the header as the first record in the log file. Writing the system record is part of creating the new log file.

All records in the log, including the header, are subject to the minimum record size, which is typically 128 bytes. That value changes only if direct I/O for logging is turned on: when direct I/O is enabled by passing direct_io=log to wiredtiger_open, the minimum alignment is the value specified for buffer_alignment. In all other cases it is 128 bytes.
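As a rough illustration of the minimum record size, the sketch below rounds a record length up to the alignment in effect. The function and macro names are hypothetical, and the example assumes the alignment is a power of two (true for 128 and for typical buffer_alignment values such as 4096).

```c
#include <assert.h>
#include <stddef.h>

/* Default minimum record size from the text. */
#define EXAMPLE_LOG_ALIGN 128

/*
 * Round a record length up to a multiple of the alignment. A zero-length
 * record still occupies one full alignment unit.
 */
static size_t
example_record_size(size_t len, size_t align)
{
    /* The bit trick below requires a non-zero power of two. */
    assert(align != 0 && (align & (align - 1)) == 0);
    if (len == 0)
        return (align);
    return ((len + align - 1) & ~(align - 1));
}
```

With the default alignment, a 1-byte record and a 128-byte record both occupy 128 bytes, while a 129-byte record occupies 256 bytes.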

The minimum record size of 128 bytes was chosen to eliminate any possibility of false sharing in memory, where multiple threads writing their log records to the memory buffers in parallel invalidate each other's cache lines. False sharing could hurt performance.

From version 2 onward the first record in the log file, typically at offset 128 in the file, is a special system log record that contains the LSN of the end of the previous log file. For the first log file that value is an invalid LSN. For later log files the LSN indicates the end of the previous file so that recovery can detect missing log records and holes in log files that appear at the end of a file.

As described in Logging subsystem data structures and algorithms, multiple threads write records to the log in a lock-free manner, and multiple log buffer slots may be in flight at any given time. A hole can be generated in a log file if a buffer with a later LSN is written before a buffer with an earlier LSN. That can also happen at a log file boundary, which is why knowing the LSN at the end of the previous log file is critical to recovery.
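The boundary check can be sketched as below. This is an illustration under stated assumptions, not the recovery code: the LSN is modeled as a (file, offset) pair, an invalid LSN is represented by file number 0, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical LSN: a log file number plus a byte offset in that file. */
struct example_lsn {
    uint32_t file;
    uint32_t offset;
};

/* Assumption for this sketch: file number 0 marks an invalid LSN. */
static bool
example_lsn_is_invalid(struct example_lsn lsn)
{
    return (lsn.file == 0);
}

/* Order LSNs by file number, then by offset: returns <0, 0 or >0. */
static int
example_lsn_cmp(struct example_lsn a, struct example_lsn b)
{
    if (a.file != b.file)
        return (a.file < b.file ? -1 : 1);
    if (a.offset != b.offset)
        return (a.offset < b.offset ? -1 : 1);
    return (0);
}

/*
 * read_end: where recovery stopped finding valid records in the previous file.
 * prev_end: the end-of-previous-file LSN stored in the next file's system record.
 * Returns true if records are missing at the end of the previous file.
 */
static bool
example_boundary_has_hole(struct example_lsn read_end, struct example_lsn prev_end)
{
    /* The first log file has no predecessor, so there is nothing to check. */
    if (example_lsn_is_invalid(prev_end))
        return (false);
    /* Both LSNs must describe the same (previous) log file. */
    assert(read_end.file == prev_end.file);
    return (example_lsn_cmp(read_end, prev_end) < 0);
}
```

If the valid records read from the previous file end before the recorded LSN, some buffer with an earlier LSN never reached disk, and recovery cannot safely use records beyond that point.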

The user can choose the maximum log file size via the "log=(file_max=size)" configuration to the wiredtiger_open call. Records written to the log vary in length depending on the data written. In typical usage, the system switches log files before writing a log buffer that would exceed the configured file size. However, a single record could itself be larger than the configured file size. In that case the system must allow that record to exceed the maximum file size, but it will be the only record in that log file.
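The switching rule, including the oversized-record case, can be sketched as a single decision function. The names and parameters are hypothetical; in particular, first_record_offset stands for wherever the first user record would land in a fresh log file and is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether to switch log files before writing write_len bytes at
 * cur_offset, given the configured maximum file size.
 */
static bool
example_needs_file_switch(
  uint64_t cur_offset, uint64_t write_len, uint64_t file_max, uint64_t first_record_offset)
{
    assert(cur_offset >= first_record_offset);

    /* The write fits in the space that remains: no switch needed. */
    if (cur_offset <= file_max && write_len <= file_max - cur_offset)
        return (false);

    /*
     * The write does not fit. If the file holds no user records yet, a new
     * file would not help, so the record is allowed to exceed the limit.
     * Otherwise switch, which also guarantees that an oversized record is
     * the only user record in its file.
     */
    return (cur_offset > first_record_offset);
}
```

Under this rule, any write that follows an oversized record sees an offset past file_max and triggers a switch, so nothing is appended after the oversized record.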