Version 11.1.0
Schema
Data Structures

  • WT_COLGROUP
  • WT_INDEX
  • WT_LSM_TREE
  • WT_TABLE

Source Location

  • src/include/intpack_inline.h
  • src/include/packing_inline.h
  • src/include/schema.h
  • src/lsm/
  • src/packing/
  • src/schema/

Caution: the Architecture Guide is not updated in lockstep with the code base and is not necessarily correct or complete for any specific release.

A schema defines the format of the application data and how it will be stored by WiredTiger. While many tables have simple key/value pairs for records, WiredTiger also supports more complex data patterns. See Schema, Columns, Column Groups, Indices and Projections for more information.

Data Formats

The format of keys and values is configured through key_format and value_format entries in Configuration Strings. WiredTiger supports simple or composite data formats for keys and values. See Format types for the full list of supported data types.

  • A simple format stores data in one type, for example "key_format=i,value_format=S".
  • A composite format stores multiple data types in a single blob, for example "key_format=Si,value_format=ul". Cursors encode and decode keys and values in these formats. See Data translation and Cursor formats for more details.

A column store requires the key format to be the record number type 'r'. See Schema, Columns, Column Groups, Indices and Projections for more information on key/value formats.
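As a minimal sketch of how such a configuration string might be assembled in C, the helper below joins a key format and a value format into the "key_format=...,value_format=..." form. Both function names are hypothetical and are not part of the WiredTiger API; in an application the resulting string would be passed as the configuration argument when the object is created.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * build_format_config --
 *     Hypothetical helper (not a WiredTiger function). Writes
 *     "key_format=<key_format>,value_format=<value_format>" into buf.
 *     Returns 0 on success, -1 if the buffer is too small.
 */
int
build_format_config(char *buf, size_t len, const char *key_format, const char *value_format)
{
    int n;

    assert(buf != NULL && key_format != NULL && value_format != NULL);
    n = snprintf(buf, len, "key_format=%s,value_format=%s", key_format, value_format);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * is_column_store_key --
 *     Hypothetical check: returns non-zero only when the key format is
 *     exactly the record number type "r", as a column store requires.
 */
int
is_column_store_key(const char *key_format)
{
    assert(key_format != NULL);
    return strcmp(key_format, "r") == 0;
}
```

Calling build_format_config with "Si" and "ul" yields the composite example from the list above, while is_column_store_key("S") returns zero because only 'r' is accepted as a column store key.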

Data Files

The database schema defines how data files are organized in the database home directory:

  • A row-oriented table keeps all the data in one file called "<table name>.wt", where "<table name>" is the name that was passed as a part of the name parameter to WT_SESSION::create.
  • A column-oriented table stores the data in multiple files, one for each column group. Each file is named "<table name>_<colgroup name>.wt", where "<table name>" is the name that was passed as part of the name parameter to WT_SESSION::create and "<colgroup name>" is the column group name defined in the colgroups entry of the table configuration. The ex_col_store.c example shows how column groups can be configured in WiredTiger. Row Store and Column Store describes in more detail how row and column stores work.
  • Each table index is stored in a separate file named "<table name>_<index name>.wti", where "<table name>" is the table name passed to WT_SESSION::create and "<index name>" is the index name defined during index creation. See Indices for more information on how to create a table index.
  • LSM trees are stored on the file system in "<table name>-<chunk id>.lsm" files, where "<table name>" is the name that was passed as part of the name parameter to WT_SESSION::create and "<chunk id>" is the chunk index managed by WiredTiger. See Log-Structured Merge Trees for more information.
  • Bloom filters for LSM trees are stored in files with the same name as the corresponding LSM tree chunk but with a different extension: "<table name>-<chunk id>.bf". See Bloom filters for more details.
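The naming conventions in the list above can be summarized with a short C sketch. The enum, the function name, and the plain decimal rendering of the chunk id are assumptions made for illustration; this is not WiredTiger code, and the on-disk formatting of the chunk id may differ.

```c
#include <assert.h>
#include <stdio.h>

/* Kinds of on-disk files described in the list above (hypothetical enum). */
enum file_kind { FILE_ROW_TABLE, FILE_COLGROUP, FILE_INDEX, FILE_LSM_CHUNK, FILE_LSM_BLOOM };

/*
 * schema_file_name --
 *     Hypothetical helper. Builds the file name for a table, column group,
 *     index, LSM chunk or Bloom filter. "sub" is the column group or index
 *     name and is ignored for the other kinds; "chunk_id" is only used for
 *     LSM chunks and Bloom filters and is printed as plain decimal (an
 *     assumption). Returns 0 on success, -1 on error.
 */
int
schema_file_name(char *buf, size_t len, enum file_kind kind, const char *table, const char *sub,
  unsigned chunk_id)
{
    int n;

    assert(buf != NULL && table != NULL);
    switch (kind) {
    case FILE_ROW_TABLE:
        n = snprintf(buf, len, "%s.wt", table);
        break;
    case FILE_COLGROUP:
        assert(sub != NULL);
        n = snprintf(buf, len, "%s_%s.wt", table, sub);
        break;
    case FILE_INDEX:
        assert(sub != NULL);
        n = snprintf(buf, len, "%s_%s.wti", table, sub);
        break;
    case FILE_LSM_CHUNK:
        n = snprintf(buf, len, "%s-%u.lsm", table, chunk_id);
        break;
    case FILE_LSM_BLOOM:
        n = snprintf(buf, len, "%s-%u.bf", table, chunk_id);
        break;
    default:
        return -1;
    }
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a table named "users" with a column group "main" and an index "by_age", the helper produces "users_main.wt" and "users_by_age.wti". An LSM chunk and its Bloom filter share the same stem and differ only in the extension.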

Schema Integrity

A user can create and manipulate database objects through the API described in Schema Manipulation. WiredTiger also has several internal objects, such as the Metadata and the History Store. The schema of those objects is locked and cannot be altered from outside of WiredTiger.

Schema operations update the metadata and are performed under the schema lock, which prevents concurrent operations on the database schema. A generic schema operation follows a defined sequence of steps.

Apart from the schema API, the schema lock is necessary for many other operations in WiredTiger including the following "heavy" database modifications:

  • The schema lock wraps checkpoint prepare to avoid any tables being created or dropped during this phase. See Checkpoint for details.
  • The rollback to stable operation acquires the schema lock to ensure that no schema changes are made during this complex process. See Rollback to Stable for more information on the operation.
  • A backup cursor also holds the schema lock because it must guarantee a consistent view of what files and tables exist while it is being used, so it prevents any tables or files being created or dropped during that time. See Backups and Backup cursors for more information.

All the schema operations listed below perform multi-step metadata modifications. Although they are non-transactional, the schema code tracks the metadata changes and performs the file and metadata operations in a specific order to provide recovery in the case of a crash.
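One generic way to keep a multi-step, non-transactional change recoverable is to record an undo action for every completed step and to unwind the steps in reverse order if a later step fails. The C sketch below illustrates that general idea only. All names in it are hypothetical, and it is not a description of how the WiredTiger schema code tracks metadata changes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 16

typedef int (*step_fn)(void *ctx);

/* One completed step together with the action that reverses it. */
struct tracked_step {
    step_fn undo;
    void *ctx;
};

struct op_tracker {
    struct tracked_step steps[MAX_TRACKED];
    size_t count;
};

/* Run one step; if it succeeds, remember how to undo it. */
int
tracker_run(struct op_tracker *t, step_fn apply, step_fn undo, void *ctx)
{
    int ret;

    assert(t->count < MAX_TRACKED);
    if ((ret = apply(ctx)) != 0)
        return ret;
    t->steps[t->count].undo = undo;
    t->steps[t->count].ctx = ctx;
    t->count++;
    return 0;
}

/* Undo every recorded step, newest first. */
void
tracker_rollback(struct op_tracker *t)
{
    while (t->count > 0) {
        --t->count;
        (void)t->steps[t->count].undo(t->steps[t->count].ctx);
    }
}

/* Example steps operating on an int counter, used only for demonstration. */
int step_incr(void *ctx) { ++*(int *)ctx; return 0; }
int step_decr(void *ctx) { --*(int *)ctx; return 0; }
int step_fail(void *ctx) { (void)ctx; return -1; }
```

If two step_incr steps succeed and a third step fails, calling tracker_rollback returns the counter to its starting value. Undoing in reverse order matters when later steps depend on earlier ones.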

Schema Manipulation

All schema manipulations are done in the context of WT_SESSION. All the methods below, except WT_SESSION::create and WT_SESSION::truncate, require exclusive access to the specified data source(s). If any cursors are open with the specified name(s) or a data source is otherwise in use, the call will fail and return EBUSY.
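Because these calls can fail with EBUSY, callers often wrap them in a bounded retry. The sketch below shows one possible shape for such a wrapper. The names retry_on_ebusy and busy_then_ok are hypothetical, and the operation is passed as a plain function pointer so that the example does not depend on the WiredTiger headers. In an application the callback would wrap a call such as WT_SESSION::drop, and open cursors on the object would need to be closed before retrying.

```c
#include <errno.h>

typedef int (*schema_op_fn)(void *arg);

/*
 * retry_on_ebusy --
 *     Hypothetical wrapper. Calls op until it returns something other than
 *     EBUSY or until max_attempts calls have been made, and returns the
 *     result of the last call (EBUSY if max_attempts is not positive).
 */
int
retry_on_ebusy(schema_op_fn op, void *arg, int max_attempts)
{
    int attempt, ret;

    ret = EBUSY;
    for (attempt = 0; attempt < max_attempts && ret == EBUSY; attempt++)
        ret = op(arg);
    return ret;
}

/*
 * busy_then_ok --
 *     Stand-in for a schema operation, for demonstration only: reports EBUSY
 *     while the counter pointed to by arg is positive, then succeeds.
 */
int
busy_then_ok(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        --*remaining;
        return EBUSY;
    }
    return 0;
}
```

A bounded number of attempts keeps a persistent conflict from turning into an endless loop, and the final EBUSY is returned to the caller so that it can be reported.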

Create

The create schema operation is responsible for creating the underlying data objects on the filesystem and then creating required entries in the metadata. The API for this operation is WT_SESSION::create.

Drop

The WT_SESSION::drop operation drops the object at the specified URI, deleting all related files and metadata entries. The underlying files can be kept by specifying "remove_files=false" in the configuration string.

Rename

The WT_SESSION::rename operation renames the underlying data objects on the filesystem and updates the metadata accordingly.

Alter

WT_SESSION::alter allows modification of some table settings after creation.

Truncate

WT_SESSION::truncate truncates a file, table, cursor range, or backup cursor. If start and stop cursors are not specified, all the data stored at the URI is removed. If another transaction inserts a key into a range while a range truncate is in progress, the behavior is not well defined, so it is best to avoid this situation. See Truncate Operation for more details.
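The effect of the optional start and stop positions can be modeled on a plain array of integer keys. The sketch below is a toy model rather than WiredTiger code: the function name is hypothetical, the bounds are assumed to be inclusive, and a NULL bound stands for an unspecified cursor.

```c
#include <stddef.h>

/*
 * model_truncate --
 *     Toy model of a range truncate. Removes every key k with
 *     *start <= k <= *stop from keys[0..n-1], compacting the array in place.
 *     A NULL start means "from the first key" and a NULL stop means
 *     "through the last key". Returns the number of keys that remain.
 */
size_t
model_truncate(int *keys, size_t n, const int *start, const int *stop)
{
    size_t i, kept;

    kept = 0;
    for (i = 0; i < n; i++) {
        int after_start = (start == NULL || keys[i] >= *start);
        int before_stop = (stop == NULL || keys[i] <= *stop);

        /* Keep only the keys that fall outside the truncated range. */
        if (!(after_start && before_stop))
            keys[kept++] = keys[i];
    }
    return kept;
}
```

With keys 1 through 7 and bounds 3 and 5, the keys 1, 2, 6 and 7 remain. Passing NULL for both bounds removes every key, which corresponds to truncating the whole object.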