Version 12.0.0
All Classes Functions Variables Typedefs Enumerations Enumerator Modules Pages
Row Store and Column Store
Data StructuresSource Location
WT_BTREEsrc/include/btree.h

Caution: the Architecture Guide is not updated in lockstep with the code base and is not necessarily correct or complete for any specific release.

Definition

WiredTiger supports both row-stores and column-stores and each access method is represented internally by a B-Tree. Keys and values in row-stores can be of variable length. Furthermore, keys are explicitly stored. Column-stores use a 64-bit unsigned integer record id as keys and they are implied given the row's position in the B-Tree.

There are two types of column-stores in WiredTiger:

  • The fixed length column-store, as its name implies, the value has a fixed length, and furthermore the value is restricted to between 1 and 8 bits in length. The bit length is specified when the column store is created. The fixed length column store has specialized use cases like bitmaps.
  • The variable length is a more general case which allows for values to have any length similarly to row-stores.

Please refer to Data File Format to learn more about the on-disk format for row-stores and column-stores.

Row-stores

Row-stores are probably seen as the traditional access method and they serve a general purpose. In WiredTiger, row-store keys are explicitly stored and they can be duplicated within the same single file which directly impacts the on-disk storage size. However, keys are only duplicated if they are the first key on the page, which gets them copied/promoted to an internal page. They are not necessarily duplicated, too, they can be prefix/suffix compressed to discard unnecessary bytes on the internal pages.

Variable length column-stores

Keys in column-stores are not stored but derived from the row's position in the B-Tree which saves disk space. In fact, there is a starting key on each page which is used to figure out the rest of the keys present on that same page. Column-stores values also present an advantage to further save more disk space as they can be written in more compact forms through encoding techniques. In WiredTiger, the run-length encoding is used to replace any sequence of the same value with a count and value indicator. However, this makes the access to a specific record more complicated, it makes it impossible to jump to a record in a leaf page.

Fixed length column-stores

Fixed length column stores are very different internally and usually serve specific purposes as it is a Bitmap. It makes fixed length column-stores efficient to retrieve a value at a given offset, the use of Bloom filter is probably one of the best examples.

Internal usage

Internally, row-stores and column-stores use a common WT_BTREE structure. The fundamental difference is that the WT_BTREE->type field is set to BTREE_ROW for row-stores and BTREE_COL for column-stores. Internal functions that navigate, access and manipulate B-Trees have code sprinkled throughout that is conditional on WT_BTREE->type.

Row StoreColumn Store - Variable LengthColumn Store - Fixed Length
Characteristics
Internal representationB-Tree (WT_BTREE)B-Tree (WT_BTREE)B-Tree (WT_BTREE)
B-Tree typeGeneralized (BTREE_ROW)Specialized (BTREE_COL_VAR)Specialized (BTREE_COL_FIX)
Leaf pagesKey count equal to half the number of physical entries (unless all empty values flag is set where key count is equal to the number of physical entries)Only the first key is storedOnly the first key is stored
Record keyAnything (byte-string) prefix compressed or overflow objectsRecord id (64 bit unsigned integer)Record id (64 bit unsigned integer)
Record valueVariable byte string lengthVariable byte string lengthFixed bit string length (up to 8 bits)
Features
Random cursorYesNoNo
Block compressionYesYesYes
Dictionary compressionYesYesNo
Fast truncateYesNoNo
Prefix and suffix compressionYesNoNo
RLE compressionNoYesNo