Caution: the Architecture Guide is not updated in lockstep with the code base and is not necessarily correct or complete for any specific release.
WiredTiger supports both row-stores and column-stores and each access method is represented internally by a B-Tree. Keys and values in row-stores can be of variable length. Furthermore, keys are explicitly stored. Column-stores use a 64-bit unsigned integer record id as keys and they are implied given the row's position in the B-Tree.
There are two types of column-stores in WiredTiger:
Please refer to Data File Format to learn more about the on-disk format for row-stores and column-stores.
Row-stores are probably seen as the traditional access method and they serve a general purpose. In WiredTiger, row-store keys are explicitly stored and they can be duplicated within the same single file which directly impacts the on-disk storage size. However, keys are only duplicated if they are the first key on the page, which gets them copied/promoted to an internal page. They are not necessarily duplicated, too, they can be prefix/suffix compressed to discard unnecessary bytes on the internal pages.
Keys in column-stores are not stored but derived from the row's position in the B-Tree which saves disk space. In fact, there is a starting key on each page which is used to figure out the rest of the keys present on that same page. Column-stores values also present an advantage to further save more disk space as they can be written in more compact forms through encoding techniques. In WiredTiger, the run-length encoding is used to replace any sequence of the same value with a count and value indicator. However, this makes the access to a specific record more complicated, it makes it impossible to jump to a record in a leaf page.
Fixed length column stores are very different internally and usually serve specific purposes as it is a Bitmap. It makes fixed length column-stores efficient to retrieve a value at a given offset, the use of Bloom filter is probably one of the best examples.
Internally, row-stores and column-stores use a common WT_BTREE
structure. The fundamental difference is that the WT_BTREE->type
field is set to BTREE_ROW
for row-stores and BTREE_COL
for column-stores. Internal functions that navigate, access and manipulate B-Trees have code sprinkled throughout that is conditional on WT_BTREE->type
.
Row Store | Column Store - Variable Length | Column Store - Fixed Length | ||
---|---|---|---|---|
Characteristics | ||||
Internal representation | B-Tree (WT_BTREE ) | B-Tree (WT_BTREE ) | B-Tree (WT_BTREE ) | |
B-Tree type | Generalized (BTREE_ROW ) | Specialized (BTREE_COL_VAR ) | Specialized (BTREE_COL_FIX ) | |
Leaf pages | Key count equal to half the number of physical entries (unless all empty values flag is set where key count is equal to the number of physical entries) | Only the first key is stored | Only the first key is stored | |
Record key | Anything (byte-string) prefix compressed or overflow objects | Record id (64 bit unsigned integer) | Record id (64 bit unsigned integer) | |
Record value | Variable byte string length | Variable byte string length | Fixed bit string length (up to 8 bits) | |
Features | ||||
Random cursor | Yes | No | No | |
Block compression | Yes | Yes | Yes | |
Dictionary compression | Yes | Yes | No | |
Fast truncate | Yes | No | No | |
Huffman encoding | Yes | Yes | No | |
Prefix and suffix compression | Yes | No | No | |
RLE compression | No | Yes | No |