The WiredTiger data engine is a high-performance, scalable, transactional, production-quality, open source, NoSQL data engine, created to maximize the value of each computer you buy.
WiredTiger's design is focused on a few core principles:
WiredTiger scales on modern, multi-CPU architectures. Using a variety of programming techniques such as hazard references, lock-free algorithms, fast latching and message passing, WiredTiger performs more work per CPU core than alternative engines.
WiredTiger's transactions use optimistic concurrency control algorithms that avoid the bottleneck of a centralized lock manager. Transactional operations in one thread do not block operations in other threads, but strong isolation is provided and update conflicts are detected to preserve data consistency.
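The general idea of optimistic concurrency control can be sketched in a few lines. The following is a minimal, single-threaded illustration and not WiredTiger's implementation: the `Store`, `Transaction` and `ConflictError` names are hypothetical, reads are simplified to return the latest committed value, and a real engine would make the validate-and-apply step atomic.

```python
class ConflictError(Exception):
    """Raised when a key was updated by another commit after this transaction began."""


class Store:
    """Toy versioned key/value store (hypothetical; for illustration only)."""

    def __init__(self):
        self.data = {}    # key -> (value, version at which it was committed)
        self.version = 0  # global commit counter

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.start_version = store.version  # point in time the transaction started
        self.writes = {}                    # updates buffered until commit

    def get(self, key):
        # Simplification: read our own writes, else the latest committed value.
        if key in self.writes:
            return self.writes[key]
        value, _ = self.store.data.get(key, (None, 0))
        return value

    def put(self, key, value):
        # No lock is taken; the update is only buffered.
        self.writes[key] = value

    def commit(self):
        # Validation: fail if any key we wrote was committed by someone else
        # after we started. No lock manager is consulted at any point.
        for key in self.writes:
            _, committed_at = self.store.data.get(key, (None, 0))
            if committed_at > self.start_version:
                raise ConflictError(key)
        # Apply: publish all buffered updates under a new version number.
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.version)
```

In this sketch two transactions can both call `put` on the same key without waiting for each other; the conflict only surfaces when the second one tries to commit.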
WiredTiger supports both row-oriented storage (where all columns of a row are stored together) and column-oriented storage (where groups of columns are stored in separate files), resulting in more efficient memory use. When reading and writing column-stores, only the columns required for any particular query are maintained in memory. Column-store keys are derived from the value's location in the table rather than being physically stored in the table, further minimizing memory requirements. Finally, row- and column-stores can be mixed and matched at the table level: for example, a row-store index can be created on a column-store table.
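To make the idea of position-derived keys concrete, here is a small sketch of a column-oriented table in which the record number is never stored and only the requested columns are touched on a read. The `ColumnStore` class and its methods are hypothetical names for illustration and do not reflect WiredTiger's on-disk layout; record numbers are 1-based in this sketch.

```python
class ColumnStore:
    """Toy column-oriented table: one list per column, keys implied by position."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}
        self.count = 0

    def append(self, **values):
        # Each column value goes to its own list; no key is written anywhere.
        for name, column in self.columns.items():
            column.append(values[name])
        self.count += 1
        return self.count  # the record number is just the position

    def read(self, recno, names):
        # Only the columns named in the query are accessed.
        return tuple(self.columns[name][recno - 1] for name in names)
```

A query such as `read(2, ["age"])` never looks at the other columns, which is the property that lets a column-store keep less data in memory.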
WiredTiger supports different-sized Btree internal and leaf pages in the same file. Applications can maximize the amount of data transferred in each I/O by configuring large leaf pages, and still minimize CPU cache misses when searching the tree.
WiredTiger supports static encoding with a configurable Huffman engine, which typically reduces the amount of information maintained in memory by 20-50%.
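As a rough illustration of static Huffman encoding, the sketch below builds a fixed code table from a frequency table supplied up front, then encodes and decodes text with it. It is a textbook construction, not WiredTiger's Huffman engine; the function names are hypothetical, symbols are assumed to be single-character strings, and the output is a string of "0"/"1" characters rather than packed bits.

```python
import heapq
from itertools import count


def build_codes(frequencies):
    """Build a symbol -> bit-string table from a symbol -> frequency table."""
    tie = count()  # tie-breaker so heap entries never compare symbols or subtrees
    heap = [(freq, next(tie), sym) for sym, freq in frequencies.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    # Repeatedly merge the two least frequent subtrees.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                        # leaf: a symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free codes make this unambiguous
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

Because the table is fixed in advance, frequent symbols get short codes without any per-page dictionary; how much space is saved depends entirely on how well the frequency table matches the data.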
WiredTiger supports key prefix encoding, reducing the number of bytes from each key maintained in memory.
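Key prefix encoding can be sketched as follows: each key in a sorted run is stored as the number of leading bytes it shares with the previous key, plus the remaining suffix. This is a generic illustration with hypothetical function names, and the 255-byte cap on the shared length is an assumption of the sketch, not a statement about WiredTiger's format.

```python
def prefix_compress(sorted_keys):
    """Turn sorted byte-string keys into (shared_prefix_length, suffix) pairs."""
    entries = []
    previous = b""
    for key in sorted_keys:
        shared = 0
        limit = min(len(key), len(previous), 255)  # cap is an assumption
        while shared < limit and key[shared] == previous[shared]:
            shared += 1
        entries.append((shared, key[shared:]))
        previous = key
    return entries


def prefix_decompress(entries):
    """Rebuild the full keys by reusing the prefix of the previous key."""
    keys = []
    previous = b""
    for shared, suffix in entries:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys
```

For keys such as `b"user:1000"`, `b"user:1001"`, `b"user:1002"`, only the first key is stored in full; each later key is reduced to a single suffix byte plus the shared length.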
WiredTiger uses compact file formats to minimize on-disk overhead. WiredTiger does not store data indexing information on disk; instead, it instantiates that information either when pages are read from disk or on demand. This simplifies the on-disk file format and, in the case of small key/value pairs, typically reduces the amount of information written to disk by 20-50%.
WiredTiger supports variable-length pages, meaning there is less wasted space for large objects, and no need for compaction as pages grow and shrink naturally when key/value pairs are inserted or deleted.
WiredTiger supports stream compression on every page of a table. Because WiredTiger supports variable-length pages, pages do not have to shrink by a fixed amount in order to benefit from stream compression. Stream compression is selectable on a per-table basis, allowing applications to choose the compression algorithm most appropriate for their data. Stream compression typically reduces the amount of information written to disk by 30-80%.
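The interaction between stream compression and variable-length pages can be illustrated with a short sketch: each page is compressed independently and written at whatever size it compresses to, so every saved byte counts. The example uses zlib from the Python standard library purely as a stand-in compressor; the function names and the in-memory `(offset, size)` list are hypothetical and do not describe WiredTiger's block manager.

```python
import zlib


def write_pages(pages):
    """Compress each page on its own and pack the results back to back."""
    blob = bytearray()
    extents = []  # (offset, compressed_size) for each page
    for page in pages:
        compressed = zlib.compress(page)
        extents.append((len(blob), len(compressed)))
        blob += compressed  # no padding up to a fixed block size
    return bytes(blob), extents


def read_page(blob, extents, n):
    """Decompress page n using its recorded offset and size."""
    offset, size = extents[n]
    return zlib.decompress(blob[offset:offset + size])
```

With fixed-size pages, a compressed page would still occupy a whole block unless it shrank by at least one block; here the stored size is simply the compressed size.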
WiredTiger supports leaf pages of up to 512MB in size. Disk seeks are less likely when reading large amounts of data from disk, significantly improving table scan performance.
Also, as noted in the Hot caches section, WiredTiger supports column-store formats, prefix compression and static encoding. Each of these features makes WiredTiger's use of memory more efficient, and each also maximizes the amount of useful data transferred per disk I/O.
WiredTiger is production-quality, supported software, engineered for the most demanding application environments. For example, because WiredTiger is a no-overwrite data engine, torn writes can never corrupt a WiredTiger data store.
WiredTiger includes verification support so you can verify data sets, and salvage support as a last-ditch protection: data can be retrieved even if it somehow becomes corrupted.
WiredTiger is an Open Source, NoSQL data engine. See the WiredTiger license for details.