Version 10.0.0
Custom Tiered Storage sources

Overview of Tiered Storage in WiredTiger

Applications can implement their own custom storage sources. WiredTiger does not currently offer built-in support for any particular storage source. The expected use of a storage source is to provide access to cloud-based object storage.

Example storage code is provided to demonstrate how storage source extensions are created.

The storage source extension must be loaded in the wiredtiger_open call. See Extending WiredTiger for details on how extensions are loaded. Also, a storage source is specified using tiered_storage= in the configuration for the wiredtiger_open call. This configuration establishes the name and bucket to be used for database log files and a subset of the WiredTiger metadata files. By default, this same tiered storage source is also used for all data files. We call this the system storage source.

It is also possible to use different tiered storage options when individual data files are first created, using the tiered_storage= configuration in the WT_SESSION::create call. Such options override the default (system) storage that was indicated in the wiredtiger_open call for the individual data file. It is possible to turn off tiered storage for individual files using the reserved none name. It is also possible to use a different storage source, or to specify a different bucket.
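As a rough illustration of the per-file override described above, the following sketch formats the tiered_storage= fragment that would be passed as part of the configuration string to WT_SESSION::create. The helper format_create_tiered_config is hypothetical and not part of WiredTiger, and the exact spelling of the name= and bucket= keys inside the parentheses is an assumption based on the parameter names used on this page; check it against the configuration reference for your version.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a WiredTiger API): format a tiered_storage=
 * fragment for a WT_SESSION::create configuration string.
 *
 * Pass the reserved name "none" to turn tiered storage off for the file.
 * bucket may be NULL when no bucket applies.
 * Returns 0 on success, -1 if the buffer is too small.
 *
 * ASSUMPTION: the name= and bucket= key syntax is inferred from the
 * parameter names in the text.
 */
int
format_create_tiered_config(char *buf, size_t len, const char *name, const char *bucket)
{
    int n;

    assert(buf != NULL && name != NULL);
    if (bucket == NULL)
        n = snprintf(buf, len, "tiered_storage=(name=%s)", name);
    else
        n = snprintf(buf, len, "tiered_storage=(name=%s,bucket=%s)", name, bucket);

    /* snprintf reports the length it needed; treat truncation as an error. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string would be supplied as the configuration argument of the create call for the individual data file, for example with the name none to keep a table entirely local, or with a different bucket to place its objects elsewhere.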

Overriding the system storage source for a table does not override the system storage source for indices on that table, nor does it override the system storage source for column groups specified on that table. If the storage source for a column group or index is to differ from the system storage source, it must be specified when the column group or index is created.

It is an error to specify a storage source in a WT_SESSION::create call when it was not specified in the wiredtiger_open call.

Storage source parameters

Four parameters (name, bucket, cluster and member) may be specified when configuring a storage source for wiredtiger_open, so that the location of stored objects can vary by name and bucket.

The configuration parameter tiered_storage=(bucket=identifier) may be used in wiredtiger_open or WT_SESSION::create calls. It references the location in the storage source where objects are found or placed.

The configuration parameter cluster=identifier is used only in the wiredtiger_open call. The value of the cluster is unchanging for this database. Its intent is to identify the objects belonging to this database so that they are unique in case multiple databases share an object-storage bucket. It must always be provided when WiredTiger is reopened (again, with the wiredtiger_open call).

Similarly, the configuration parameter member=identifier is used only in the wiredtiger_open call. There may be multiple nodes accessing the same cluster's database objects in the storage source. The member identifier makes the names of the objects a node creates distinct from those potentially created by other members of the same cluster, so that objects remain unique when multiple nodes write to the bucket. It must always be provided when WiredTiger is reopened (again, with the wiredtiger_open call).
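To make the relationship between the four parameters concrete, here is a minimal sketch that assembles a tiered_storage= fragment for wiredtiger_open. The struct tiered_params and the function format_open_tiered_config are hypothetical names introduced for illustration, and the assumption that all four keys sit together inside one parenthesized group is inferred from the text rather than confirmed by it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the four parameters named in the text. */
struct tiered_params {
    const char *name;    /* storage source name */
    const char *bucket;  /* location within the storage source */
    const char *cluster; /* unchanging for the life of the database */
    const char *member;  /* distinguishes this node within the cluster */
};

/*
 * Hypothetical helper (not a WiredTiger API): format the tiered_storage=
 * fragment of a wiredtiger_open configuration string.
 * Returns 0 on success, -1 if the buffer is too small.
 *
 * ASSUMPTION: all four keys are given inside a single tiered_storage=(...)
 * group; verify the syntax against your WiredTiger version.
 */
int
format_open_tiered_config(char *buf, size_t len, const struct tiered_params *p)
{
    int n;

    assert(buf != NULL && p != NULL);
    n = snprintf(buf, len, "tiered_storage=(name=%s,bucket=%s,cluster=%s,member=%s)",
      p->name, p->bucket, p->cluster, p->member);

    /* Treat truncation as an error rather than passing a partial string on. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Two nodes sharing a cluster would use identical name, bucket and cluster values and differ only in member. Because both cluster and member must be provided on every reopen, an application would typically persist these values alongside its own configuration rather than generate them at startup.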

Storage source examples

An example of a storage source exists in ext/storage_sources/local_store/local_store.c. This storage source emulates cloud storage by storing all objects on the local file system. The example does not include application-level code to call it. By default, WiredTiger builds it as a loadable shared library. It can be loaded during a wiredtiger_open call like any other extension, and local_store can then be specified as the storage source for the tiered storage system.
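Since the example ships without application-level code, the sketch below shows one way an application might compose the full wiredtiger_open configuration string: an extensions= entry that loads the shared library, followed by a tiered_storage= fragment that selects local_store. The helper format_open_config is hypothetical, as is any library path passed to it, because the location and file name of the built library depend on the build.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a WiredTiger API): compose a wiredtiger_open
 * configuration string that loads a storage source extension and then
 * appends a caller-supplied tiered_storage= fragment.
 * Returns 0 on success, -1 if the buffer is too small.
 *
 * ASSUMPTION: ext_path is wherever your build placed the local_store
 * shared library; no particular path or file name is implied here.
 */
int
format_open_config(char *buf, size_t len, const char *ext_path, const char *tiered_fragment)
{
    int n;

    assert(buf != NULL && ext_path != NULL && tiered_fragment != NULL);
    n = snprintf(buf, len, "create,extensions=[%s],%s", ext_path, tiered_fragment);

    /* Treat truncation as an error rather than opening with a partial string. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The composed string would be passed as the configuration argument to wiredtiger_open. Keeping the extension path and the tiered storage fragment as separate inputs lets the same fragment be reused unchanged when the library is installed in a different location.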