The interface implemented by applications to provide custom compression.
Public Attributes
int(* compress)(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len, size_t *result_lenp, int *compression_failed)
Callback to compress a chunk of data.
int(* compress_raw)(WT_COMPRESSOR *compressor, WT_SESSION *session, size_t page_max, int split_pct, size_t extra, uint8_t *src, uint32_t *offsets, uint32_t slots, uint8_t *dst, size_t dst_len, int final, size_t *result_lenp, uint32_t *result_slotsp)
Callback to compress a list of byte strings.
int(* decompress)(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len, size_t *result_lenp)
Callback to decompress a chunk of data.
int(* pre_size)(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, size_t *result_lenp)
Callback to size a destination buffer for compression.
int(* terminate)(WT_COMPRESSOR *compressor, WT_SESSION *session)
If non-NULL, a callback performed when the database is closed.
Compressors must implement the WT_COMPRESSOR interface: the WT_COMPRESSOR::compress and WT_COMPRESSOR::decompress callbacks must be specified, while WT_COMPRESSOR::compress_raw, WT_COMPRESSOR::pre_size and WT_COMPRESSOR::terminate are optional and may be NULL. To build your own compressor, use one of the compressors in ext/compressors as a template: ext/nop_compress is a simple compressor that passes data through unchanged, and is a reasonable starting point.
Applications register their implementation with WiredTiger by calling WT_CONNECTION::add_compressor.
int(* WT_COMPRESSOR::compress)(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len, size_t *result_lenp, int *compression_failed) |
Callback to compress a chunk of data.
WT_COMPRESSOR::compress takes a source buffer and a destination buffer, by default of the same size. If the callback can compress the source into fewer bytes in the destination, it does so, sets compression_failed to 0 and returns 0. If compression does not produce a smaller result, the callback sets compression_failed to 1 and returns 0. If any other error occurs, it returns an errno or WiredTiger error code.
On entry, src will point to memory, with the length of the memory in src_len. After successful completion, the callback should return 0 and set result_lenp to the number of bytes required for the compressed representation.
On entry, dst points to the destination buffer with a length of dst_len. If the WT_COMPRESSOR::pre_size method is specified, the destination buffer will be at least the size returned by that method; otherwise, the destination buffer will be at least as large as src_len.
If compression would not shrink the data or the dst buffer is not large enough to hold the compressed data, the callback should set compression_failed to a non-zero value and return 0.
[in] | src | the data to compress |
[in] | src_len | the length of the data to compress |
[in] | dst | the destination buffer |
[in] | dst_len | the length of the destination buffer |
[out] | result_lenp | the length of the compressed data |
[out] | compression_failed | non-zero if compression did not decrease the length of the data (compression may not have completed) |
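The contract above can be illustrated with a toy run-length encoder. This is only a sketch: the name rle_compress and the (count, byte) pair format are hypothetical, and the two typedefs are stand-ins for the handle types that real code gets from wiredtiger.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the opaque handle types; real code includes <wiredtiger.h>. */
typedef struct WT_COMPRESSOR WT_COMPRESSOR;
typedef struct WT_SESSION WT_SESSION;

/*
 * rle_compress --
 *     Hypothetical run-length encoder: each run of equal bytes becomes a
 *     (count, byte) pair, with count in the range 1..255.
 */
static int
rle_compress(WT_COMPRESSOR *compressor, WT_SESSION *session,
    uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len,
    size_t *result_lenp, int *compression_failed)
{
    size_t in, out, run;

    (void)compressor;
    (void)session;

    /* An empty source cannot be made smaller. */
    if (src_len == 0) {
        *compression_failed = 1;
        return (0);
    }

    for (in = 0, out = 0; in < src_len; in += run) {
        /* Measure the run starting at src[in], capped at 255 bytes. */
        for (run = 1; run < 255 && in + run < src_len &&
            src[in + run] == src[in]; ++run)
            ;

        /* Not an error: the pair would overflow dst or not shrink the data. */
        if (out + 2 > dst_len || out + 2 >= src_len) {
            *compression_failed = 1;
            return (0);
        }
        dst[out++] = (uint8_t)run;
        dst[out++] = src[in];
    }

    *result_lenp = out;
    *compression_failed = 0;
    return (0);
}
```

Note that both "did not shrink" and "did not fit" are reported through compression_failed with a return value of 0; a non-zero return is reserved for real errors.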
int(* WT_COMPRESSOR::compress_raw)(WT_COMPRESSOR *compressor, WT_SESSION *session, size_t page_max, int split_pct, size_t extra, uint8_t *src, uint32_t *offsets, uint32_t slots, uint8_t *dst, size_t dst_len, int final, size_t *result_lenp, uint32_t *result_slotsp) |
Callback to compress a list of byte strings.
WT_COMPRESSOR::compress_raw gives applications fine-grained control over disk block size when writing row-store or variable-length column-store pages. Where this level of control is not required by the underlying storage device, set the WT_COMPRESSOR::compress_raw callback to NULL and WiredTiger will internally split each page into blocks, each block then compressed by WT_COMPRESSOR::compress.
WT_COMPRESSOR::compress_raw takes a source buffer and an array of 0-based offsets of byte strings in that buffer. The callback then encodes none, some or all of the byte strings and copies the encoded representation into a destination buffer. The callback returns the number of byte strings encoded and the bytes needed for the encoded representation. The encoded representation has header information prepended and is written as a block to the underlying file object.
On entry, page_max is the configured maximum size for objects of this type. (This value is provided for convenience, and will be either the internal_page_max or leaf_page_max value specified to WT_SESSION::create when the object was created.)
On entry, split_pct is the configured Btree page split size for this object. (This value is provided for convenience, and will be the split_pct value specified to WT_SESSION::create when the object was created.)
On entry, extra is a count of additional bytes that will be added to the encoded representation before it is written. In other words, if the target write size is 8KB, the returned encoded representation should be less than or equal to (8KB - extra). The method does not need to skip bytes in the destination buffer based on extra; it should only use extra to decide how many bytes to store into the destination buffer for its ideal block size.
On entry, src points to the source buffer; offsets is an array of slots 0-based offsets into src, where each offset is the start of a byte string, except for the last offset, which is the offset of the first byte past the end of the last byte string. (In other words, offsets[0] will be 0, the offset of the first byte of the first byte string in src, and offsets[slots] is the total length of all of the byte strings in the src buffer.)
On entry, dst points to the destination buffer with a length of dst_len. If the WT_COMPRESSOR::pre_size method is specified, the destination buffer will be at least the size returned by that method; otherwise, the destination buffer will be at least the maximum size for the page being written (that is, when writing a row-store leaf page, the destination buffer will be at least as large as the leaf_page_max configuration value).
After successful completion, the callback should return 0, and set result_slotsp to the number of byte strings encoded and result_lenp to the bytes needed for the encoded representation.
There is no requirement that the callback encode any or all of the byte strings passed by WiredTiger. If the callback does not encode any of the byte strings and compression should not be retried, the callback should set result_slotsp to 0.
If the callback does not encode any of the byte strings and compression should be retried with additional byte strings, the callback must return EAGAIN. In that case, WiredTiger will accumulate more rows and repeat the call.
If there are no more rows to accumulate or the callback indicates that it cannot be retried, WiredTiger writes the remaining rows using WT_COMPRESSOR::compress.
On entry, final is zero if there are more rows to be written as part of this page (if there will be additional data provided to the callback), and non-zero if there are no more rows to be written as part of this page. If final is set and the callback fails to encode any rows, WiredTiger writes the remaining rows without further calls to the callback. If final is set and the callback encodes any number of rows, WiredTiger continues to call the callback until all of the rows are encoded or the callback fails to encode any rows.
The WT_COMPRESSOR::compress_raw callback is intended for applications wanting to create disk blocks in specific sizes. WT_COMPRESSOR::compress_raw is not a replacement for WT_COMPRESSOR::compress: objects which WT_COMPRESSOR::compress_raw cannot handle (for example, overflow key or value items), or which WT_COMPRESSOR::compress_raw chooses not to compress for any reason (for example, if the callback chooses not to compress a small number of rows, but the page being written has no more rows to accumulate), will be passed to WT_COMPRESSOR::compress.
The WT_COMPRESSOR::compress_raw callback is only called for objects where it is applicable, that is, for row-store and variable-length column-store objects where neither row-store key prefix compression nor row-store and variable-length column-store dictionary compression is configured. When WT_COMPRESSOR::compress_raw is not applicable, the WT_COMPRESSOR::compress callback is used instead.
[in] | page_max | the configured maximum page size for this object |
[in] | split_pct | the configured page split size for this object |
[in] | extra | the count of the additional bytes |
[in] | src | the data to compress |
[in] | offsets | the byte offsets of the byte strings in src |
[in] | slots | the number of entries in offsets |
[in] | dst | the destination buffer |
[in] | dst_len | the length of the destination buffer |
[in] | final | non-zero if there are no more rows to accumulate |
[out] | result_lenp | the length of the compressed data |
[out] | result_slotsp | the number of byte offsets taken |
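The slot-selection logic described above can be sketched as follows. The example is a pass-through: it copies the chosen byte strings instead of compressing them, so that only the use of page_max, extra, offsets, final and EAGAIN is on display. The name store_compress_raw, the policy of asking for more rows whenever everything fits with room to spare, and the stand-in typedefs are all assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the opaque handle types; real code includes <wiredtiger.h>. */
typedef struct WT_COMPRESSOR WT_COMPRESSOR;
typedef struct WT_SESSION WT_SESSION;

/*
 * store_compress_raw --
 *     Hypothetical pass-through: take as many whole byte strings as fit in
 *     the target block size and copy them to dst unchanged.
 */
static int
store_compress_raw(WT_COMPRESSOR *compressor, WT_SESSION *session,
    size_t page_max, int split_pct, size_t extra,
    uint8_t *src, uint32_t *offsets, uint32_t slots,
    uint8_t *dst, size_t dst_len, int final,
    size_t *result_lenp, uint32_t *result_slotsp)
{
    size_t target;
    uint32_t take;

    (void)compressor;
    (void)session;
    (void)split_pct;

    /* Target size: page_max less the bytes WiredTiger adds before writing. */
    target = page_max > extra ? page_max - extra : 0;
    if (target > dst_len)
        target = dst_len;

    /*
     * Per the description above, offsets[n] is the total length of the
     * first n byte strings, up to and including offsets[slots].
     */
    for (take = 0; take < slots && offsets[take + 1] <= target; ++take)
        ;

    /* Everything fits with room to spare and more rows are coming: retry. */
    if (!final && take == slots && offsets[slots] < target)
        return (EAGAIN);

    /* A real callback would compress here rather than copy. */
    memcpy(dst, src, offsets[take]);
    *result_lenp = offsets[take];
    *result_slotsp = take;  /* 0 means nothing was encoded; do not retry. */
    return (0);
}
```

With page_max of 32, extra of 8 and three 10-byte strings, the target is 24 bytes, so the sketch takes the first two strings and leaves the third for a later call.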
int(* WT_COMPRESSOR::decompress)(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len, size_t *result_lenp) |
Callback to decompress a chunk of data.
WT_COMPRESSOR::decompress takes a source buffer and a destination buffer. The roles of the buffers are reversed relative to WT_COMPRESSOR::compress: the source buffer is the compressed value, and the destination buffer is sized to be the original size. If the callback successfully decompresses the source buffer to the destination buffer, it returns 0. If an error occurs, it returns an errno or WiredTiger error code.
The source buffer that WT_COMPRESSOR::decompress takes may have a size that is rounded up from the size originally produced by WT_COMPRESSOR::compress, with the remainder of the buffer set to zeroes. Most compressors do not care about this difference if the size to be decompressed can be implicitly discovered from the compressed data. If your compressor cares, you may need to allocate space for, and store, the actual size in the compressed buffer. See the source code for the included snappy compressor for an example.
On entry, src will point to memory, with the length of the memory in src_len. After successful completion, the callback should return 0 and set result_lenp to the number of bytes required for the decompressed representation.
If the dst buffer is not big enough to hold the decompressed data, the callback should return an error.
[in] | src | the data to decompress |
[in] | src_len | the length of the data to decompress |
[in] | dst | the destination buffer |
[in] | dst_len | the length of the destination buffer |
[out] | result_lenp | the length of the decompressed data |
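As an illustration of a format whose length can be discovered implicitly, the sketch below decodes a hypothetical run-length format made of (count, byte) pairs in which the encoder never writes a zero count. A zero count can therefore only be the zero padding described above. The name rle_decompress, the pair format, the choice of ENOMEM as the error code and the stand-in typedefs are assumptions, not part of WiredTiger.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the opaque handle types; real code includes <wiredtiger.h>. */
typedef struct WT_COMPRESSOR WT_COMPRESSOR;
typedef struct WT_SESSION WT_SESSION;

/*
 * rle_decompress --
 *     Hypothetical decoder for (count, byte) pairs; stops at the first zero
 *     count, which marks the start of zero padding in a rounded-up buffer.
 */
static int
rle_decompress(WT_COMPRESSOR *compressor, WT_SESSION *session,
    uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len,
    size_t *result_lenp)
{
    size_t in, out, run;

    (void)compressor;
    (void)session;

    for (in = 0, out = 0; in + 1 < src_len; in += 2) {
        run = src[in];

        /* The encoder never emits a zero count, so this is padding. */
        if (run == 0)
            break;

        /* Destination too small: an error (ENOMEM is an arbitrary choice). */
        if (run > dst_len - out)
            return (ENOMEM);

        memset(dst + out, src[in + 1], run);
        out += run;
    }

    *result_lenp = out;
    return (0);
}
```

A format without such a natural terminator would instead need to store the compressed length explicitly at the start of the buffer.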
int(* WT_COMPRESSOR::pre_size)(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, size_t *result_lenp) |
Callback to size a destination buffer for compression.
WT_COMPRESSOR::pre_size is an optional callback that, given the source buffer and size, produces the size of the destination buffer to be given to WT_COMPRESSOR::compress. This is useful for compressors that assume that the output buffer is sized for the worst case and thus no overrun checks are made. If your compressor works like this, WT_COMPRESSOR::pre_size will need to be defined. See the source code for the snappy compressor for an example. However, if your compressor detects and avoids overruns against its target buffer, you will not need to define WT_COMPRESSOR::pre_size. When WT_COMPRESSOR::pre_size is set to NULL, the destination buffer is sized the same as the source buffer. This is always sufficient, since a compression result that is larger than the source buffer is discarded by WiredTiger.
If not NULL, this callback is called before each call to WT_COMPRESSOR::compress to determine the size of the destination buffer to provide. If the callback is NULL, the destination buffer will be the same size as the source buffer.
The callback should set result_lenp to a suitable buffer size for compression, typically the maximum length required by WT_COMPRESSOR::compress.
This callback function is for compressors that require an output buffer larger than the source buffer (for example, that do not check for buffer overflow during compression).
[in] | src | the data to compress |
[in] | src_len | the length of the data to compress |
[out] | result_lenp | the required destination buffer size |
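A pre_size callback is usually a one-line worst-case bound. The sketch below assumes a hypothetical run-length format of (count, byte) pairs, where input with no repeated bytes doubles in size; the name rle_pre_size and the stand-in typedefs are illustrative only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the opaque handle types; real code includes <wiredtiger.h>. */
typedef struct WT_COMPRESSOR WT_COMPRESSOR;
typedef struct WT_SESSION WT_SESSION;

/*
 * rle_pre_size --
 *     Hypothetical worst-case bound for a (count, byte) run-length format.
 */
static int
rle_pre_size(WT_COMPRESSOR *compressor, WT_SESSION *session,
    uint8_t *src, size_t src_len, size_t *result_lenp)
{
    (void)compressor;
    (void)session;
    (void)src;  /* The bound depends only on the length. */

    /* Guard against overflow when doubling. */
    if (src_len > SIZE_MAX / 2)
        return (EINVAL);

    /* Worst case: no adjacent bytes match, so every byte becomes a pair. */
    *result_lenp = src_len * 2;
    return (0);
}
```

With this bound in place, the matching compress callback could write pairs without checking for overrun, which is the situation this callback is meant for.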
int(* WT_COMPRESSOR::terminate)(WT_COMPRESSOR *compressor, WT_SESSION *session) |
If non-NULL, a callback performed when the database is closed.
The WT_COMPRESSOR::terminate callback is intended to allow cleanup; the handle will not be subsequently accessed by WiredTiger.
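One common way to use terminate is to embed the WT_COMPRESSOR handle as the first member of a private structure and free that structure in the callback. The sketch below assumes this pattern; MY_COMPRESSOR, my_terminate and my_compressor_new are hypothetical names, and the WT_COMPRESSOR definition here is a reduced stand-in that declares only the terminate member (the real structure in wiredtiger.h carries all five callbacks).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced stand-ins; real code includes <wiredtiger.h> instead. */
typedef struct WT_SESSION WT_SESSION;
typedef struct WT_COMPRESSOR WT_COMPRESSOR;
struct WT_COMPRESSOR {
    int (*terminate)(WT_COMPRESSOR *compressor, WT_SESSION *session);
};

/* Hypothetical private state wrapped around the handle. */
typedef struct {
    WT_COMPRESSOR compressor;  /* Must be first so the handle casts back. */
    uint8_t *scratch;          /* Example of a resource owned by the compressor. */
} MY_COMPRESSOR;

/*
 * my_terminate --
 *     Release everything owned by the compressor, including the handle.
 */
static int
my_terminate(WT_COMPRESSOR *compressor, WT_SESSION *session)
{
    MY_COMPRESSOR *my = (MY_COMPRESSOR *)compressor;

    (void)session;

    free(my->scratch);
    free(my);  /* Safe: the handle is not accessed again after this call. */
    return (0);
}

/*
 * my_compressor_new --
 *     Allocate the private structure and return the embedded handle.
 */
static WT_COMPRESSOR *
my_compressor_new(size_t scratch_len)
{
    MY_COMPRESSOR *my;

    if ((my = calloc(1, sizeof(*my))) == NULL)
        return (NULL);
    if ((my->scratch = malloc(scratch_len)) == NULL) {
        free(my);
        return (NULL);
    }
    my->compressor.terminate = my_terminate;
    return (&my->compressor);
}
```

The handle returned by my_compressor_new is what an application would pass to WT_CONNECTION::add_compressor after filling in the remaining callbacks.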