Version 2.8.0
WT_COMPRESSOR Struct Reference

The interface implemented by applications to provide custom compression.

Public Attributes

int(* compress )(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len, size_t *result_lenp, int *compression_failed)
    Callback to compress a chunk of data.

int(* compress_raw )(WT_COMPRESSOR *compressor, WT_SESSION *session, size_t page_max, int split_pct, size_t extra, uint8_t *src, uint32_t *offsets, uint32_t slots, uint8_t *dst, size_t dst_len, int final, size_t *result_lenp, uint32_t *result_slotsp)
    Callback to compress a list of byte strings.

int(* decompress )(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len, size_t *result_lenp)
    Callback to decompress a chunk of data.

int(* pre_size )(WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, size_t *result_lenp)
    Callback to size a destination buffer for compression.

int(* terminate )(WT_COMPRESSOR *compressor, WT_SESSION *session)
    If non-NULL, a callback performed when the database is closed.


Detailed Description

The interface implemented by applications to provide custom compression.

Compressors must implement the WT_COMPRESSOR interface: the WT_COMPRESSOR::compress and WT_COMPRESSOR::decompress callbacks must be specified, and WT_COMPRESSOR::pre_size is optional. To build your own compressor, use one of the compressors in ext/compressors as a template: ext/nop_compress is a simple compressor that passes through data unchanged, and is a reasonable starting point.

Applications register their implementation with WiredTiger by calling WT_CONNECTION::add_compressor.

/* Local compressor structure. */
typedef struct {
    WT_COMPRESSOR compressor;   /* Must come first */
    WT_EXTENSION_API *wt_api;   /* Extension API */
    unsigned long nop_calls;    /* Count of calls */
} NOP_COMPRESSOR;

/*
 * wiredtiger_extension_init --
 *     A simple shared library compression example.
 */
int
wiredtiger_extension_init(WT_CONNECTION *connection, WT_CONFIG_ARG *config)
{
    NOP_COMPRESSOR *nop_compressor;

    (void)config;               /* Unused parameters */

    /*
     * Allocate a local compressor structure, with a WT_COMPRESSOR structure
     * as the first field, allowing us to treat references to either type of
     * structure as a reference to the other type.
     *
     * Heap memory (not static), because it can support multiple databases.
     */
    if ((nop_compressor = calloc(1, sizeof(NOP_COMPRESSOR))) == NULL)
        return (errno);

    nop_compressor->compressor.compress = nop_compress;
    nop_compressor->compressor.compress_raw = NULL;
    nop_compressor->compressor.decompress = nop_decompress;
    nop_compressor->compressor.pre_size = nop_pre_size;
    nop_compressor->compressor.terminate = nop_terminate;
    nop_compressor->wt_api = connection->get_extension_api(connection);

    /* Load the compressor */
    return (connection->add_compressor(
        connection, "nop", (WT_COMPRESSOR *)nop_compressor, NULL));
}
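
The "must come first" requirement on the embedded WT_COMPRESSOR can be illustrated without WiredTiger at all. The following is a minimal sketch using hypothetical stand-in types (DEMO_IFACE and DEMO_WRAPPER are not part of any WiredTiger API): because the interface structure is the first member, a pointer to it is also a valid pointer to the enclosing structure, so each callback can recover its private state with a cast.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a callback interface such as WT_COMPRESSOR. */
typedef struct demo_iface {
    int (*call)(struct demo_iface *iface);
} DEMO_IFACE;

/* Wrapper with private state; the interface must be the first member. */
typedef struct {
    DEMO_IFACE iface;       /* Must come first */
    unsigned long calls;    /* Private state */
} DEMO_WRAPPER;

/* Callback: recover the wrapper from the interface pointer it was given. */
static int
demo_call(DEMO_IFACE *iface)
{
    DEMO_WRAPPER *wrapper = (DEMO_WRAPPER *)iface;

    ++wrapper->calls;
    return (0);
}

/* Allocate a wrapper on the heap and hand it out as an interface pointer. */
static DEMO_IFACE *
demo_new(void)
{
    DEMO_WRAPPER *wrapper;

    if ((wrapper = calloc(1, sizeof(*wrapper))) == NULL)
        return (NULL);
    wrapper->iface.call = demo_call;
    return (&wrapper->iface);
}
```

A caller holding only the DEMO_IFACE pointer can invoke iface->call(iface) repeatedly, and the wrapper's private counter advances on each call; freeing the interface pointer releases the whole wrapper.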

Member Data Documentation

int(* WT_COMPRESSOR::compress) (WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len, size_t *result_lenp, int *compression_failed)

Callback to compress a chunk of data.

WT_COMPRESSOR::compress takes a source buffer and a destination buffer, by default of the same size. If the callback can compress the buffer to a smaller size in the destination, it does so, sets the compression_failed return to 0 and returns 0. If compression does not produce a smaller result, the callback sets the compression_failed return to 1 and returns 0. If another error occurs, it returns an errno or WiredTiger error code.

On entry, src will point to memory, with the length of the memory in src_len. After successful completion, the callback should return 0 and set result_lenp to the number of bytes required for the compressed representation.

On entry, dst points to the destination buffer with a length of dst_len. If the WT_COMPRESSOR::pre_size method is specified, the destination buffer will be at least the size returned by that method; otherwise, the destination buffer will be at least as large as the length of the data to compress.

If compression would not shrink the data or the dst buffer is not large enough to hold the compressed data, the callback should set compression_failed to a non-zero value and return 0.

Parameters
    [in]  src                 the data to compress
    [in]  src_len             the length of the data to compress
    [in]  dst                 the destination buffer
    [in]  dst_len             the length of the destination buffer
    [out] result_lenp         the length of the compressed data
    [out] compression_failed  non-zero if compression did not decrease the length of the data (compression may not have completed)
Returns
    zero for success, non-zero to indicate an error.

/*
 * nop_compress --
 *     A simple compression example that passes data through unchanged.
 */
static int
nop_compress(WT_COMPRESSOR *compressor, WT_SESSION *session,
    uint8_t *src, size_t src_len,
    uint8_t *dst, size_t dst_len,
    size_t *result_lenp, int *compression_failed)
{
    NOP_COMPRESSOR *nop_compressor = (NOP_COMPRESSOR *)compressor;

    (void)session;                  /* Unused parameters */

    ++nop_compressor->nop_calls;    /* Call count */

    *compression_failed = 0;
    if (dst_len < src_len) {
        *compression_failed = 1;
        return (0);
    }

    memcpy(dst, src, src_len);
    *result_lenp = src_len;

    return (0);
}

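
The compression_failed protocol is easier to see with an encoder that can actually shrink data. The following is a minimal sketch, not taken from WiredTiger: rle_compress_chunk is a hypothetical helper that emits (count, byte) pairs and follows the contract described above, returning 0 in every non-error case and reporting through compression_failed whether the result is usable. Wrapping it in a real WT_COMPRESSOR::compress callback would only add the compressor and session arguments.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * rle_compress_chunk --
 *     Hypothetical run-length encoder that emits (count, byte) pairs.
 */
static int
rle_compress_chunk(const uint8_t *src, size_t src_len,
    uint8_t *dst, size_t dst_len,
    size_t *result_lenp, int *compression_failed)
{
    size_t in, out, run;

    /* An empty input cannot be made smaller. */
    if (src_len == 0) {
        *compression_failed = 1;
        return (0);
    }

    *compression_failed = 0;
    for (in = 0, out = 0; in < src_len; in += run) {
        /* Measure the run of identical bytes, capped at 255. */
        for (run = 1; run < 255 && in + run < src_len &&
            src[in + run] == src[in]; ++run)
            ;

        /* Give up if the pair won't fit or the output isn't smaller. */
        if (out + 2 > dst_len || out + 2 >= src_len) {
            *compression_failed = 1;
            return (0);
        }
        dst[out++] = (uint8_t)run;
        dst[out++] = src[in];
    }

    *result_lenp = out;
    return (0);
}
```

Note that the function never returns an error for incompressible input: that case is reported through compression_failed, leaving the return value free for genuine failures.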
int(* WT_COMPRESSOR::compress_raw) (WT_COMPRESSOR *compressor, WT_SESSION *session, size_t page_max, int split_pct, size_t extra, uint8_t *src, uint32_t *offsets, uint32_t slots, uint8_t *dst, size_t dst_len, int final, size_t *result_lenp, uint32_t *result_slotsp)

Callback to compress a list of byte strings.

WT_COMPRESSOR::compress_raw gives applications fine-grained control over disk block size when writing row-store or variable-length column-store pages. Where this level of control is not required by the underlying storage device, set the WT_COMPRESSOR::compress_raw callback to NULL and WiredTiger will internally split each page into blocks, each block then compressed by WT_COMPRESSOR::compress.

WT_COMPRESSOR::compress_raw takes a source buffer and an array of 0-based offsets of byte strings in that buffer. The callback then encodes none, some or all of the byte strings and copies the encoded representation into a destination buffer. The callback returns the number of byte strings encoded and the bytes needed for the encoded representation. The encoded representation has header information prepended and is written as a block to the underlying file object.

On entry, page_max is the configured maximum size for objects of this type. (This value is provided for convenience, and will be either the internal_page_max or leaf_page_max value specified to WT_SESSION::create when the object was created.)

On entry, split_pct is the configured Btree page split size for this object. (This value is provided for convenience, and will be the split_pct value specified to WT_SESSION::create when the object was created.)

On entry, extra is a count of additional bytes that will be added to the encoded representation before it is written. In other words, if the target write size is 8KB, the returned encoded representation should be less than or equal to (8KB - extra). The method does not need to skip bytes in the destination buffer based on extra; it should only use extra to decide how many bytes to store into the destination buffer for its ideal block size.

On entry, src points to the source buffer; offsets is an array of slots 0-based offsets into src, where each offset is the start of a byte string, except for the last offset, which is the offset of the first byte past the end of the last byte string. (In other words, offsets[0] will be 0, the offset of the first byte of the first byte string in src, and offsets[slots] is the total length of all of the byte strings in the src buffer.)

On entry, dst points to the destination buffer with a length of dst_len. If the WT_COMPRESSOR::pre_size method is specified, the destination buffer will be at least the size returned by that method; otherwise, the destination buffer will be at least as large as the length of the data to compress.

After successful completion, the callback should return 0, and set result_slotsp to the number of byte strings encoded and result_lenp to the bytes needed for the encoded representation.

There is no requirement that the callback encode any or all of the byte strings passed by WiredTiger. If the callback does not encode any of the byte strings and compression should not be retried, the callback should set result_slotsp to 0.

If the callback does not encode any of the byte strings and compression should be retried with additional byte strings, the callback must return EAGAIN. In that case, WiredTiger will accumulate more rows and repeat the call.

If there are no more rows to accumulate or the callback indicates that it cannot be retried, WiredTiger writes the remaining rows using WT_COMPRESSOR::compress.

On entry, final is zero if there are more rows to be written as part of this page (if there will be additional data provided to the callback), and non-zero if there are no more rows to be written as part of this page. If final is set and the callback fails to encode any rows, WiredTiger writes the remaining rows without further calls to the callback. If final is set and the callback encodes any number of rows, WiredTiger continues to call the callback until all of the rows are encoded or the callback fails to encode any rows.

The WT_COMPRESSOR::compress_raw callback is intended for applications wanting to create disk blocks in specific sizes. WT_COMPRESSOR::compress_raw is not a replacement for WT_COMPRESSOR::compress: objects which WT_COMPRESSOR::compress_raw cannot handle (for example, overflow key or value items), or which WT_COMPRESSOR::compress_raw chooses not to compress for any reason (for example, if the WT_COMPRESSOR::compress_raw callback chooses not to compress a small number of rows, but the page being written has no more rows to accumulate), will be passed to WT_COMPRESSOR::compress.

The WT_COMPRESSOR::compress_raw callback is only called for objects where it is applicable, that is, for row-store and variable-length column-store objects where neither row-store key prefix compression nor row-store and variable-length column-store dictionary compression is configured. When WT_COMPRESSOR::compress_raw is not applicable, the WT_COMPRESSOR::compress callback is used instead.

Parameters
    [in]  page_max       the configured maximum page size for this object
    [in]  split_pct      the configured page split size for this object
    [in]  extra          the count of the additional bytes
    [in]  src            the data to compress
    [in]  offsets        the byte offsets of the byte strings in src
    [in]  slots          the number of entries in offsets
    [in]  dst            the destination buffer
    [in]  dst_len        the length of the destination buffer
    [in]  final          non-zero if there are no more rows to accumulate
    [out] result_lenp    the length of the compressed data
    [out] result_slotsp  the number of byte offsets taken
Returns
    zero for success, non-zero to indicate an error.

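
The interaction between offsets, extra, final and the EAGAIN return can be sketched as a small decision function. This is an illustration under stated assumptions, not WiredTiger code: raw_choose_slots and its target argument are hypothetical, the offsets array is assumed to hold slots + 1 entries (so that offsets[slots] is valid, as the parenthetical above describes), and sizes are the uncompressed lengths, where a real callback would measure its encoded output instead.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * raw_choose_slots --
 *     Hypothetical helper: decide how many leading byte strings to take
 *     so that their combined length is at most (target - extra).
 *
 * Assumes offsets has slots + 1 entries, so offsets[n] is the combined
 * length of the first n byte strings.
 */
static int
raw_choose_slots(const uint32_t *offsets, uint32_t slots,
    size_t target, size_t extra, int final, uint32_t *result_slotsp)
{
    size_t budget;
    uint32_t n;

    *result_slotsp = 0;
    if (slots == 0 || target <= extra)
        return (0);             /* Nothing usable; don't retry. */
    budget = target - extra;

    /* Too little data for a full block: ask for more unless final. */
    if (offsets[slots] < budget) {
        if (!final)
            return (EAGAIN);
        *result_slotsp = slots;
        return (0);
    }

    /* Take the longest prefix of byte strings that fits the budget. */
    for (n = slots; n > 0 && offsets[n] > budget; --n)
        ;
    *result_slotsp = n;         /* 0 means: take none, don't retry. */
    return (0);
}
```

Returning 0 with *result_slotsp set to 0 corresponds to the "should not be retried" case, in which WiredTiger falls back to WT_COMPRESSOR::compress for the remaining rows.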
int(* WT_COMPRESSOR::decompress) (WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len, size_t *result_lenp)

Callback to decompress a chunk of data.

WT_COMPRESSOR::decompress takes a source buffer and a destination buffer. The roles are reversed from WT_COMPRESSOR::compress: the source buffer holds the compressed value, and the destination buffer is sized to hold the original data. If the callback successfully decompresses the source buffer into the destination buffer, it returns 0. If an error occurs, it returns an errno or WiredTiger error code.

The source buffer passed to WT_COMPRESSOR::decompress may have a size that is rounded up from the size originally produced by WT_COMPRESSOR::compress, with the remainder of the buffer set to zeroes. Most compressors do not care about this difference if the size to be decompressed can be implicitly discovered from the compressed data. If your compressor cares, you may need to allocate space for, and store, the actual size in the compressed buffer. See the source code for the included snappy compressor for an example.

On entry, src will point to memory, with the length of the memory in src_len. After successful completion, the callback should return 0 and set result_lenp to the number of bytes required for the decompressed representation.

If the dst buffer is not big enough to hold the decompressed data, the callback should return an error.

Parameters
    [in]  src          the data to decompress
    [in]  src_len      the length of the data to decompress
    [in]  dst          the destination buffer
    [in]  dst_len      the length of the destination buffer
    [out] result_lenp  the length of the decompressed data
Returns
    zero for success, non-zero to indicate an error.

/*
 * nop_decompress --
 *     A simple decompression example that passes data through unchanged.
 */
static int
nop_decompress(WT_COMPRESSOR *compressor, WT_SESSION *session,
    uint8_t *src, size_t src_len,
    uint8_t *dst, size_t dst_len,
    size_t *result_lenp)
{
    NOP_COMPRESSOR *nop_compressor = (NOP_COMPRESSOR *)compressor;

    (void)session;                  /* Unused parameters */
    (void)src_len;

    ++nop_compressor->nop_calls;    /* Call count */

    /*
     * The destination length is the number of uncompressed bytes we're
     * expected to return.
     */
    memcpy(dst, src, dst_len);
    *result_lenp = dst_len;
    return (0);
}

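
When a compressor cannot infer the compressed length from the data itself, one way to cope with a rounded-up, zero-padded source buffer is to store the exact length at the front of the compressed block. The following is a minimal sketch of that idea; prefix_store, prefix_load and LEN_PREFIX are hypothetical names, the length is stored in native byte order, and EINVAL stands in for whatever error code an implementation would choose.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LEN_PREFIX  sizeof(uint64_t)

/*
 * prefix_store --
 *     Record the payload length in the first LEN_PREFIX bytes of dst.
 *     The payload itself is expected to start at dst + LEN_PREFIX.
 */
static int
prefix_store(uint8_t *dst, size_t dst_len, size_t payload_len)
{
    uint64_t stored = (uint64_t)payload_len;

    if (dst_len < LEN_PREFIX || payload_len > dst_len - LEN_PREFIX)
        return (EINVAL);
    memcpy(dst, &stored, LEN_PREFIX);
    return (0);
}

/*
 * prefix_load --
 *     Recover the payload position and exact length from a source buffer
 *     whose length may have been rounded up and zero-padded.
 */
static int
prefix_load(const uint8_t *src, size_t src_len,
    const uint8_t **payloadp, size_t *payload_lenp)
{
    uint64_t stored;

    if (src_len < LEN_PREFIX)
        return (EINVAL);
    memcpy(&stored, src, LEN_PREFIX);
    if (stored > src_len - LEN_PREFIX)
        return (EINVAL);        /* Corrupt or truncated */
    *payloadp = src + LEN_PREFIX;
    *payload_lenp = (size_t)stored;
    return (0);
}
```

With this layout the compress side reports LEN_PREFIX plus the payload length in result_lenp, and the decompress side ignores any padding beyond the stored length.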
int(* WT_COMPRESSOR::pre_size) (WT_COMPRESSOR *compressor, WT_SESSION *session, uint8_t *src, size_t src_len, size_t *result_lenp)

Callback to size a destination buffer for compression.

WT_COMPRESSOR::pre_size is an optional callback that, given the source buffer and size, produces the size of the destination buffer to be given to WT_COMPRESSOR::compress. This is useful for compressors that assume that the output buffer is sized for the worst case and thus make no overrun checks. If your compressor works like this, WT_COMPRESSOR::pre_size will need to be defined. See the source code for the snappy compressor for an example.

However, if your compressor detects and avoids overruns against its target buffer, you will not need to define WT_COMPRESSOR::pre_size. When WT_COMPRESSOR::pre_size is set to NULL, the destination buffer is sized the same as the source buffer. This is always sufficient, since a compression result that is larger than the source buffer is discarded by WiredTiger.

If not NULL, this callback is called before each call to WT_COMPRESSOR::compress to determine the size of the destination buffer to provide. If the callback is NULL, the destination buffer will be the same size as the source buffer.

The callback should set result_lenp to a suitable buffer size for compression, typically the maximum length required by WT_COMPRESSOR::compress.

This callback function is for compressors that require an output buffer larger than the source buffer (for example, that do not check for buffer overflow during compression).

Parameters
    [in]  src          the data to compress
    [in]  src_len      the length of the data to compress
    [out] result_lenp  the required destination buffer size
Returns
    zero for success, non-zero to indicate an error.

/*
 * nop_pre_size --
 *     A simple pre-size example that returns the source length.
 */
static int
nop_pre_size(WT_COMPRESSOR *compressor, WT_SESSION *session,
    uint8_t *src, size_t src_len,
    size_t *result_lenp)
{
    NOP_COMPRESSOR *nop_compressor = (NOP_COMPRESSOR *)compressor;

    (void)session;                  /* Unused parameters */
    (void)src;

    ++nop_compressor->nop_calls;    /* Call count */

    *result_lenp = src_len;
    return (0);
}

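
For a compressor that can expand its input, the sizing calculation looks different from the pass-through case. As a hedged illustration, consider a hypothetical encoder that emits one (count, byte) pair per run of identical bytes: input with no repeated bytes doubles in size, so a worst-case sizing helper (pair_rle_pre_size is an invented name) might look like this.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * pair_rle_pre_size --
 *     Hypothetical worst-case sizing for an encoder that emits one
 *     (count, byte) pair per run: input with no repeats doubles in size.
 */
static int
pair_rle_pre_size(size_t src_len, size_t *result_lenp)
{
    /* Refuse lengths whose doubled size would overflow size_t. */
    if (src_len > SIZE_MAX / 2)
        return (EINVAL);
    *result_lenp = src_len * 2;
    return (0);
}
```

A WT_COMPRESSOR::pre_size callback built on this would pass src_len through and ignore src, since the bound depends only on the input length.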
int(* WT_COMPRESSOR::terminate) (WT_COMPRESSOR *compressor, WT_SESSION *session)

If non-NULL, a callback performed when the database is closed.

The WT_COMPRESSOR::terminate callback is intended to allow cleanup; the handle will not be subsequently accessed by WiredTiger.

/*
 * nop_terminate --
 *     WiredTiger no-op compression termination.
 */
static int
nop_terminate(WT_COMPRESSOR *compressor, WT_SESSION *session)
{
    NOP_COMPRESSOR *nop_compressor = (NOP_COMPRESSOR *)compressor;

    (void)session;                  /* Unused parameters */

    ++nop_compressor->nop_calls;    /* Call count */

    /* Free the allocated memory. */
    free(compressor);

    return (0);
}