The WiredTiger distribution includes a tool that can be used to simulate workloads in WiredTiger, in the directory bench/wtperf.
The wtperf utility generally has two phases: a populate phase, which creates a database and populates an object in that database, and a workload phase, which performs some set of operations on the object.
For example, the following configuration uses a single thread to populate a file object with 500,000 records in a 500MB cache. The workload phase consists of 8 threads running for two minutes, all reading from the file.
conn_config="cache_size=500MB"
table_config="type=file"
icount=500000
run_time=120
populate_threads=1
threads=((count=8,reads=1))
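As a rough sketch, the configuration above could be saved to a file and passed to wtperf with the -O option. The file name read-only.wtperf and the run directory WTPERF_RUN are placeholders chosen for this illustration, and the binary path assumes a build inside the WiredTiger source tree.

```shell
# Save the example configuration to a file (the file name is a placeholder).
cat > read-only.wtperf <<'EOF'
conn_config="cache_size=500MB"
table_config="type=file"
icount=500000
run_time=120
populate_threads=1
threads=((count=8,reads=1))
EOF

# Path to the wtperf binary; assumes an in-tree build, adjust as needed.
WTPERF=bench/wtperf/wtperf

# Run only if the binary is present, using a freshly created home directory.
if [ -x "$WTPERF" ]; then
    rm -rf WTPERF_RUN && mkdir WTPERF_RUN
    "$WTPERF" -h WTPERF_RUN -O read-only.wtperf
else
    echo "wtperf not found at $WTPERF; configuration written to read-only.wtperf"
fi
```

The quoted heredoc delimiter ('EOF') keeps the shell from interpreting the quotes and parentheses in the configuration values.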
In most cases, where the workload is the only interesting phase, the populate phase can be performed once and the workload phase run repeatedly (for more information, see the wtperf create configuration variable).
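One way to reuse a populated database, sketched here under some assumptions, is to run the full configuration once and then repeat the run with create=false passed through -o. The configuration file name, the run directory and the binary path are placeholders for this illustration.

```shell
WTPERF=bench/wtperf/wtperf      # assumed in-tree build path, adjust as needed
CONFIG=read-only.wtperf         # placeholder configuration file name
RUN_DIR=WTPERF_RUN              # placeholder database home directory
SKIP_POPULATE="create=false"    # use the existing database instead of populating

if [ -x "$WTPERF" ] && [ -f "$CONFIG" ]; then
    # First run: populate the database, then run the workload.
    rm -rf "$RUN_DIR" && mkdir "$RUN_DIR"
    "$WTPERF" -h "$RUN_DIR" -O "$CONFIG"
    # Later runs: keep the same home directory and skip the populate phase.
    "$WTPERF" -h "$RUN_DIR" -O "$CONFIG" -o "$SKIP_POPULATE"
else
    echo "would run: $WTPERF -h $RUN_DIR -O $CONFIG"
    echo "then:      $WTPERF -h $RUN_DIR -O $CONFIG -o $SKIP_POPULATE"
fi
```

The second invocation must point at the same home directory as the first, since that is where the populated database lives.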
The conn_config configuration supports setting any WiredTiger connection configuration value. This is commonly used to configure statistics with regular reports, to obtain more information from the run:
conn_config="cache_size=20G,statistics=(fast,clear),statistics_log=(wait=600)"
report_interval=5
Note that quoting must be used when passing values to WiredTiger configuration, as opposed to configuring the wtperf utility itself.
The table_config configuration supports setting any WiredTiger object creation configuration value. For example, the above test can be converted to use an LSM store instead of a B+tree store, with additional LSM configuration, by changing table_config to:
table_config="lsm=(chunk_size=5MB),type=lsm,os_cache_dirty_max=16MB"
More complex workloads can be configured by creating more threads doing inserts and updates as well as reads. For example, to configure two inserting threads and two threads doing a mixture of inserts, reads and updates:
threads=((count=2,inserts=1),(count=2,inserts=1,reads=1,updates=1))
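Because the inserts, reads and updates entries are ratios rather than counts, it can help to convert them to percentages when reading a threads configuration. The helper below is a hypothetical illustration (op_mix is not part of wtperf) that does this arithmetic for a single thread group with non-zero ratios.

```shell
# Print the approximate share of each operation type for one thread group.
# Arguments: the inserts, reads and updates ratios from a threads entry.
op_mix() {
    awk -v i="$1" -v r="$2" -v u="$3" 'BEGIN {
        total = i + r + u
        printf "inserts=%.0f%% reads=%.0f%% updates=%.0f%%\n",
            100 * i / total, 100 * r / total, 100 * u / total
    }'
}

# Mixed group from the example above: (count=2,inserts=1,reads=1,updates=1)
op_mix 1 1 1    # prints: inserts=33% reads=33% updates=33%

# A group weighted towards inserts: (count=8,reads=1,inserts=2,updates=1)
op_mix 2 1 1    # prints: inserts=50% reads=25% updates=25%
```

The count entry does not affect the mix; it only sets how many threads run with that mix.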
Example wtperf configuration files can be found in the bench/wtperf/runners/ directory.
There are also a number of command line arguments that can be passed to wtperf:
- -C config
- Specify configuration strings for the wiredtiger_open function. This argument is additive to the conn_config parameter in the configuration file.
- -h directory
- Specify a database home directory. The default is ./WT_TEST.
- -m monitor_directory
- Specify a directory for all monitoring related files. The default is the database home directory.
- -O config_file
- Specify the configuration file to run.
- -o config
- Specify configuration strings for the wtperf program. This argument will override settings in the configuration file.
- -T config
- Specify configuration strings for the WT_SESSION::create function. This argument is additive to the table_config parameter in the configuration file.
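The options above can be combined in a single invocation. The following is a sketch only: the specific values passed to -o, -C and -T are illustrative choices reusing settings mentioned elsewhere on this page, and the binary path assumes a build inside the WiredTiger source tree.

```shell
# Path to the wtperf binary; assumes an in-tree build, adjust as needed.
WTPERF=bench/wtperf/wtperf

# Collect the arguments once so they can be either run or just displayed.
#   -h  database home directory
#   -O  configuration file to run
#   -o  wtperf setting that overrides the configuration file
#   -C  extra wiredtiger_open configuration, added to conn_config
#   -T  extra WT_SESSION::create configuration, added to table_config
set -- \
    -h WTPERF_RUN \
    -O bench/wtperf/runners/medium-btree.wtperf \
    -o run_time=60 \
    -C "statistics=(fast,clear)" \
    -T "os_cache_dirty_max=16MB"

if [ -x "$WTPERF" ]; then
    rm -rf WTPERF_RUN && mkdir WTPERF_RUN
    "$WTPERF" "$@"
else
    echo "would run: $WTPERF $*"
fi
```

Quoting the -C and -T values keeps the shell from interpreting the parentheses in the WiredTiger configuration strings.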
Monitoring wtperf
Like all WiredTiger applications, the wtperf command can be configured with statistics logging.
In addition to statistics logging, wtperf can monitor performance and operation latency times. Monitoring is enabled using the sample_interval configuration. For example, to record information every 10 seconds, set the following on the command line or add it to the wtperf configuration file:
sample_interval=10
Enabling monitoring causes wtperf to create a file named monitor in the database home directory (or another directory as specified using the -m option to wtperf).
The following example shows how to run the medium-btree.wtperf configuration with monitoring enabled, and then generate a graph.
# Change into the WiredTiger directory.
cd wiredtiger
# Configure and build WiredTiger if not already built.
./configure && make
# Remove and re-create the run directory.
rm -rf WTPERF_RUN && mkdir WTPERF_RUN
# Run the medium-btree.wtperf workload, sampling performance every 5 seconds.
bench/wtperf/wtperf \
-h WTPERF_RUN \
-o sample_interval=5 \
-O bench/wtperf/runners/medium-btree.wtperf
Wtperf configuration options
The following is a list of the currently available wtperf configuration options:
- backup_interval (unsigned int, default=0)
- backup the database every interval seconds during the workload phase, 0 to disable
- checkpoint_interval (unsigned int, default=120)
- checkpoint every interval seconds during the workload phase.
- checkpoint_stress_rate (unsigned int, default=0)
- checkpoint every rate operations during the populate phase in the populate thread(s), 0 to disable
- checkpoint_threads (unsigned int, default=0)
- number of checkpoint threads
- conn_config (string, default="create,statistics=(fast),statistics_log=(json,wait=1)")
- connection configuration string
- close_conn (boolean, default=true)
- properly close connection at end of test. Setting to false does not sync data to disk and can result in lost data after test exits.
- compact (boolean, default=false)
- post-populate compact for LSM merging activity
- compression (string, default="none")
- compression extension. Allowed configuration values are: 'none', 'lz4', 'snappy', 'zlib', 'zstd'
- create (boolean, default=true)
- do population phase; false to use existing database
- database_count (unsigned int, default=1)
- number of WiredTiger databases to use. Each database will execute the workload using a separate home directory and complete set of worker threads
- drop_tables (boolean, default=false)
- Whether to drop all tables at the end of the run, and report time taken to do the drop.
- in_memory (boolean, default=false)
- Whether to create the database in-memory.
- icount (unsigned int, default=5000)
- number of records to initially populate. If multiple tables are configured the count is spread evenly across all tables.
- max_idle_table_cycle (unsigned int, default=0)
- Enable regular create and drop of idle tables. Value is the maximum number of seconds a create or drop is allowed before aborting or printing a warning based on max_idle_table_cycle_fatal setting.
- max_idle_table_cycle_fatal (boolean, default=false)
- print warning (false) or abort (true) on max_idle_table_cycle failure.
- index (boolean, default=false)
- Whether to create an index on the value field.
- insert_rmw (boolean, default=false)
- execute a read prior to each insert in workload phase
- key_sz (unsigned int, default=20)
- key size
- log_partial (boolean, default=false)
- perform partial logging on first table only.
- log_like_table (boolean, default=false)
- Append all modification operations to another shared table.
- min_throughput (unsigned int, default=0)
- notify if any throughput measured is less than this amount. Aborts or prints warning based on min_throughput_fatal setting. Requires sample_interval to be configured
- min_throughput_fatal (boolean, default=false)
- print warning (false) or abort (true) on min_throughput failure.
- max_latency (unsigned int, default=0)
- notify if any latency measured exceeds this number of milliseconds. Aborts or prints warning based on max_latency_fatal setting. Requires sample_interval to be configured
- max_latency_fatal (boolean, default=false)
- print warning (false) or abort (true) on max_latency failure.
- pareto (unsigned int, default=0)
- use pareto distribution for random numbers. Zero to disable, otherwise a percentage indicating how aggressive the distribution should be.
- populate_ops_per_txn (unsigned int, default=0)
- number of operations to group into each transaction in the populate phase, zero for auto-commit
- populate_threads (unsigned int, default=1)
- number of populate threads, 1 for bulk load
- pre_load_data (boolean, default=false)
- Scan all data prior to starting the workload phase to warm the cache
- random_range (unsigned int, default=0)
- if non zero choose a value from within this range as the key for insert operations
- random_value (boolean, default=false)
- generate random content for the value
- range_partition (boolean, default=false)
- partition data by range (vs hash)
- read_range (unsigned int, default=0)
- read a sequential range of keys upon each read operation. This value tells us how many keys to read each time, or an upper bound on the number of keys read if read_range_random is set.
- read_range_random (boolean, default=false)
- if doing range reads, select the number of keys to read in a range uniformly at random.
- readonly (boolean, default=false)
- reopen the connection between populate and workload phases in readonly mode. Requires reopen_connection to be turned on (the default). Requires that reads be the only workload specified
- reopen_connection (boolean, default=true)
- close and reopen the connection between populate and workload phases
- report_interval (unsigned int, default=2)
- output throughput information every interval seconds, 0 to disable
- run_ops (unsigned int, default=0)
- total insert, modify, read and update workload operations
- run_time (unsigned int, default=0)
- total workload seconds
- sample_interval (unsigned int, default=0)
- performance logging every interval seconds, 0 to disable
- sample_rate (unsigned int, default=50)
- how often the latency of operations is measured. One for every operation, two for every second operation, three for every third operation etc.
- scan_icount (unsigned int, default=0)
- number of records in scan tables to populate
- scan_interval (unsigned int, default=0)
- scan tables every interval seconds during the workload phase, 0 to disable
- scan_pct (unsigned int, default=10)
- percentage of entire data set scanned, if scan_interval is enabled
- scan_table_count (unsigned int, default=0)
- number of separate tables to be used for scanning. Zero indicates that tables are shared with other operations
- select_latest (boolean, default=false)
- in workloads that involve inserts and another type of operation, select the recently inserted records with higher probability
- sess_config (string, default="")
- session configuration string
- session_count_idle (unsigned int, default=0)
- number of idle sessions to create. Default 0.
- table_config (string, default="key_format=S,value_format=S,type=file,exclusive=true,leaf_value_max=64MB,memory_page_max=10m, split_pct=90,checksum=on")
- table configuration string
- table_count (unsigned int, default=1)
- number of tables to run operations over. Keys are divided evenly over the tables. Cursors are held open on all tables. Default 1, maximum 99999.
- table_count_idle (unsigned int, default=0)
- number of tables to create, that won't be populated. Default 0.
- threads (string, default="")
- workload configuration: each 'count' entry is the total number of threads, and the 'inserts', 'modify', 'reads' and 'updates' entries are the ratios of insert, modify, read and update operations done by each worker thread. If a 'throttle' value is provided, each thread will do a maximum of that number of operations per second. Multiple workload configurations may be specified per threads configuration; for example, a more complex threads configuration might be 'threads=((count=2,reads=1)(count=8,reads=1,inserts=2,updates=1))', which would create 2 threads doing nothing but reads and 8 threads each doing 50% inserts, 25% reads and 25% updates. Allowed configuration values are 'count', 'throttle', 'inserts', 'reads', 'read_range', 'modify', 'modify_delta', 'modify_distribute', 'modify_force_update', 'updates', 'update_delta', 'truncate', 'truncate_pct' and 'truncate_count'. There is also a behavior modifier; the supported modifier is 'ops_per_txn'
- tiered (string, default="none")
- tiered extension. Allowed configuration values are: 'none', 'dir_store', 's3'
- tiered_flush_interval (unsigned int, default=0)
- Call flush_tier every interval seconds during the workload phase. We recommend this value be larger than the checkpoint_interval. 0 to disable. The 'tiered' option must be set to something other than 'none'.
- transaction_config (string, default="")
- WT_SESSION.begin_transaction configuration string, applied during the populate phase when populate_ops_per_txn is nonzero
- table_name (string, default="test")
- table name
- truncate_single_ops (boolean, default=false)
- Implement truncate via cursor remove instead of session API
- value_sz_max (unsigned int, default=1000)
- maximum value size when delta updates/modify operations are present. Default disabled
- value_sz_min (unsigned int, default=1)
- minimum value size when delta updates/modify operations are present. Default disabled
- value_sz (unsigned int, default=100)
- value size
- verbose (unsigned int, default=1)
- verbosity
- warmup (unsigned int, default=0)
- How long to run the workload phase before starting measurements
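Some of the options above depend on one another: min_throughput and max_latency both require sample_interval to be configured. The sketch below writes a small configuration that uses them together and then checks that dependency with grep. The file name thresholds.wtperf and all numeric values are illustrative assumptions, not recommended settings.

```shell
# Write a small configuration with throughput and latency thresholds
# (file name and values are placeholders for illustration).
cat > thresholds.wtperf <<'EOF'
conn_config="cache_size=500MB"
run_time=60
threads=((count=2,reads=1,updates=1))
sample_interval=5
min_throughput=1000
min_throughput_fatal=false
max_latency=2000
max_latency_fatal=true
EOF

# The thresholds require sample_interval to be configured, so warn if a
# threshold is present while sample_interval is missing or zero.
if grep -Eq '^(min_throughput|max_latency)=' thresholds.wtperf &&
    ! grep -Eq '^sample_interval=[1-9]' thresholds.wtperf; then
    echo "warning: thresholds set but sample_interval is not enabled" >&2
else
    echo "thresholds.wtperf: sample_interval is set, thresholds can be checked"
fi
```

With these settings a throughput failure would only print a warning, while a latency failure would abort the run.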