Benchmark

Substrate and FRAME provide a flexible framework for developing custom logic for your blockchain. However, this flexibility can also introduce complexity that can make your blockchain vulnerable to denial of service (DoS) attacks by malicious actors.

To mitigate the risk of DoS attacks, it is important to know the computational resources required to execute different functions in the runtime. The Substrate benchmarking framework provides tools that help you model how long it takes to execute each function in the runtime. From those measurements, you can estimate the resources each function requires and enable the runtime to include or exclude transactions based on the resources available.

Why benchmark a pallet

To prevent malicious users from disrupting network service by repeatedly executing function calls that require intensive computation, the cost associated with a function call should reflect the computation and storage required to perform the operation. However, you don't want to discourage users from using the blockchain, so the estimated cost should be relatively accurate and aligned with the resources that are actually consumed.

Benchmarking helps you to determine transaction fees for calls that more accurately represent the resources consumed by executing a transaction on the blockchain. Setting a weight that accurately reflects the underlying computation and storage is also an important security safeguard in Substrate.

Developing a linear model

At a high level, benchmarking requires you to perform the following steps:

  • Write custom benchmarking logic that executes a specific code path for a function.
  • Execute the benchmark logic in the Wasm execution environment on a specific set of hardware and with a specific runtime configuration.
  • Execute the benchmark logic across a controlled range of possible values that might affect the result of the benchmark.
  • Execute the benchmark multiple times at each point in order to isolate and remove outliers.
  • Use the results of the benchmark to create a linear model of the function across its components.

This linear model enables you to estimate how long it takes to execute a specific code path and to make informed decisions without actually spending any significant resources at runtime. Benchmarking assumes all transactions have linear complexity because functions with higher complexity are considered dangerous to the runtime: the weight of such functions can explode as the runtime state or input grows.
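For example, the fitted model for a call with a single component s reduces to a base weight plus a per-unit slope. The following sketch shows what such a formula could look like as a Rust function; the function name and the constants are placeholders rather than measured values:

use frame_support::weights::Weight;

// Hypothetical linear weight formula: weight(s) = base + slope * s.
// The numbers are illustrative placeholders, not benchmark results.
fn do_something_weight(s: u32) -> Weight {
    Weight::from_parts(10_000_000, 0)
        .saturating_add(Weight::from_parts(150_000, 0).saturating_mul(u64::from(s)))
}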

Benchmarking and weight

As discussed in Transactions, weights, and fees, Substrate-based chains use the concept of weight to represent the time it takes to execute the transactions in a block. The time required to execute any particular call in a transaction depends on several factors, including the following:

  • Computational complexity.
  • Storage complexity.
  • Database read and write operations required.
  • Hardware used.

To calculate an appropriate weight for a transaction, you can use benchmark parameters to measure the time it takes to execute the function calls on different hardware, using different variable values, and repeated multiple times. You can then use the results of the benchmarking tests to establish an approximate worst case weight to represent the resources required to execute each function call and each code path. Fees are then based on the worst case weight. If the actual call performs better than the worst case, the weight is adjusted and any excess fees can be returned.
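For example, a call can report the weight it actually consumed through its return value so that the difference from the declared worst case is refunded. A minimal sketch, assuming a hypothetical do_work call inside a pallet's call implementation with a cheaper fast path:

// Hypothetical fragment from inside a pallet's call implementation.
// `do_work` and the `WeightInfo` methods are illustrative names only.
pub fn do_work(origin: OriginFor<T>, fast_path: bool) -> DispatchResultWithPostInfo {
    ensure_signed(origin)?;
    if fast_path {
        // Report the smaller weight actually used; the excess fee is refunded.
        return Ok(Some(T::WeightInfo::do_work_fast_path()).into());
    }
    // Returning no correction charges the declared worst-case weight.
    Ok(().into())
}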

Because weight is a generic unit of measurement based on computation time for a specific physical machine, the weight of any function can change based on the specific hardware used for benchmarking.

By modeling the expected weight of each runtime function, the blockchain is able to calculate how many transactions or system level calls it can execute within a certain period of time.
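The total budget available per block is set by the runtime's BlockWeights configuration; dividing that budget by per-call weights bounds how many calls fit in a block. The following is an illustrative sketch of such a configuration, not a recommendation for specific values:

use frame_support::{parameter_types, weights::{constants::WEIGHT_REF_TIME_PER_SECOND, Weight}};
use frame_system::limits::BlockWeights;
use sp_runtime::Perbill;

parameter_types! {
    // Illustrative limit: target roughly two seconds of compute per block,
    // with 75% of that budget reserved for normal (user-submitted) calls.
    pub RuntimeBlockWeights: BlockWeights = BlockWeights::with_sensible_defaults(
        Weight::from_parts(2u64 * WEIGHT_REF_TIME_PER_SECOND, u64::MAX),
        Perbill::from_percent(75),
    );
}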

Within FRAME, each function call that can be dispatched must have a #[pallet::weight] annotation that returns the expected weight for the worst case scenario execution of that function given its inputs. The benchmarking framework automatically generates a file with those weight formulas for you.
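For example, a dispatchable can declare its weight by calling into the WeightInfo implementation that the benchmarking framework generates. The following sketch uses a hypothetical do_something call whose weight depends on the length of its input:

// Hypothetical fragment of a pallet's call declarations; `do_something`
// and `T::WeightInfo::do_something` are illustrative names.
#[pallet::call]
impl<T: Config> Pallet<T> {
    // The declared weight is a function of the input, matching the linear
    // model produced by the benchmarks.
    #[pallet::weight(T::WeightInfo::do_something(items.len() as u32))]
    pub fn do_something(origin: OriginFor<T>, items: Vec<u32>) -> DispatchResult {
        ensure_signed(origin)?;
        // ... pallet logic ...
        Ok(())
    }
}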

Benchmarking tools

The FRAME benchmarking framework includes the following tools to help you determine the time it takes to execute function calls:

  • Benchmarking macros in the frame-benchmarking crate to help you write, test, and add runtime benchmarks.
  • Linear regression analysis functions for processing benchmark data.
  • A command-line interface (CLI) subcommand to enable you to execute benchmarks on your node and generate weight files.

The end-to-end benchmarking pipeline is disabled by default when compiling a node. If you want to run benchmarks, you need to compile a node with the runtime-benchmarks Rust feature flag.

Writing benchmarks

Writing a runtime benchmark is similar to writing a unit test for your pallet. Like unit tests, benchmarks need to execute specific logical paths in your code. In unit tests, you check the code for specific success and failure results. For benchmarks, you want to execute the most computationally intensive path.

In writing benchmarks, you should consider the specific conditions—such as storage or runtime state—that might affect the complexity of the function. For example, if triggering more iterations in a for loop increases the number of database read and write operations, you should set up a benchmark that triggers this condition to get a more accurate representation of how the function would perform.

If a function executes different code paths depending on user input or other conditions, you might not know which path is the most computationally intensive. To help you see where complexity in the code might become unmanageable, you should create a benchmark for each possible execution path. The benchmarks can help you identify places in the code where you might want to enforce boundaries—for example, by limiting the number of elements in a vector or limiting the number of iterations in a for loop—to control how users interact with your pallet.

You can find examples of end-to-end benchmarks in the prebuilt FRAME pallets. You can find details about using the benchmarks! macro in the source code.
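As an illustration, a benchmark written with the benchmarks! macro for a hypothetical do_something call might look like the following sketch; the call, the Something storage item, and the component range are placeholders:

use frame_benchmarking::{benchmarks, whitelisted_caller};
use frame_system::RawOrigin;

benchmarks! {
    do_something {
        // `s` is a linear component: the framework repeats the benchmark
        // across this range so it can fit a linear model over `s`.
        let s in 0 .. 1_000;
        let caller: T::AccountId = whitelisted_caller();
    }: _(RawOrigin::Signed(caller), s)
    verify {
        // Optional verify block: check the state the call should have produced.
        assert_eq!(Something::<T>::get(), Some(s));
    }
}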

Testing benchmarks

You can test your benchmarks using the same test runtime that you created for your pallet's unit tests. If you use the benchmarks! macro to create your benchmarks, the macro automatically generates test functions for you. For example:

fn test_benchmark_[benchmark_name]<T>() -> Result<(), &'static str>

You can add the benchmark functions to a unit test and ensure that the result of the function is Ok(()).
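For example, assuming your mock runtime exposes the conventional new_test_ext helper and a Test runtime type, the generated functions can be called from an ordinary unit test:

use frame_support::assert_ok;

#[test]
fn benchmarks_execute_successfully() {
    // `new_test_ext` and `Test` come from the pallet's mock runtime.
    new_test_ext().execute_with(|| {
        assert_ok!(test_benchmark_do_something::<Test>());
    });
}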

Verify blocks

In general, you only need to check that a benchmark returned Ok(()) because that result indicates that the function was executed successfully. However, you can optionally include a verify block with your benchmarks if you want to verify any final conditions, such as the final state of your runtime. The additional verify blocks don't affect the results of your final benchmarking process.

Run the unit tests with benchmarks

To run the tests, you need to enable the runtime-benchmarks feature flag. However, Substrate uses a virtual workspace that does not allow you to compile with feature flags. If you see an error that --features is not allowed in the root of a virtual workspace, you can navigate to the folder of the node (cd bin/node/cli) or pallet (cd frame/pallet) and run the cargo test command in that directory.

For example, you can test the benchmarks for the Balances pallet by running the following command:

cargo test -p pallet-balances --features runtime-benchmarks

Adding benchmarks

The benchmarks included with each pallet are not automatically added to your node. To execute these benchmarks, you need to implement the frame_benchmarking::Benchmark trait. You can see an example of how to do this in the Substrate node.

Assuming there are already some benchmarks set up on your node, you just need to add another instance of the add_benchmark! macro:

///  configuration for running benchmarks
///               |    name of your pallet's crate (as imported)
///               v                   v
add_benchmark!(params, batches, pallet_balances, Balances);
///                       ^                          ^
///    where all benchmark results are saved         |
///            the `struct` created for your pallet by `construct_runtime!`

After you have added the implementation of the Benchmark trait for your pallet, compile your node binary with the runtime-benchmarks feature flag. For example:

cd bin/node/cli
cargo build --profile=production --features runtime-benchmarks

The production profile applies various compiler optimizations, which slow down the compilation process considerably. If you are just testing things out and don't need final numbers, use --release instead.

Running benchmarks

After you have compiled a node binary with benchmarks enabled, you need to execute the benchmarks. You can list the available benchmarks by running the following command if you used the production profile:

./target/production/substrate benchmark pallet --chain dev --pallet "*" --extrinsic "*" --repeat 0

To execute the benchmarks, you can start the node by running a command similar to the following:

./target/production/substrate benchmark pallet \
    --chain dev \
    --execution=wasm \
    --wasm-execution=compiled \
    --pallet pallet_balances \
    --extrinsic transfer \
    --steps 50 \
    --repeat 20 \
    --output <path>

This command creates an output file for the selected pallet—for example, pallet_balances.rs—that implements the WeightInfo trait for your pallet. Each blockchain should generate its own benchmark file with its own implementation of the WeightInfo trait. This means that you can use these modular Substrate pallets while still keeping your network safe for your specific configuration and requirements.
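The generated file typically defines a WeightInfo trait with one function per benchmarked call and an implementation parameterized over your runtime. The following is a condensed sketch of what the output for a transfer call might contain; the struct name follows a common convention and the numbers are placeholders rather than measured results:

use core::marker::PhantomData;
use frame_support::weights::Weight;

pub trait WeightInfo {
    fn transfer() -> Weight;
}

/// Weights measured for a specific runtime on specific reference hardware.
pub struct SubstrateWeight<T>(PhantomData<T>);
impl<T: frame_system::Config> WeightInfo for SubstrateWeight<T> {
    fn transfer() -> Weight {
        // Placeholder base weight plus the storage reads and writes
        // observed during benchmarking.
        Weight::from_parts(40_000_000, 3_593)
            .saturating_add(T::DbWeight::get().reads(1))
            .saturating_add(T::DbWeight::get().writes(1))
    }
}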

The benchmarking CLI uses a Handlebars template to format the final output file. You can optionally pass the --template command-line option to specify a custom template instead of the default. Within the template, you have access to all the data provided by the TemplateData struct in the benchmarking CLI writer.

There are some custom Handlebars helpers included with the output generation:

  • underscore: Add an underscore after every 3rd character from the right of a string. Primarily to be used for delimiting large numbers.
  • join: Join an array of strings into a space-separated string for the template. Primarily to be used for joining all the arguments passed to the CLI.

To get a full list of available options when running benchmarks, run:

./target/production/substrate benchmark --help

Where to go next