Monitor node metrics

Substrate exposes metrics about the operation of your network. For example, you can collect information about how many peers your node is connected to and how much memory your node is using. To visualize these metrics, you can use tools like Prometheus and Grafana. This tutorial demonstrates how to use Grafana and Prometheus to scrape and visualize these types of node metrics.

A possible architecture could look like:

+-----------+                     +-------------+                                                              +---------+
| Substrate |                     | Prometheus  |                                                              | Grafana |
+-----------+                     +-------------+                                                              +---------+
      |               -----------------\ |                                                                          |
      |               | Every 1 minute |-|                                                                          |
      |               |----------------| |                                                                          |
      |                                  |                                                                          |
      |        GET current metric values |                                                                          |
      |<---------------------------------|                                                                          |
      |                                  |                                                                          |
      | `substrate_peers_count 5`        |                                                                          |
      |--------------------------------->|                                                                          |
      |                                  | --------------------------------------------------------------------\    |
      |                                  |-| Save metric value with corresponding time stamp in local database |    |
      |                                  | |-------------------------------------------------------------------|    |
      |                                  |                                         -------------------------------\ |
      |                                  |                                         | Every time user opens graphs |-|
      |                                  |                                         |------------------------------| |
      |                                  |                                                                          |
      |                                  |       GET values of metric `substrate_peers_count` from time-X to time-Y |
      |                                  |<-------------------------------------------------------------------------|
      |                                  |                                                                          |
      |                                  | `substrate_peers_count (1582023828, 5), (1582023847, 4) [...]`           |
      |                                  |------------------------------------------------------------------------->|
      |                                  |                                                                          |
Reproduce diagram

Go to https://textart.io/sequence and enter the following:

object Substrate Prometheus Grafana
note left of Prometheus: Every 1 minute
Prometheus->Substrate: GET current metric values
Substrate->Prometheus: `substrate_peers_count 5`
note right of Prometheus: Save metric value with corresponding time stamp in local database
note left of Grafana: Every time user opens graphs
Grafana->Prometheus: GET values of metric `substrate_peers_count` from time-X to time-Y
Prometheus->Grafana: `substrate_peers_count (1582023828, 5), (1582023847, 4) [...]`

Before you begin

Before you begin, verify the following:

Tutorial objectives

By completing this tutorial, you will accomplish the following objectives:

  • Install Prometheus and Grafana.
  • Configure Prometheus to capture a time series for your Substrate node.
  • Configure Grafana to visualize the node metrics collected using the Prometheus endpoint.

Install Prometheus and Grafana

For testing and demonstration purposes, you should download the precompiled binaries for Prometheus and Grafana rather than building the tools yourself or using a Docker image. Use the following links to download the appropriate binaries for your architecture. This tutorial assumes you are using the precompiled binaries in a working directory.

To install the tools for this tutorial:

  1. Open a browser on your computer.
  2. Download the appropriate precompiled binary for Prometheus from the Prometheus download page.
  3. Uncompress and extract the download archive into a working folder. For example, on macOS with an Intel processor:

    gunzip prometheus-2.35.0.darwin-amd64.tar.gz && tar -xvf prometheus-2.35.0.darwin-amd64.tar

  4. Navigate to the Grafana OSS download page.
  5. Select the appropriate precompiled binary for your architecture.
  6. Open a terminal shell on your computer and run the appropriate command to install on your architecture.
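The archive name in step 3 depends on your operating system and processor. The following is a minimal sketch of how the name could be derived for the current machine. It assumes the release follows the same `prometheus-<version>.<os>-<arch>.tar.gz` naming pattern as the example above; the `prometheus_archive` function and the `PROM_VERSION` variable are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Version used in this tutorial; adjust to the release you downloaded.
PROM_VERSION="2.35.0"

# Print the archive name expected for this machine (assumed naming pattern).
prometheus_archive() {
  os=$(uname -s | tr '[:upper:]' '[:lower:]')   # for example "linux" or "darwin"
  case "$(uname -m)" in
    x86_64|amd64)  arch="amd64" ;;
    aarch64|arm64) arch="arm64" ;;
    *)             arch=$(uname -m) ;;          # assumption: used unchanged
  esac
  echo "prometheus-${PROM_VERSION}.${os}-${arch}.tar.gz"
}

archive=$(prometheus_archive)
echo "Expected archive: ${archive}"
# tar can decompress and extract in a single step:
#   tar -xzf "${archive}"
```

If the printed name does not match the file you downloaded, use the name of the downloaded file instead.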

Start a Substrate node

Substrate exposes an endpoint on port 9615 that serves metrics in the Prometheus exposition format. You can change the port with --prometheus-port <PORT> and make the endpoint reachable on interfaces other than localhost with --prometheus-external.

# Optionally add `--prometheus-port <PORT>` to use a different port
./target/release/node-template --dev

Configure Prometheus to scrape your Substrate node

In the working directory where you installed Prometheus, you will find a prometheus.yml configuration file. You can modify this file (or create a custom file) to configure Prometheus to scrape the exposed endpoint by adding it to the targets array. If you modify the default configuration file, the changed section looks like this:

# --snip--

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's the Substrate node.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "substrate_node"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    # Override the global default and scrape targets from this job every 5 seconds.
    # ** NOTE: you want to have this *LESS THAN* the block time in order to ensure
    # ** that you have a data point for every block!
    scrape_interval: 5s

    static_configs:
      - targets: ["localhost:9615"]

Keep scrape_interval shorter than the block time so that there is at least one data point for every block.
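The reasoning behind this rule is simple arithmetic: if blocks are produced every 6 seconds and Prometheus scrapes every 5 seconds, then every block interval contains at least one scrape. The following sketch checks that relationship. The block time of 6 seconds is an assumption for illustration; substitute the value your chain uses. The `interval_ok` function is a hypothetical helper, not part of Prometheus or Substrate.

```shell
#!/bin/sh
# Both values are in whole seconds.
SCRAPE_INTERVAL=5   # matches scrape_interval in prometheus.yml
BLOCK_TIME=6        # assumption: replace with your chain's block time

# Succeed (exit status 0) when the scrape interval $1 is strictly
# less than the block time $2.
interval_ok() {
  [ "$1" -lt "$2" ]
}

if interval_ok "$SCRAPE_INTERVAL" "$BLOCK_TIME"; then
  echo "ok: scraping every ${SCRAPE_INTERVAL}s, blocks every ${BLOCK_TIME}s"
else
  echo "warning: some blocks may have no data point"
fi
```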

Now you can start a Prometheus instance with the modified prometheus.yml configuration file.
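Before starting Prometheus, it can be worth validating the configuration file. The following sketch assumes that the promtool binary, which is included in the Prometheus archive, is in the current working directory. If it is not found, the check is skipped. The `check_config` function is a hypothetical wrapper used only for illustration.

```shell
#!/bin/sh
# Validate a Prometheus configuration file with promtool, if available.
check_config() {
  if [ -x ./promtool ]; then
    ./promtool check config "$1"
  else
    echo "promtool not found in $(pwd); skipping check"
  fi
}

# Pass the name of your custom file here if you created one.
check_config prometheus.yml
```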

Presuming you downloaded the binary, cd into the working directory and run the following command:

# Specify your custom config file here instead if you created one:
./prometheus --config.file prometheus.yml

Leave this process running.

Check all Prometheus metrics

In a new terminal, you can quickly check the metrics that your node exposes for Prometheus by running the following command:

curl localhost:9615/metrics

This command should return output similar to the following:

# HELP substrate_block_height Block height info of the chain
# TYPE substrate_block_height gauge
substrate_block_height{status="best"} 7
substrate_block_height{status="finalized"} 4
# HELP substrate_build_info A metric with a constant '1' value labeled by name, version
# TYPE substrate_build_info gauge
substrate_build_info{name="available-vacation-6791",version="2.0.0-4d97032-x86_64-linux-gnu"} 1
# HELP substrate_database_cache_bytes RocksDB cache size in bytes
# TYPE substrate_database_cache_bytes gauge
substrate_database_cache_bytes 0
# HELP substrate_finality_grandpa_precommits_total Total number of GRANDPA precommits cast locally.
# TYPE substrate_finality_grandpa_precommits_total counter
substrate_finality_grandpa_precommits_total 31
# HELP substrate_finality_grandpa_prevotes_total Total number of GRANDPA prevotes cast locally.
# TYPE substrate_finality_grandpa_prevotes_total counter
substrate_finality_grandpa_prevotes_total 31
#
# --snip--
#

Alternatively, you can open the same URL (http://localhost:9615/metrics) in a browser to view all available metric data.
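The full output is long, so it can help to pull out a single value. The following is a minimal sketch that reads exposition-format text and prints the value of one sample. The `metric_value` function is a hypothetical helper, and it assumes that label values contain no spaces, which holds for the lines shown above.

```shell
#!/bin/sh
# Print the value of one sample from exposition-format text on stdin.
# $1 is the full sample name, including labels, exactly as it appears
# at the start of the line.
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Sample lines copied from the output above, used here instead of a live node.
sample='# TYPE substrate_block_height gauge
substrate_block_height{status="best"} 7
substrate_block_height{status="finalized"} 4'

echo "$sample" | metric_value 'substrate_block_height{status="best"}'
```

With a node running, the same function can read from the live endpoint, for example `curl -s localhost:9615/metrics | metric_value 'substrate_block_height{status="best"}'`.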

Visualize Prometheus metrics with Grafana

After you start Grafana, you can navigate to it in a browser.

  1. Open a browser and navigate to the address Grafana uses. By default, the URL is http://localhost:3000/.
  2. Log in using the default user admin and password admin, then navigate to the data sources page at http://localhost:3000/datasources.

You then need to select a Prometheus data source type and specify where Grafana needs to look for it.

Grafana needs the address of the Prometheus server itself, NOT the address you set in the prometheus.yml file (http://localhost:9615), which is where your node publishes its data.

With both the Substrate node and Prometheus running, configure Grafana to look for Prometheus on its default port http://localhost:9090 or the port you configured if you customized it.

  1. Click Save & Test to ensure that you have the data source set correctly. Now you can configure a new dashboard.
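If the data source test fails or a panel stays empty, it can help to ask Prometheus for the data directly, which is roughly what Grafana does when it draws a graph. The following sketch builds a URL for the Prometheus `query_range` HTTP API. It assumes Prometheus is listening on its default port 9090 and that the metric name needs no URL encoding; the `range_query_url` function and the `PROMETHEUS_URL` variable are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Assumption: Prometheus runs locally on its default port.
PROMETHEUS_URL="http://localhost:9090"

# Build a query_range URL for metric $1 between Unix timestamps $2 and $3,
# sampling every $4 seconds.
range_query_url() {
  echo "${PROMETHEUS_URL}/api/v1/query_range?query=$1&start=$2&end=$3&step=$4"
}

# The timestamps are the illustrative values from the diagram at the top of this page.
range_query_url substrate_block_height 1582023828 1582023847 5

# To query a running Prometheus instance for the last five minutes:
#   now=$(date +%s)
#   curl -s "$(range_query_url substrate_block_height "$((now - 300))" "$now" 5)"
```

A JSON response whose `status` field is `success` and whose result list is not empty indicates that Prometheus is scraping your node and storing the values.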

Template Grafana Dashboard

If you would like a basic dashboard to start with, here is a template that you can import into Grafana to get basic information about your node:

Grafana Dashboard

If you want to create your own dashboard, see the Prometheus documentation for Grafana.

If you create a custom dashboard, consider uploading it to the Grafana dashboards. The public Substrate node template dashboard is available for download from Grafana dashboards. You can let the Substrate builder community know your dashboard exists by listing it in the Awesome Substrate repository.

Where to go next