
GPU Monitoring

Beszel can monitor GPU usage, temperature, and power draw.

AMD GPUs

Work in progress

AMD has deprecated rocm-smi in favor of amd-smi. The agent works with rocm-smi on Linux, but hasn't been updated to work with amd-smi yet.

Beszel uses rocm-smi to monitor AMD GPUs. This must be available on the system, and you must use the binary agent (not the Docker agent).

Make sure rocm-smi is accessible

Installing rocm-smi-lib on Arch and Debian places the rocm-smi binary in /opt/rocm/bin. If that directory isn't in the PATH of the user running beszel-agent, symlink the binary into /usr/local/bin:

bash
sudo ln -s /opt/rocm/bin/rocm-smi /usr/local/bin/rocm-smi
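
To confirm whether the symlink is needed, you can check what resolves on the PATH of the user that runs beszel-agent. The following is a minimal sketch; `check_tool` is a hypothetical helper written for this example and is not part of Beszel.

```bash
#!/usr/bin/env bash
# check_tool is a hypothetical helper, not part of Beszel.
# It reports whether a command resolves on the current PATH.
check_tool() {
  local tool="$1"
  local resolved
  if resolved="$(command -v "$tool")"; then
    echo "$tool found at $resolved"
  else
    echo "$tool not found in PATH" >&2
    return 1
  fi
}

# Run this as the same user that runs beszel-agent.
check_tool rocm-smi || echo "consider symlinking /opt/rocm/bin/rocm-smi into /usr/local/bin"
```

The same helper can be pointed at nvidia-smi or tegrastats, since the binary agent needs those on the PATH as well.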

Nvidia GPUs

Docker agent

Make sure NVIDIA Container Toolkit is installed on the host system.

Use henrygd/beszel-agent-nvidia and add the following deploy block to your docker-compose.yml.

yaml
beszel-agent:
  image: henrygd/beszel-agent-nvidia
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities:
              - utility

Binary agent

You must have nvidia-smi available on the system.
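
As a quick sanity check, you can ask nvidia-smi for the metrics mentioned at the top of this page (usage, temperature, and power draw). This is only a sketch: `gpu_snapshot` is a hypothetical helper, and the query below is not necessarily the one the agent issues internally.

```bash
#!/usr/bin/env bash
# gpu_snapshot is a hypothetical helper, not part of Beszel.
# It prints one CSV line per GPU: name, utilization (%), temperature (C), power draw (W).
gpu_snapshot() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found in PATH" >&2
    return 1
  fi
  nvidia-smi \
    --query-gpu=name,utilization.gpu,temperature.gpu,power.draw \
    --format=csv,noheader,nounits
}

# Run as the user that runs beszel-agent; ignore failure on machines without a GPU.
gpu_snapshot || true
```

If this prints values in your shell but the agent still reports nothing, the service sandbox is a likely suspect.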

If it doesn't work, you may need to allow access to your devices in the service configuration. See discussion #563 for more information.

ini
[Service]
DeviceAllow=/dev/nvidiactl rw
DeviceAllow=/dev/nvidia0 rw
# If you have multiple GPUs, make sure to allow all of them
DeviceAllow=/dev/nvidia1 rw
DeviceAllow=/dev/nvidia2 rw

Then reload systemd and restart the agent:

bash
systemctl daemon-reload
systemctl restart beszel-agent
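
On hosts with several GPUs, the DeviceAllow lines can be generated instead of typed by hand. A minimal sketch, assuming the devices follow the /dev/nvidiactl and /dev/nvidiaN naming used above; `print_device_allow` is a hypothetical helper and not part of Beszel.

```bash
#!/usr/bin/env bash
# print_device_allow is a hypothetical helper, not part of Beszel.
# It scans a directory (default: /dev) for nvidiactl and numbered nvidia
# devices, and prints a DeviceAllow line for each one, always using a
# /dev/ prefix in the output.
print_device_allow() {
  local dev_dir="${1:-/dev}"
  local dev
  for dev in "$dev_dir"/nvidiactl "$dev_dir"/nvidia[0-9]*; do
    # Skip unmatched glob patterns and missing paths.
    [ -e "$dev" ] || continue
    echo "DeviceAllow=/dev/$(basename "$dev") rw"
  done
}

print_device_allow
```

Paste the printed lines under [Service] in the service configuration, then reload and restart as shown above.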

Nvidia Jetson

You must use the binary agent and have tegrastats installed.

The henrygd/beszel-agent-nvidia image likely doesn't work, but I can't test it to confirm. Let me know one way or the other if you try it 😃.

Intel GPUs

Intel GPUs are not currently supported as there doesn't seem to be a straightforward utility like nvidia-smi to get utilization and memory usage.

We may add support for tracking usage of video and 3D rendering engines in the future with intel-gpu-top.

Please see issue #262 for more information.

Released under the MIT License