AMD Radeon Instinct MI25

From ArchWiki

This page describes the steps necessary to perform GPGPU on the AMD Radeon Instinct MI25 (and other gfx900 Vega 10 GPUs).

BIOS and cooling

The MI25 is a power-hungry, passively cooled accelerator card that is often incompatible with consumer-level hardware and has no video output by default. To remedy this, we can flash a WX9100 BIOS to the MI25. This lowers the power limit from 220W to 170W, enables the Mini DisplayPort hidden behind the PCIe bracket, enables the fan header for active cooling, and allows consumer equipment to boot with the MI25 attached if it would not before.

Depending on your situation, you may be able to flash the BIOS from within the operating system using amdvbflash (AUR) if you can boot successfully and keep the card cool. Alternatively, the BIOS can be flashed quite easily in hardware without the risk of overheating.

The recommended and most widely tested WX9100 BIOS can be downloaded here.

Hardware flashing with a Raspberry Pi

The BIOS chip is located under the backplate and can be flashed with a SOP8 test clip and a Raspberry Pi using flashrom.

Note: A test clip provides a less than ideal connection, so it is important to take multiple dumps and compare them before and after flashing. Although in theory we could use a faster SPI speed such as 32768 kHz, in practice users report success with 8192 kHz. Feel free to adjust this value as needed.

First, we need to connect the MI25 BIOS chip and clip to the Raspberry Pi GPIO pins according to this diagram. Once we have carefully connected everything and the Raspberry Pi has booted, use the following command to check whether the flash chip is detected.

# flashrom -p linux_spi:dev=/dev/spidev0.0,spispeed=8192
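If flashrom reports that it cannot open the device, the SPI interface on the Pi may simply be disabled. The sketch below is a hypothetical helper (the name spi_ready is not part of flashrom) that checks whether the device node exists before probing. It assumes Raspberry Pi OS, where SPI is typically enabled with raspi-config or by adding dtparam=spi=on to the boot configuration.

```shell
# Hypothetical helper: succeeds only if the given path is a character device.
# Defaults to the device node used in the flashrom commands on this page.
spi_ready() {
    dev="${1:-/dev/spidev0.0}"
    if [ -c "$dev" ]; then
        echo "found $dev"
    else
        echo "missing $dev: SPI is probably not enabled" >&2
        return 1
    fi
}
```

Running spi_ready with no arguments checks /dev/spidev0.0. A non-zero exit status suggests fixing the SPI configuration before troubleshooting the clip.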

Once the flash chip has been successfully detected, we must back up the original BIOS before we flash the new one. This serves two purposes: it provides us with a backup to restore from, and it confirms that we have a good connection to the BIOS flash chip.

# flashrom -p linux_spi:dev=/dev/spidev0.0,spispeed=8192 -r mi25-dump1.rom
# flashrom -p linux_spi:dev=/dev/spidev0.0,spispeed=8192 -r mi25-dump2.rom
$ sha1sum mi25-dump*.rom

If the checksums match, then we are good to go. If not, reseat the clip and try again until you get consistent dumps.
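The comparison can also be scripted so that a mismatch produces a non-zero exit status. This is a minimal sketch that uses cmp instead of eyeballing checksums. The function name dumps_match is hypothetical, and rejecting an empty first dump is an added assumption, not a step from this page.

```shell
# Hypothetical helper: succeeds only if every dump is byte-identical to the first.
dumps_match() {
    if [ "$#" -lt 2 ]; then
        echo "usage: dumps_match DUMP1 DUMP2 [DUMP3...]" >&2
        return 2
    fi
    first="$1"
    shift
    # An empty file cannot be a valid BIOS dump.
    [ -s "$first" ] || { echo "empty or missing dump: $first" >&2; return 1; }
    for f in "$@"; do
        cmp -s "$first" "$f" || { echo "mismatch: $first vs $f" >&2; return 1; }
    done
    echo "all dumps identical"
}
```

For example, dumps_match mi25-dump1.rom mi25-dump2.rom returns success only when both reads agree.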

You may have noticed that the WX9100 BIOS is 256KiB, while the MI25 BIOS is 1MiB. To remedy this, we create a 768KiB file consisting of zero bytes and append it to the end of our WX9100 BIOS, 218718.rom.

$ truncate -s 768KiB pad.bin
$ cat 218718.rom pad.bin > 218718-padded.rom
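As an alternative to a separate padding file, the copy itself can be extended to the target size. The sketch below is not the method used above: it relies on truncate filling the extended region with zero bytes, and the function name pad_rom and its 1MiB default are assumptions for illustration.

```shell
# Hypothetical helper: copy a ROM and zero-extend the copy to a target size.
# Usage: pad_rom SOURCE DEST [SIZE_IN_BYTES]   (default size: 1 MiB)
pad_rom() {
    src="$1"
    dst="$2"
    size="${3:-1048576}"
    src_size=$(( $(wc -c < "$src") ))
    # Refuse to shrink: truncate would silently cut off the end of the image.
    if [ "$src_size" -gt "$size" ]; then
        echo "source ($src_size bytes) is larger than target ($size bytes)" >&2
        return 1
    fi
    # Copy the contents, then grow the copy; the new bytes read back as zeros.
    cat -- "$src" > "$dst" && truncate -s "$size" "$dst"
}
```

For example, pad_rom 218718.rom 218718-padded.rom should produce a 1MiB file whose first 256KiB match the original image.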

After that, we can flash the freshly padded BIOS to the MI25.

# flashrom -p linux_spi:dev=/dev/spidev0.0,spispeed=8192 -w 218718-padded.rom -V

Flashrom should verify a successful flash, but feel free to take another BIOS dump and compare checksums to 218718-padded.rom.

Cooling

The MI25 has a JST-PH 4-pin fan header that actually works under the WX9100 BIOS. Simply purchase a cheap KK254 4-pin (regular 4-pin computer fan) to JST-PH 4-pin (MI25 fan header) adapter cable. If you shop around, you can find a cheap adapter that lets you power two 4-pin fans from the MI25 (try searching for "Graphics Card Fan Adapter Cable"). To install the adapter cable, the shroud needs to be taken off, and the cable can be routed out through the gap next to the power connector.

Note: While the cooling shroud is off, it is recommended to replace the thermal paste.

Since the MI25 comes passively cooled, some sort of ducting is required to redirect airflow through the heatsink. 3D printing one of these cooling shrouds, depending on which fan you decide to use, is quite a nice option, although a homemade solution would also suffice.

Note: Since the MI25 is designed to be in a server rack with high airflow, point a second fan at the components on the back of the MI25 if airflow does not already exist there.

ROCm

The MI25 (and other gfx900 GPUs) are deprecated and official support has ended. They were officially supported under ROCm 4, unofficially supported under ROCm 5, and unsupported under ROCm 6. To install the latest working ROCm stack for gfx900, we therefore first look at the dependencies of rocm-hip-sdk version 5.7.1-2 and miopen-hip version 5.7.1-1.

The factual accuracy of this article or section is disputed.

Reason: Installing Arch Linux packages from the archive will not work with the rest of the Arch Linux system. (Discuss in Talk:AMD Radeon Instinct MI25)

Dependencies by level:

  • Package: rocm-hip-sdk
  • Direct dependencies: rocm-hip-sdk rocm-core rocm-hip-libraries rocm-llvm rocm-hip-runtime hipblas hipcub hipfft hipsparse hipsolver miopen-hip rccl rocalution rocblas rocfft rocprim rocrand rocsolver rocsparse rocthrust
  • Secondary dependencies: composable-kernel hip-runtime-amd rocm-clang-ocl rocm-cmake rocm-language-runtime rocminfo
  • Tertiary dependencies: comgr hsa-rocr rocm-smi-lib hsakmt-roct rocm-device-libs rocm-opencl-runtime
  • Optional dependencies: rocm-ml-libraries rocm-ml-sdk rocm-opencl-sdk

To install ROCm 5.7.1 from the Arch Linux Archive, we make use of the downgrade (AUR) script to select our desired package versions and obtain a functional ROCm environment.

# downgrade rocm-hip-sdk rocm-core rocm-hip-libraries rocm-llvm rocm-hip-runtime hipblas hipcub hipfft hipsparse hipsolver miopen-hip rccl rocalution rocblas rocfft rocprim rocrand rocsolver rocsparse rocthrust composable-kernel hip-runtime-amd rocm-clang-ocl rocm-cmake rocm-language-runtime rocminfo comgr hsa-rocr rocm-smi-lib hsakmt-roct rocm-device-libs rocm-opencl-runtime rocm-ml-libraries rocm-ml-sdk rocm-opencl-sdk

See the ROCm for Arch Linux repository for more information about the remaining ROCm packages.
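After the downgrade, it can be worth confirming that the runtime actually sees the card. The sketch below is a hypothetical filter (has_gfx900 is not a ROCm tool) that reads text on standard input and succeeds if it mentions gfx900. It assumes that rocminfo, which is among the packages listed above, prints the agent's ISA name in its output.

```shell
# Hypothetical helper: succeeds if the text on stdin mentions gfx900 as a whole word.
has_gfx900() {
    grep -qw 'gfx900'
}
```

A typical invocation would be rocminfo | has_gfx900 && echo "gfx900 agent visible". If nothing is printed, check that the amdgpu kernel driver has bound to the card and that your user can access /dev/kfd.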

OpenCL

opencl-amd (AUR) is functional; however, it is incompatible with the ROCm 5.7.1 stack we have just installed.

The recommended method to obtain a performant, functional OpenCL SDK and runtime alongside ROCm is to install ROCm 5.7.1 and all of its dependencies listed above.

GPGPU accelerated software

The purpose of this section is to list GPGPU accelerated software and how to run it on gfx900 GPUs.

AUTOMATIC1111's Stable Diffusion web UI

A web interface for Stable Diffusion, implemented using the Gradio library.

stable-diffusion-web-ui-git (AUR) can be used with a little manual configuration after installation. See rabcor's comment on the AUR to get it working.

BitCrack

A tool for brute-forcing Bitcoin private keys.

  • Clone repository.
$ git clone https://github.com/brichard19/BitCrack.git
  • Change directory and build for OpenCL.
$ cd BitCrack && make BUILD_OPENCL=1
  • Run clBitCrack.
$ ./bin/clBitCrack 1FshYsUh3mqgsG29XpZ23eLjWV8Ur3VwH 15JhYXn6Mx3oF4Y7PcTAv2wVVAuCFFQNiP 19EEC52krRUK1RkUAEZmQdjTyHT7Gp1TYT

Ollama

Get up and running with large language models.

The factual accuracy of this article or section is disputed.

Reason: Installing Arch Linux packages from the archive will not work with the rest of the Arch Linux system. (Discuss in Talk:AMD Radeon Instinct MI25)

The latest version of ollama to support ROCm 5.7 is 0.1.28. However, ollama 0.1.28 from the Arch Linux Archive is compiled without gfx900 support, so we must:

  • Downgrade ollama.
# downgrade ollama==0.1.28
  • Clone repository.
$ git clone https://github.com/ollama/ollama.git
$ cd ollama && git checkout 21347e1
  • Build for gfx900.
$ AMDGPU_TARGETS=gfx900 go generate ./... && go build .
  • Replace ollama's executable with our freshly compiled binary.
# mv ./ollama /usr/bin/ollama
  • Start ollama.service.
  • Perform inference.
$ ollama run tinyllama
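The replacement step above overwrites the packaged executable, so it may be useful to keep a copy of it first. The sketch below is a hypothetical helper (replace_binary and the .orig suffix are assumptions, not part of ollama or pacman) that saves the existing file once and then installs the new build with mode 755.

```shell
# Hypothetical helper: keep a one-time backup of TARGET, then install NEW over it.
# Usage: replace_binary NEW TARGET
replace_binary() {
    new="$1"
    target="$2"
    [ -f "$new" ] || { echo "$new does not exist" >&2; return 1; }
    # Keep only the first backup so repeated runs do not overwrite the packaged copy.
    if [ -e "$target" ] && [ ! -e "$target.orig" ]; then
        cp -p -- "$target" "$target.orig" || return 1
    fi
    install -m 755 -- "$new" "$target"
}
```

Run as root, replace_binary ./ollama /usr/bin/ollama would leave the packaged binary at /usr/bin/ollama.orig. Note that reinstalling or upgrading the ollama package overwrites /usr/bin/ollama again.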

PyTorch

PyTorch v2.2.2 is the latest release that supports ROCm 5.7, and can be installed via pip within a python311 (AUR) virtual environment.

(venv)$ pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/rocm5.7
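Once the wheels are installed, a quick check can confirm that PyTorch sees the GPU. ROCm builds of PyTorch expose the device through the torch.cuda API. The wrapper below is a minimal sketch: the function name torch_gpu_check and the PYTHON override variable are hypothetical, and it assumes the virtual environment is active so that python resolves to the interpreter where torch was installed.

```shell
# Hypothetical helper: ask PyTorch whether it can see a GPU.
# PYTHON may be set to another interpreter; it defaults to "python".
torch_gpu_check() {
    "${PYTHON:-python}" -c '
import torch
# torch.version.hip is None on builds without ROCm support.
print("HIP version:", torch.version.hip)
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device 0:", torch.cuda.get_device_name(0))
'
}
```

If "GPU available" is reported as False, check the ROCm installation and device permissions before suspecting the PyTorch wheels.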

TensorFlow

TensorFlow 2.13.0.570 is the latest ROCm 5.7 compatible release, and can be installed via pip within a python311 (AUR) virtual environment.

(venv)$ pip install tensorflow-rocm==2.13.0.570
Note: When installing jupyterlab, the latest ipython kernel has a dependency conflict with tensorflow 2.13.0.570. Simply install ipython 8.23.0 alongside tensorflow and jupyterlab to continue using the latest jupyterlab.
(venv)$ pip install tensorflow-rocm==2.13.0.570 jupyterlab ipython==8.23.0