cmdr2's notes

Continued in Part 2, where I figured out how to include the required libraries in the wheel.

Spent all of yesterday trying to compile PyTorch with the build-time environment variable PYTORCH_ROCM_ARCH=gfx803.

tl;dr - In Part 1, I compiled wheels for PyTorch with ROCm, in order to add support for older AMD cards like the RX 480. The compilation worked, but the resulting wheels don't include the required ROCm libraries. I figured that out in Part 2.

The intention was to build ROCm 6.2 wheels for torch 2.4 with support for older AMD cards (like RX 480) that don't work with the official torch binary wheels. This is supposed to work (another).

If this worked, I could've hosted the wheel for users of Easy Diffusion (or torchruntime) with older AMD GPUs, without requiring them to install ROCm separately on their PCs.

Compilation was successful, but I wasn't able to get the compiled wheels to include the required libraries like libMIOpen. The compiled wheel was around 300 MB, while the official torch+rocm wheels are nearly 3 GB. A diff of the two wheels using unzip -Z1 shows that's because of the missing libraries.
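The same comparison can be scripted. Here's a minimal sketch using Python's standard zipfile module (a wheel is just a zip archive). The function names are my own for illustration, and this is an alternative to the unzip -Z1 diff, not the exact method I used.

```python
import zipfile


def wheel_entries(wheel_path):
    """Return the set of file paths stored inside a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as archive:
        return set(archive.namelist())


def missing_entries(built_wheel, reference_wheel):
    """List paths that are in the reference wheel but not in the built one."""
    return sorted(wheel_entries(reference_wheel) - wheel_entries(built_wheel))
```

Calling missing_entries("my-build.whl", "official.whl") (placeholder filenames) returns the files that only the official wheel contains. If the two wheels have different version strings, their .dist-info folder names will differ too, so those entries will also show up and can be ignored.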

Edit: Figured this out in Part 2.

I went through the builder code at the torch repo, as well as the deprecated pytorch-builder repo, but couldn't get this to include the libraries. I even tried auditwheel, but that failed with a "very-recent version of stdlib" error (something like that).

In any case, I followed this guide for compiling torch for ROCm.

Notes:

1. Ensure your PC has at least 120 GB of free disk space.

2. On a Windows host, use this command to start the container: docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ipc=host --shm-size 8G rocm/pytorch:latest-base

3. If you need to build for a different version of Python, create a uv venv.
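For the disk space note above, a quick pre-flight check can save a failed build. This is a rough sketch using shutil.disk_usage from the standard library. The helper names are made up for illustration, and I'm assuming 1 GB = 10^9 bytes here.

```python
import shutil

REQUIRED_GB = 120  # the free-space figure from note 1


def free_space_gb(path="."):
    """Free space on the drive that holds `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 10**9


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """True if the drive holding `path` has at least `required_gb` GB free."""
    return free_space_gb(path) >= required_gb
```

Point path at the drive where Docker stores its images and containers, since that's where the build output ends up. On a Windows host that may not be the drive you launch the command from.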