Spent most of the day doing some support work for Easy Diffusion, and experimenting with torch-directml for AMD support on Windows.
From the initial experiments, torch-directml seems to work properly with Easy Diffusion. I ran it on my NVIDIA card, and another user ran it on their AMD Radeon RX 7700 XT.
It's 7-10x faster than the CPU, so it looks promising. It's 2x slower than CUDA on my NVIDIA card, but users with NVIDIA cards are not the target audience for this change.
I still need to run the full set of automated tests, so there's a chance that some corner case still breaks.
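
The speed ordering above (CUDA fastest, then DirectML, then CPU) suggests a simple backend preference. Here's a minimal sketch of that idea. It is not Easy Diffusion's actual selection code, and `module_available`, `pick_backend` and `detect_backend` are hypothetical names made up for illustration. The only library calls assumed are `torch.cuda.is_available()` and the presence of the `torch_directml` package.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


def pick_backend(has_cuda: bool, has_directml: bool) -> str:
    """Prefer CUDA, then DirectML, then fall back to the CPU."""
    if has_cuda:
        return "cuda"
    if has_directml:
        return "directml"
    return "cpu"


def detect_backend() -> str:
    """Probe the installed packages and return the preferred backend name."""
    has_cuda = False
    if module_available("torch"):
        import torch  # imported lazily so the sketch loads without torch

        has_cuda = torch.cuda.is_available()
    # Assumption: having torch_directml installed means DirectML is usable.
    has_directml = module_available("torch_directml")
    return pick_backend(has_cuda, has_directml)


if __name__ == "__main__":
    print(detect_backend())
```

With this ordering, an NVIDIA user keeps getting CUDA even if torch-directml is installed, and an AMD user on Windows gets DirectML instead of falling back to the CPU. When the result is "directml", the actual device object would come from `torch_directml.device()`.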