ROCm Status in mid 2026 [D]

Our take

As of mid-2026, there are growing indications that ROCm is a viable option for inference, but reports on how well it handles training remain sparse. Users considering a move from NVIDIA's RTX 3090 to AMD's RX 7900 XTX may be drawn by its on-paper FP16 throughput at similar power draw, VRAM, and cost. While the PyTorch documentation suggests ROCm is fully supported, firsthand accounts of how it performs compared with CUDA are limited.

Hey folks

I'm starting to hear that ROCm works fine for inference now, but I haven't seen any reports on how viable it is for training. I have a couple of RTX 3090s I use for prototyping models, and I'm considering switching to a pair of RX 7900 XTXs instead. On paper at least, the RX 7900 XTX offers about 4 times the FP16 throughput at similar power draw, VRAM, and cost.
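Since the throughput figure is an on-paper number, a rough way to see what a card actually delivers is to time a large FP16 matrix multiply and convert the result to TFLOPS. The sketch below is only an illustration: `matmul_tflops` and `benchmark_fp16` are hypothetical helper names, the matrix size and iteration count are arbitrary, and a single matmul says little about end-to-end training speed. It assumes a PyTorch install with a visible GPU. ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` API and `"cuda"` device string.

```python
import time


def matmul_tflops(n: int, seconds: float) -> float:
    """TFLOPS for one n x n by n x n matmul (about 2 * n^3 floating-point ops)."""
    return 2 * n**3 / seconds / 1e12


def benchmark_fp16(n: int = 4096, iters: int = 50) -> float:
    """Hypothetical helper: average FP16 matmul throughput on the first GPU."""
    import torch  # imported lazily so the arithmetic helper works without PyTorch

    if not torch.cuda.is_available():
        raise RuntimeError("PyTorch does not see a GPU")

    a = torch.randn(n, n, device="cuda", dtype=torch.float16)
    b = torch.randn(n, n, device="cuda", dtype=torch.float16)

    # Warm-up runs so one-time kernel selection is not included in the timing.
    for _ in range(5):
        a @ b
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    # GPU kernels launch asynchronously, so wait for them before stopping the clock.
    torch.cuda.synchronize()
    per_matmul = (time.perf_counter() - start) / iters

    return matmul_tflops(n, per_matmul)
```

Running `benchmark_fp16()` on each card gives a like-for-like number to set against the spec-sheet ratio.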

Based on PyTorch docs, it seems like ROCm is now fully supported, but I'm struggling to find user reports on how well PyTorch runs with ROCm instead of CUDA.
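As a starting point, it is easy to check which backend a given PyTorch install was built for. This is a minimal sketch, assuming a standard PyTorch build. `describe_backend` is a hypothetical helper name, and it relies on `torch.version.hip` being set on ROCm builds and `torch.version.cuda` on CUDA builds. It only shows what the install targets, not how well training runs on it.

```python
def describe_backend() -> dict:
    """Hypothetical helper: report whether PyTorch targets ROCm, CUDA, or CPU only."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"backend": "no-torch", "gpu_available": False, "device": None}

    # ROCm builds set torch.version.hip; CUDA builds set torch.version.cuda.
    hip_version = getattr(torch.version, "hip", None)
    cuda_version = getattr(torch.version, "cuda", None)

    if hip_version:
        backend = "rocm"
    elif cuda_version:
        backend = "cuda"
    else:
        backend = "cpu"

    # On ROCm builds the torch.cuda namespace is reused for AMD GPUs.
    available = bool(torch.cuda.is_available())
    device = torch.cuda.get_device_name(0) if available else None

    return {"backend": backend, "gpu_available": available, "device": device}


print(describe_backend())
```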

How viable is it to switch over to ROCm at the moment? Is it at the "it just works" stage yet? Or is the AMD ecosystem still significantly behind CUDA?

submitted by /u/QuantumQuokka
