Fedora 40 Looks To Ship AMD ROCm 6 For End-To-End Open-Source GPU Acceleration

ylai@lemmy.ml · 6 months ago

Fedora 40 Looks To Ship AMD ROCm 6 For End-To-End Open-Source GPU Acceleration

Secret300@sh.itjust.works · 6 months ago

What is “end-to-end GPU Acceleration”? Like for playing back video? Or for rendering stuff like in blender

IHeartBadCode@kbin.social · 6 months ago

Data science term. Means everything runs inside the GPU entirely. No CPU or system RAM outside of the (usually Python) interface that started, monitors, and collects the result of the job.

ROCm is AMD’s solution to CUDA that covers for nVidia.

Codilingus@sh.itjust.works · 6 months ago

For years I’ve wondered what ROCm was, too lazy to figure it out. Thank you for this!

IHeartBadCode@kbin.social · 6 months ago

Both are vendor specific implementations of processing on GPUs. This is in opposition to open standards like OpenCL, which a lot of the exascale big boys out there mostly use.

nVidia spent a lot of cash on “outreach” to get CUDA into a lot of various packages in R, python, and what not. That did a lot of displacement from OpenCL stuff. These libraries are what a lot of folks spin up on as most of the leg work is done for them in the library. With the exascale rigs, you literally have a team that does nothing but code very specific things on the machine in front of them, so yeah, they go with the thing that is the most portable, but doesn’t exactly yield libraries for us mere mortals to use.

AMD has only recently had the cash to start paying folks to write libs for their stuff. So were starting to see it come to python libs and what not. Likely, once it becomes a fight of CUDA v ROCm, people will start heading back over to OpenCL. The “worth it” for vendor lock-in for CUDA and ROCm will diminish more and more over time. But as it stands, with CUDA you do get a good bit of “squeezing that extra bit of steam out of your GPU” by selling your soul to nVidia.

That last part also plays into the “why” of CUDA and ROCm. If you happen to NOT have a rig with 10,000 GPUs, then the difference between getting 98% of your GPU and 99.999% of your GPU means a lot to you. If you do have 10,000 GPUs, having like a 1% inefficiency is okay, you’ve got 10,000 GPUs the 1% loss is barely noticeable and not worth it to lose portability with OpenCL.

AlmightySnoo 🐢🇮🇱🇺🇦@lemmy.world · edit-2 6 months ago

HIP is amazing. For everyone saying “nah it can’t be the same, CUDA rulez”, just try it, it works on NVidia GPUs too (there are basically macros and stuff that remap everything to CUDA API calls) so if you code for HIP you’re basically targetting at least two GPU vendors. ROCm is the only framework that allows me to do GPGPU programming in CUDA style on a thin laptop sporting an AMD APU while still enjoying 6 to 8 hours of battery life when I don’t do GPU stuff. With CUDA, in terms of mobility, the only choices you get are a beefy and expensive gaming laptop with a pathetic battery life and heating issues, or a light laptop + SSHing into a server with an NVidia GPU.

Molecular0079@lemmy.world · 6 months ago

The problem with ROCm is that its very unstable and a ton of applications break on it. Darktable only renders half an image on my Radeon 680M laptop. HIP in Blender is also much slower than Optix. We’re still waiting on HIP-RT.