AMD RDNA4 Architecture Speculation

NVIDIA definitely runs tensor and FP32 ops concurrently, especially now that its tensor cores are busy almost all of the time (upscaling, frame generation, denoising, HDR post-processing, and, in the future, neural rendering).

Recent NVIDIA generations have become much better at running all three workloads (tensor + ray tracing + FP32) concurrently. I read somewhere (I can't find the source now) that ray tracing + tensor is the most common concurrent combination, followed by ray tracing + FP32 and tensor + FP32.


The way I read this, it seems that while the workloads are executed concurrently, they are still not dispatched concurrently (unlike on some other architectures), so some pipes will be underutilized.
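To make the dispatch-versus-execution distinction concrete, here is a toy Python model. It is only a sketch and does not describe any real GPU's scheduler: the three pipe names, the fixed 2-cycle occupancy per op, and the `issue_width` parameter are all assumptions chosen for illustration. It assumes there is always work available for every pipe, so any idle time comes purely from the issue limit.

```python
def simulate(issue_width, cycles=1000, latency=2, pipes=("fp32", "tensor", "rt")):
    """Return average pipe utilization (0..1) for a hypothetical scheduler.

    Toy assumptions: every op occupies its pipe for `latency` cycles,
    work is always available for every pipe, and at most `issue_width`
    ops can be issued per cycle.
    """
    busy_until = {p: 0 for p in pipes}   # first cycle at which the pipe is free again
    busy_cycles = {p: 0 for p in pipes}  # cycles spent executing inside the window

    for cycle in range(cycles):
        issued = 0
        # Try the pipe that became free earliest first, so no pipe is starved.
        for p in sorted(pipes, key=lambda name: busy_until[name]):
            if issued == issue_width:
                break
            if busy_until[p] <= cycle:
                busy_until[p] = cycle + latency
                # Only count cycles that fall inside the simulated window.
                busy_cycles[p] += min(latency, cycles - cycle)
                issued += 1

    return sum(busy_cycles.values()) / (len(pipes) * cycles)


if __name__ == "__main__":
    print(f"one issue per cycle:    {simulate(1):.2f}")
    print(f"three issues per cycle: {simulate(3):.2f}")
```

In this model, ops from all three pipes overlap in time in both cases, but with a single issue slot per cycle the pipes cannot all stay fed, so average utilization settles below the fully parallel-issue case. How closely real hardware resembles either case is exactly the open question here.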
 