AMD RDNA3 Specifications Discussion Thread

7600XT reviews are out.


The card is ~5% faster than the 4060 for a ~10% higher price. The 16GB of VRAM is an added bonus to set against the ~10% performance deficit vs the 4060 in RT.
Compared to the 7600 it's +10% w/o RT, but that jumps to +15-25% with RT.

Seems like a fine product for those who want 16GBs for cheap but don't want to mess with Intel products.
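A quick back-of-the-envelope sketch of what those rough percentages mean for value. The function name is made up for illustration, and the inputs are just the approximate figures quoted above, not measured data.

```python
def relative_value(perf_ratio: float, price_ratio: float) -> float:
    """Performance per dollar relative to a baseline card (1.0 = parity)."""
    return perf_ratio / price_ratio

# Approximate figures from the post: ~5% faster at a ~10% higher price than the 4060.
raster_value = relative_value(1.05, 1.10)
# In RT the post cites a ~10% deficit at the same ~10% price premium.
rt_value = relative_value(0.90, 1.10)

print(f"raster perf/$ vs 4060: {raster_value:.2f}")
print(f"RT perf/$ vs 4060: {rt_value:.2f}")
```

So on these numbers the card is slightly behind on perf/$ in raster and further behind in RT, with the extra VRAM being what you pay for.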
 
How do they get 16GB on a 128-bit bus? I thought GDDR6 topped out at 16Gb per chip. Clamshell?
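The capacity arithmetic behind the clamshell guess can be sketched roughly as below. It assumes each GDDR6 chip takes 32 bits of the bus in normal mode and 16 bits in clamshell mode, with 16Gb chips; the function name is hypothetical.

```python
def vram_gigabytes(bus_width_bits: int, chip_density_gbit: int, clamshell: bool) -> float:
    """Total VRAM in GB for a given bus width and per-chip density."""
    # Assumption: a GDDR6 chip uses 32 bits of the bus normally and
    # 16 bits in clamshell mode (two chips sharing one 32-bit channel).
    bits_per_chip = 16 if clamshell else 32
    chips = bus_width_bits // bits_per_chip
    return chips * chip_density_gbit / 8  # 8 gigabits per gigabyte

print(vram_gigabytes(128, 16, clamshell=False))  # 4 chips
print(vram_gigabytes(128, 16, clamshell=True))   # 8 chips
```

Under those assumptions a 128-bit bus with 16Gb chips gives 8GB normally and 16GB in clamshell, which would fit the question.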
 


 
So, today the Radeon RX 7900 GRE has been made an official retail product at 549 USD MSRP (609 euros in Germany, including taxes).
Specifications are no surprise since it was already available in the OEM space: 80 CUs / 40 WGPs, a 256-bit memory interface with 16 GB of GDDR6 memory at 576 GB/s, and 64 MiB of L3 "Infinity Cache".

All in all a nice addition to the portfolio, I think. Often not too far behind the 7900 XT.
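As a sanity check on the memory numbers above, the per-pin data rate can be backed out of the quoted bus width and bandwidth. This is only a sketch with made-up function names; the 18 Gbps figure is derived from the two quoted numbers, not taken from a spec sheet.

```python
def bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8

def implied_data_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate in Gbps implied by a bandwidth and bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits

rate = implied_data_rate_gbps(576, 256)
print(f"implied per-pin rate: {rate} Gbps")
print(f"check: {bandwidth_gb_per_s(256, rate)} GB/s")
```

A 256-bit bus at 576 GB/s works out to 18 Gbps per pin.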
 
Also, the 7700XT gets an official price reduction from $450 down to $420.
Nice. If it keeps coming down it could end up like the 6700XT, aka the best value card of the generation. 6700XT was originally $480 and came down to $350 and even less for quite a while before they were gone.
 
None of the N3x parts have that much time left to live.
 
Above is the recently open-sourced documentation on the MES hardware block and its firmware. Yes, it's the same subject that was the source of much controversy and drama for the ML startup ...

Apparently, a former AMD intern disclosed that the Work Graphs API is implemented on top of this HW unit. There is also a curious last bit in their 1st compute shader program settings register where you can change the scheduling policy, according to their AMDGPU kernel code. Their "oldest first policy" likely refers to the MES documentation's description of the round robin scheduling algorithm ...

The MES_SCH_API_ADD_QUEUE API has a convenient exclusively_scheduled bit where you can enable cooperative launch kernels, much like in CUDA, for doing grid-level synchronization between the different concurrently scheduled waves/warps. This API may be part of how they implement work graphs on their HW and is potentially related to the register mentioned in the AMDGPU kernel code. If the last bit in the register is set to 1 in combination with the exclusively_scheduled bit in the API, the HW can then enable the condition of "forward progress" for persistent threads ...

The MES_SCH_API_SET_GANG_SUBMIT API is very likely used to implement amplification/task shaders in a mesh shading pipeline, based on a driver developer's account of the existence of said functionality. As known previously, amplification shaders are emulated with compute shaders while mesh shaders are implemented with primitive shaders on AMD HW. Compute shaders are executed on a compute queue and graphics shaders are executed on a graphics queue, so the mesh shading pipeline ends up being implemented with two queues. That is why we have this "queue pairing" API: it is intended to support a single combined submission of two different shaders that use mutually exclusive HW pipelines (amplification/compute vs primitive/graphics) on the two different queues, hence the "gang submit" name ...
 