AMD: RDNA 3 Speculation, Rumours and Discussion

While packaging chiplets is more expensive than building a monolith, those monoliths suffer from exponentially worse yields
It is also a comparison rigged in favour of monoliths, once one considers that 3D packaging solutions can break the reticle limit, and that the economies of scale (for foundries) are strong once the first movers prove themselves and the flywheel gets going. Everyone across the spectrum wants cheap(er) 3D KGDs.
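
For a rough sense of scale, here's a back-of-the-envelope sketch using the classic Poisson yield model, Y = exp(-D*A). The defect density and die areas below are hypothetical, purely to illustrate why yield falls off exponentially with area:

```python
import math

# Simple Poisson defect model: yield = exp(-D * A).
# Defect density and die areas are made-up figures, purely illustrative.
def die_yield(area_mm2, defects_per_cm2):
    return math.exp(-defects_per_cm2 * area_mm2 / 100.0)

D = 0.2  # hypothetical defects per cm^2

for area_mm2 in (150, 300, 600):
    print(f"{area_mm2:>3} mm^2 die: {die_yield(area_mm2, D):.1%} good dies")

# 150 mm^2 die: 74.1% good dies
# 300 mm^2 die: 54.9% good dies
# 600 mm^2 die: 30.1% good dies
```

Doubling the area squares the yield fraction, so losses compound worse than linearly, and with chiplets you only scrap the defective die rather than a whole reticle-sized monolith, which is what eventually pays back the packaging premium.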

Though Nvidia has large enough margins and a tight grip on key markets to swallow it for at least a generation, I suppose. :)

(PVC is exclusive to the ICU, ahem, Intel Cinematic Universe. Let it be.)
 
Thinking about the pin-outs of the GCDs and MCDs:

The MCDs already, in theory, have to be designed to support 180-degree rotational symmetry, assuming that both GCDs in a SKU have the same layout and that more than 2x MCDs are required to bridge the GCDs. This would be
  • GCD 1 zone A connecting to GCD 2 zone D
  • GCD 1 zone B to GCD 2 zone C
  • GCD 1 zone C to GCD 2 zone B
  • GCD 1 zone D to GCD 2 zone A
One reason to make a GPU from chiplets is that you can reduce the count of chiplets in some SKUs.

So instead of a SKU that uses 2x GCDs + 4x MCDs, you'd like to be able to make a SKU from one GCD. But the MCDs that are designed to fit on top of a pair of GCDs need to be "supported" by some kind of "blank die" that isn't a GCD.

An alternative is to design the MCDs so that they can be rotated by 90 degrees and still be pin-compatible with the GCD. So now this would be a SKU with 1x GCD and 2x MCDs, with each MCD connecting to two zones on the GCD.

So instead of a single MCD connecting one GCD cache zone (A) on chiplet 1 to a cache zone (D) on GCD chiplet 2, the MCD would connect cache zones A and B on a single GCD. Then a second MCD would connect cache zones D and C on that same GCD.
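
To make the connectivity concrete, here's a toy model of the two layouts. The zone labels and pairings just restate the post above; none of this is a known Navi 31 floorplan:

```python
# Hypothetical cache zones A-D per GCD, as described above.

# 2x GCD SKU: each MCD bridges a zone on GCD 1 to the 180-degree-rotated
# zone on GCD 2.
two_gcd_links = {
    ("GCD1", "A"): ("GCD2", "D"),
    ("GCD1", "B"): ("GCD2", "C"),
    ("GCD1", "C"): ("GCD2", "B"),
    ("GCD1", "D"): ("GCD2", "A"),
}

# 1x GCD SKU: each MCD is rotated 90 degrees and straddles two adjacent zones
# on the same GCD, so two MCDs cover all four zones and no blank die is needed.
one_gcd_links = {
    "MCD1": [("GCD1", "A"), ("GCD1", "B")],
    "MCD2": [("GCD1", "D"), ("GCD1", "C")],
}

# Sanity check: every zone of GCD 1 is serviced exactly once in both SKUs.
assert sorted(zone for (_, zone) in two_gcd_links) == list("ABCD")
assert sorted(zone for ends in one_gcd_links.values() for (_, zone) in ends) == list("ABCD")
```
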
I solved this problem!

[Attached image: b3da047.png]

It's actually quite a simple solution, I just needed a glass of Cabernet Sauvignon to lubricate the neurons...

It amuses me that the red dots in the 2x GCD configuration are reminiscent of the stochastic arrangement of sample points in 4xMSAA of old :cool:
 
Why do I feel like RDNA 3 Navi 31 is gonna be the next Radeon 9700/Pro (R300) ? Maybe cause it's been exactly 2 decades? Maybe I'm excited by the chiplet approach. Maybe I'm excited about the potential for an ultra high-end HBM3 halo product sometime in 2023. Maybe lotta things.
 
The only way it will get positive reviews is if ray tracing performance is on par.

I'm fairly doubtful AMD will improve ray tracing performance enough. It'll probably be a repeat of the tessellation "pain" which took most of a decade to get sorted.

Was tessellation really that bad? It didn’t feel like a decade before AMD caught up. The funny thing with tessellation is that it’s hard to tell when games are using it and how much of it. It’s rarely available as an option in graphics settings. So is it omnipresent or is nobody using it?

Why do I feel like RDNA 3 Navi 31 is gonna be the next Radeon 9700/Pro (R300) ? Maybe cause it's been exactly 2 decades? Maybe I'm excited by the chiplet approach. Maybe I'm excited about the potential for an ultra high-end HBM3 halo product sometime in 2023. Maybe lotta things.

AMD is overdue for a win. They have a lot of ground to make up though. I just hope they come within striking distance on RT so they won’t have to keep promoting half baked implementations in AMD sponsored games.
 
Tessellation performance never amounted to much outside of that tiny selection of GameWorks titles, which didn't run well on Nvidia hardware of the time either.
 
I just hope they come within striking distance on RT so they won’t have to keep promoting half baked implementations in AMD sponsored games.

What's the matter with IHVs lobbying ISVs to adopt their own ideal implementations of ray-traced effects?


Even Intel employees in the above thread tell every graphics programmer out there to avoid inline RT like the plague, which is somewhat ironic given that customizing graphics code is the nature of their profession ...

No IHV really likes the idea of having to spend more die space just to implement multiple redundant hardware paths to get optimal performance in all of the different APIs ...
 
What's the matter with IHVs lobbying ISVs to adopt their own ideal implementations of ray-traced effects?


Even Intel employees in the above thread tell every graphics programmer out there to avoid inline RT like the plague, which is somewhat ironic given that customizing graphics code is the nature of their profession ...

That's a fun thread, although the poll isn't really about inline vs callable shading. It's about doing hit shading in the RT pipeline or in a separate compute shader after recording the hits. I've seen signs of the latter while profiling some apps.

Intel’s motivation is obvious. Their sorting hardware will go to waste when doing inline RT and they lose an advantage over the competition. Both Nvidia and Microsoft recommend using callable hit shaders in the general case which also favors Intel.

In the end it’s performance that matters. If developers want the freedom to write ubershaders and optimize for coherency themselves they have to prove they can make it fast on some/all hardware. If the result is that it’s slow on all hardware just to even the playing field that’s not a win for us.

No IHV really likes the idea of having to spend more die space just to implement multiple redundant hardware paths to get optimal performance in all of the different APIs ...

Perhaps but that’s not applicable to AMD is it? No implementation of RT is fast on their current hardware so they don’t really get to have an opinion…yet.
 
In the end it’s performance that matters. If developers want the freedom to write ubershaders and optimize for coherency themselves they have to prove they can make it fast on some/all hardware.
Software rasterisation in UE5 is a nice wake up call. When you go really big you can do something magical.

Just because you write an inline RT shader it doesn't mean you can't write stuff to memory.
 
In the end it’s performance that matters. If developers want the freedom to write ubershaders and optimize for coherency themselves they have to prove they can make it fast on some/all hardware. If the result is that it’s slow on all hardware just to even the playing field that’s not a win for us.

I think if APIs (geometry shaders, tessellation, etc.) don't fit a developer's usage patterns or requirements then they'll fall out of use in spite of such possible performance benefits. Developers could opt to use pixel shaders over compute shaders for their lighting pass in deferred renderers to take advantage of their optimal tiling access patterns, render target compression, or the ability to use hardware VRS, but reality shows a different trend. Nanite doesn't use mesh shading to render its micropolygon meshes and in fact uses compute shaders despite not having the rasterizer at its disposal. Just because current usage of APIs isn't optimal now doesn't mean that graphics programmers won't be able to invent other ways in the future to show us otherwise ...
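
For reference, the core of a compute-style rasterizer for tiny triangles really is small. This CPU-side sketch is a generic illustration with made-up coordinates and packing, not Nanite's actual shader: an edge-function test over the triangle's bounding box, with depth packed into the high bits so a single max per pixel resolves visibility the way a 64-bit atomic would on the GPU.

```python
# Minimal CPU-side model of compute-shader micropolygon rasterization.
W, H = 64, 64

# Visibility buffer: depth in the high 32 bits, triangle id in the low 32 bits,
# so one max per pixel resolves both at once (a 64-bit atomic max on the GPU).
vis = [[0] * W for _ in range(H)]

def edge(ax, ay, bx, by, px, py):
    # 2D cross product of (b - a) and (p - a); >= 0 means p lies on or to the
    # left of the directed edge a -> b.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def raster_tri(v0, v1, v2, depth, tri_id):
    # Walk the (tiny) bounding box and test the three edge functions; for
    # pixel-sized triangles this is cheaper than feeding the HW rasterizer.
    xs, ys = (v0[0], v1[0], v2[0]), (v0[1], v1[1], v2[1])
    x0, x1 = max(int(min(xs)), 0), min(int(max(xs)) + 1, W)
    y0, y1 = max(int(min(ys)), 0), min(int(max(ys)) + 1, H)
    for y in range(y0, y1):
        for x in range(x0, x1):
            px, py = x + 0.5, y + 0.5
            w0 = edge(*v0, *v1, px, py)
            w1 = edge(*v1, *v2, px, py)
            w2 = edge(*v2, *v0, px, py)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:  # inside, for this winding order
                packed = (int(depth) << 32) | tri_id
                vis[y][x] = max(vis[y][x], packed)  # larger depth wins (reverse-Z style)

raster_tri((10.0, 10.0), (13.0, 10.0), (10.0, 13.0), depth=1000, tri_id=42)
print(sum(1 for row in vis for v in row if v), "pixels covered")
```

On the GPU the loop body runs inside a compute shader and the max() stands in for a 64-bit atomic; the win over fixed-function raster reportedly only appears once triangles shrink to around a pixel, which is exactly the regime being discussed here.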

Perhaps but that’s not applicable to AMD is it? No implementation of RT is fast on their current hardware so they don’t really get to have an opinion…yet.

It's too early to say which API will win out in terms of popularity among developers ...
 
There's no trouble for AMD; GPUs were scarce (availability problems, mining, etc.), so they could catch up to Intel and Nvidia with dedicated HW acceleration for ray tracing with RDNA3+. It's the consoles that missed the boat completely.
 
It doesn't?

It may use it to render meshes with larger triangle sizes (basically for the same purpose primitive shaders are used on PS5 with Nanite), but its meshes with smaller triangle sizes are firmly software rasterized with compute shaders ...

Epic Games are also working on a "programmable raster" feature which allows Nanite to support even more types of content, and chances are it's even more compute shaders ...

There are all sorts of reasons APIs become infeasible over time outside of just performance. Content? Awkward usage patterns? Other requirements? Etc ...

It's just too early to predict exactly how the future is going to look, because developers only have the faintest idea of what they're going to do, and we know even less. Therefore, it's too early for any one IHV to tailor their hardware designs around a specific API ...
 
It's just too early to predict exactly how the future is going to look, because developers only have the faintest idea of what they're going to do, and we know even less. Therefore, it's too early for any one IHV to tailor their hardware designs around a specific API ...

I think it's a bit like the pixel/vertex/HW T&L days. Some things are better done in (dedicated) HW, like decompression and media encoding (Apple ProRes, AV1, etc.), until 'normal' HW becomes fast enough. Now I think things can be done in hybrid fashion, but dedicated HW will still have its advantages.
 
It may use it to render meshes with larger triangle sizes (basically for the same purpose primitive shaders are used on PS5 with Nanite), but its meshes with smaller triangle sizes are firmly software rasterized with compute shaders ...
So it does then? I feel that the s/w rasterization portion here doesn't have much to do with how the triangles themselves are handled / rendered, and the latter can be done with mesh shaders, especially as mesh shaders are just a relatively minor optimization of compute shaders in relation to geometry processing.
 
So it does then? I feel that the s/w rasterization portion here doesn't have much to do with how the triangles themselves are handled / rendered, and the latter can be done with mesh shaders, especially as mesh shaders are just a relatively minor optimization of compute shaders in relation to geometry processing.

They could very well use mesh shading to render these dense micropolys but Epic Games doesn't do this because they found out this type of content is not a good fit for the HW rasterizer so they use compute shaders instead ...

How can one absolutely know what developers are going to do with ray tracing in the future and how that'll affect each IHV's HW implementation of ray tracing?
 
They could very well use mesh shading to render these dense micropolys but Epic Games doesn't do this because they found out this type of content is not a good fit for the HW rasterizer so they use compute shaders instead ...
But can you use mesh shaders with a s/w rasterizer? I don't see why it wouldn't be possible. No idea if the current UE5 build is using them in any way though.

How can one absolutely know what developers are going to do with ray tracing in the future and how that'll affect each IHV's HW implementation of ray tracing?
That's a different question and I feel that it's not entirely up to what developers are going to do but also up to what GPU h/w vendors will allow them to do. It's not a decision of s/w developers on how the GPU h/w will evolve.
Key point here is that RT h/w must provide a) performance and b) flexibility. Both of these are needed for RT to evolve, and looking at how DXR 1.0 did over the past 3 years it is pretty obvious that without that h/w there would be about zero ray tracing in modern games - despite even the old DX11 h/w being flexible enough to allow it. Which proves that flexibility on its own isn't enough here.

So saying that developers will do something which the h/w doesn't expect them to do is nice and all, and may even be true in some cases (likely very limited ones, like the Dreams renderer or CryEngine's RT), but there's also a huge chance that it won't actually happen en masse and most developers will opt to use what IHVs are proposing.

Note that UE5 itself didn't use RT h/w at first and there was even that argument that "RT h/w is dead!" at that point. Turns out that it's not and UE5 is using it in its release form.
 
Software rasterisation in UE5 is a nice wake up call. When you go really big you can do something magical.

Just because you write an inline RT shader it doesn't mean you can't write stuff to memory.

Sure, and if you're planning to just write parameters to memory, doing it inline seems like the natural fit. And for multi-bounce RT you can do a sort pass in compute after each trace pass. Pure software coherency solution. But that requires writing and managing a lot of state yourself.
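
As a simplified picture of that "write hit parameters, then sort in compute" flow, here's a toy sketch. The hit-record fields and the material-id sort key are assumptions for illustration, not any engine's actual layout:

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass
class HitRecord:
    ray_id: int
    material_id: int  # which material / hit "shader" this ray needs
    t: float          # hit distance
    prim_id: int

# Pass 1 (per bounce): an inline trace writes one compact record per ray
# instead of invoking a closest-hit shader immediately.
hits = [
    HitRecord(0, 7, 12.5, 101),
    HitRecord(1, 3, 4.0, 55),
    HitRecord(2, 7, 9.1, 102),
    HitRecord(3, 3, 6.6, 54),
]

# Pass 2: sort/bin by material so shading work for the same material is
# contiguous; adjacent threads then take the same code path and touch the
# same textures, which is the whole point of the software coherency pass.
hits.sort(key=lambda h: h.material_id)

for material_id, group in groupby(hits, key=lambda h: h.material_id):
    batch = list(group)
    # On the GPU this would be one (indirect) compute dispatch per material bin.
    print(f"material {material_id}: shade rays {[h.ray_id for h in batch]}")
```

Even in the toy version you can see the cost mentioned above: you now own the hit-record format, the per-bounce buffers, and the binning/dispatch logic that the RT pipeline would otherwise handle for you.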

The point isn’t that inline is inherently bad. The point is that developers need to prove that they can beat the IHVs at the optimization game.

One interesting thing to note in this debate is that it doesn’t make any difference to traversal or intersection speed. So AMD needs to improve there no matter what.
 
But can you use mesh shaders with a s/w rasterizer? I don't see why it wouldn't be possible. No idea if the current UE5 build is using them in any way though.

I think I have an idea of what you're asking, and the answer is no, because mesh shaders are still tied to the graphics pipeline: their output is fixed entirely for HW rasterizer consumption ...

It'd be a different story if mesh shaders were truly a part of the compute pipeline, exactly like compute shaders are ...
 