AMD RDNA3 Specifications Discussion Thread

Surely AMD and others must see that working smarter is the only way forward. Even if we manage to shrink several more nodes below 1nm, you run into an insurmountable cooling problem: we're talking about power densities, in watts per cm^2, greater than that of a nuclear reactor.
Where is research and development on graphene-based carbon nanotube transistors at? There used to be the usual buzz articles at least once a year, but it has been a long while since I've read anything about that field. Probably a decade or more away, if ever? :(
 
Here's a summary of where many of us are at. And it isn't anti-RT.
  • We all want RT. Full Stop.
  • We don't think the hardware is quite there yet for completely convincing RT without significant compromises in other areas that some of us find worse than what current RT implementations bring.
    • Which again, doesn't mean we are against RT.
    • We want RT.
  • Current hardware is a needed step towards that goal, and whichever company brings that hardware to us, whether NV, Intel or AMD, and regardless of how performant it is, it's still appreciated.
    • Which again, doesn't mean all of us are happy with the compromises involved for current RT implementations.
    • And again, doesn't mean we're against RT. :p
  • We, well most of us probably, are happy that there are people that find current RT implementations worth the price of admission.
    • That's good, because otherwise RT risks dying an early death.
    • Thus having people that find current implementations well worth the compromises involved is needed for the market.
      • What isn't needed is having some proselytize to the point where it almost feels like Jehovah's Witnesses saying you're damned unless you agree with the collective and embrace your lord and savior RT into your heart. :p
    • Keep in mind, it isn't just "AMD fans" that aren't happy with current RT implementations, owners of NV cards as well are not always convinced and many run games with RT disabled even on their RTX 3xxx cards.
      • Again, it must be stated that all of us (or at least the vast majority) can't wait until RT gets to the point where it's actually worth it to have it on whenever the option for it is presented.
      • What X person finds acceptable isn't necessarily what Y person finds acceptable.
  • Hell, even developers that are fully on the RT train with attempting to implement RT into their games aren't entirely happy with the state of RT hardware and where it's at.
    • That's why they question design choices and constantly talk about how the hardware could be better if they had more control over how aspects of RT are handled (the BVH tree, for example).
People should really disabuse themselves of the idea that people are against RT or hate RT just because they don't find current hardware accelerated RT to be worth enabling in every title that has hardware accelerated RT.

Regards,
SB
Give this man a f***ing medal, and please pin this comment on the forum or in every AMD/Nvidia RT thread.

Every interesting discussion on B3D descends into the same old tired recitation between the same three RT acolytes and their supposed crusade against the non-RT infidels. It’s exhausting and nauseating.

Thank you SB for restoring my sanity and faith in B3D. :mrgreen:
 
Still not seeing how this is a failure point. To me, and this is no disrespect to any developer, things have to mature to the point where you are allowed full control over the pipeline.
It took forever to get to compute and we're still just on the edge of it, but how many years was that really in development for? RT has really just started. I think people are asking for too much too soon.
I think we need to ask the obvious question of whether a programmable RT pipeline would have had acceptable performance, because that's like saying all of our initial GPUs should have gone with compute from the get-go and there should not have been fixed-function hardware, and I don't think that's true.
It's an evolution that requires innovations on both the software and hardware side of things.

Came to say this. I don’t get the DXR 1.x criticism. Clearly a more flexible first iteration of the API would have required more space on the die. Where would these extra transistors come from? Also it seems to rely on an assumption that smart developers using more flexible hardware can beat dedicated hardware at its own game with the same transistor budget. This has never been the case. Flexibility always comes at a cost.

The way I see it people are complaining that the first version of something isn’t more perfect. Yet they never explain how they would pay for their ideal first attempt.
 
Came to say this. I don’t get the DXR 1.x criticism. Clearly a more flexible first iteration of the API would have required more space on the die. Where would these extra transistors come from? Also it seems to rely on an assumption that smart developers using more flexible hardware can beat dedicated hardware at its own game with the same transistor budget. This has never been the case. Flexibility always comes at a cost.

The way I see it people are complaining that the first version of something isn’t more perfect. Yet they never explain how they would pay for their ideal first attempt.
What makes you so sure DXR currently exposes or is able to utilize all the capabilities hardware has?
 
Where did I say that? Clearly the hardware has more functionality than DXR exposes. That has been the case with DirectX forever. Hence IHV extensions.
In the part where you said a more flexible first iteration of the API would have required more transistors? At least that's the way it reads to me.
 
The difference is that compute didn't exist when the first GPUs released; now it does, and we just got probably the biggest compute innovation with Nanite. Not enough flexibility is a problem, but at least they solved it with software Lumen.
But we had CPU rendering before that, which was software rendering. It could do anything, but it was slow. Then 3D accelerators came along; they were limited, but they were fast. And after years of evolution and some mistakes along the way, now they are software and they are fast.

RT is slow. It’s going to be a similar process. One of these days it will be software and it will be fast, but right now it is faster and cheaper the way it is currently set up.

I’m sure they have reasons other than to frustrate developers
 
Where is research and development on graphene-based carbon nanotube transistors at? There used to be the usual buzz articles at least once a year, but it has been a long while since I've read anything about that field. Probably a decade or more away, if ever? :(
I’ve not heard of any advancement there yet. Any type of graphene processor would have been big news regardless of size.
 
In the part where you said a more flexible first iteration of the API would have required more transistors? At least that's the way it reads to me.

Those extra things certainly cost more transistors than not having them. The point is that the stuff folks are complaining about like lack of LOD and flexible BVH formats would cost a tremendous amount more. They’re not free.
 
The first exclusive current generation release with the Frostbite engine doesn't even feature ray tracing ...

Could this be a glimpse of the near future? If so then the industry must be in better hands than we thought ...
 
With the big gap between N32 and N33 are they going to get their money's worth and use N32 for 7700-7800XT, maybe 38-44-52-60? Seems like the only thing they can do
RT is slow. It’s going to be a similar process. One of these days it will be software and it will be fast, but right now it is faster and cheaper the way it is currently set up.
Yes, it's going to take years and many iterations of hardware, software, tools, workflows etc. to adjust and improve, and for people to get experience and really optimise things before it matures. Right now we're comparing 4 years of RTRT in games on 1st-3rd gen hardware vs decades of rasterisation optimisations, and wouldn't you know it, RTRT is far from perfect. 5th/6th gen RT hardware and new consoles, hopefully with 10x the RT perf and built on a decade of knowledge, will still be far from perfect, but it'll be a massive improvement I'm sure.
 
Those extra things certainly cost more transistors than not having them. The point is that the stuff folks are complaining about like lack of LOD and flexible BVH formats would cost a tremendous amount more. They’re not free.
Of course not, but since current hardware seems to be more flexible than the API we have (as you pointed out, extensions are out there), why the assumption that a more flexible API would have required more transistor budget than they've already spent on things DXR doesn't cover?
I'm not a dev or an expert or anything, but don't the methods described here, for example, bring LOD to RT, which you suggested would require tremendously more transistors? https://gpuopen.com/learn/why-multi-resolution-geometric-representation-bvh-ray-tracing/
 
Came to say this. I don’t get the DXR 1.x criticism. Clearly a more flexible first iteration of the API would have required more space on the die.
Damn, no!
This is the key problem here: nobody gets that the criticism is all about the software API and the resulting abstraction of important data structures.
Fixing it would not need to change the HW, so it would not need more die space. The fix would (or must) work for all existing and future RT HW.
Nobody says we need programmable traversal to implement stochastic LOD, packet traversal, etc.

If you still don't get the reasons for the critique, I won't explain it yet again. I can't help it.
But I keep repeating that the requested changes are independent of the HW.
It's just that the longer they wait with the fix, the harder it becomes, because newer HW and more vendors mean more acceleration data structures, reducing the options for how to implement the fix.
Where would these extra transistors come from? Also it seems to rely on an assumption that smart developers using more flexible hardware can beat dedicated hardware at its own game with the same transistor budget. This has never been the case. Flexibility always comes at a cost.
No extra transistors. (Actually fewer, because we can turn the expensive background task of BVH building into the negligible background task of BVH streaming and conversion.)
More flexible HW isn't requested or needed.
The only cost of the flexibility is that driver development becomes more expensive.

The fix is ideally a common interface to access the BVH data structure. Just Software.
If it's already too late for that (i don't think so), we need vendor extensions, handling each architecture individually. Just Software.
The data structures are custom and can't be changed. That isn't requested. We just need an interface or specifications for those data structures, so we can build them on our side, instead of the driver doing black magic.
Those who don't need the extra flexibility can still use the driver - nothing is lost. No downsides for anybody.
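To make it concrete, the sketch below is the kind of "just software" interface meant here, written as C++. It's purely hypothetical: none of these types or entry points exist in DXR or in any vendor SDK, all names are made up, and a real version would have to match each vendor's actual node encoding.
[code]
// Hypothetical sketch of a vendor-published BVH access interface.
// Nothing here exists in DXR or any real driver; all names are invented.

#include <cstddef>
#include <cstdint>

// Description of the vendor's acceleration structure node layout.
struct BvhFormatDesc {
    uint32_t nodeSizeInBytes;      // size of one internal node
    uint32_t maxChildrenPerNode;   // e.g. 2 or 4, depending on the HW
    uint32_t leafPrimSizeInBytes;  // size of one encoded leaf primitive
};

// Minimal decoded node view the app could read or write.
struct BvhNodeView {
    float    boundsMin[3];
    float    boundsMax[3];
    uint32_t childOffsets[4];      // byte offsets into the BVH buffer, 0 = none
    uint32_t leafPrimOffset;       // offset to packed triangles, 0 = internal node
};

// The actual interface: the driver no longer has to rebuild the BVH as a
// black box; the app can stream in pre-built or offline-converted nodes.
class IBvhAccess {
public:
    virtual ~IBvhAccess() = default;
    virtual BvhFormatDesc QueryFormat() const = 0;
    // Decode the raw node at 'byteOffset' in the acceleration structure buffer.
    virtual BvhNodeView ReadNode(const void* bvhBuffer, size_t byteOffset) const = 0;
    // Encode an app-side node into the vendor format at 'byteOffset'.
    virtual void WriteNode(void* bvhBuffer, size_t byteOffset,
                           const BvhNodeView& node) const = 0;
};
[/code]
With something like that, the app could do the BVH streaming and conversion itself instead of waiting on the driver's builder, which is exactly the "just software" part of the request.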

I don't see how such a request confronts any religions or glorifications of certain rendering techniques, HW vendors, or HW vs. SW preferences.
It doesn't, so anybody can be fine with fixing those broken things, no matter who their shiny god or sponsor is. Damn it.
Even the people who do understand the problem still don't get this: The problem is software alone, not hardware.
 
But we had CPU rendering before that, which was software rendering. It could do anything, but it was slow. Then 3D accelerators came along; they were limited, but they were fast. And after years of evolution and some mistakes along the way, now they are software and they are fast.

RT is slow. It’s going to be a similar process.
No, it's not the same process.
The process of making GPUs programmable required (huge) changes to the HW. So it couldn't have happened overnight, even if they had known it was coming.

Comparing the flexibility issues of DXR with those long years of progress is just as pointless as comparing rasterization with ray tracing. It's apples vs. oranges. You all need to get past that point.

I’m sure they have reasons other than to frustrate developers
It was rushed, and decisions were short sighted.
Idk what the reasons for that were, but shit happens.
Intel had to disable AVX-512 on a whole architecture of chips. How could that happen? Why didn't they exclude the feature from the start, to save the die space and money?
Idk either, but it's much the same kind of shit that has happened here, even though they are experts and should have known what they were doing.
 
It was rushed, and decisions were short sighted.
Are we sure it's an API restriction?
Does Vulkan give you access?
NV may have wanted not to expose it, for whatever reason.

MS obviously sees a benefit, otherwise they wouldn't have given more access on consoles (which are obviously RDNA-based).
But there's no point giving that on PC if it's not supported by NV at the moment.
 
I can't believe people had a huge back and forth argument about whether or not people hated RT or didn't want it.

I would think everyone wants new tech, but doesn't think it's there yet to commit to fully, and wants to wait for tools/software and hardware to all mature enough for it to be a mostly uncompromised experience.

People were mad when UE4 got rid of SVOGI all those years ago because it would have been too taxing for the consoles to use in that form. But if they had kept it in, it would have been untenable for the majority of hardware to begin with, not just the consoles, which would not have been good for anyone. It was something that had to happen. Now look: a gen later, UE5 has a software solution in Lumen that is even better than SVOGI and makes it look obsolete. And they even have support for HW RT on top of that, which looks even better.

My main point is that being pragmatic is good in times like this.
 
Of course not, but since current hardware seems to be more flexible than the API we have (as you pointed out, extensions are out there), why the assumption that a more flexible API would have required more transistor budget than they've already spent on things DXR doesn't cover?
I'm not a dev or an expert or anything, but don't the methods described here, for example, bring LOD to RT, which you suggested would require tremendously more transistors? https://gpuopen.com/learn/why-multi-resolution-geometric-representation-bvh-ray-tracing/

That paper is interesting. It proposes an extension to AMD’s traversal implementation, which is theoretically possible because traversal is already done in software. However it also requires changes to AMD’s BVH intersection implementation, which isn’t done in software. So it does in fact need new hardware.

The approach depends on additional precomputed values in each BVH node which means no support for animation or higher BVH update cost. It also requires the BVH intersection hardware to do more calculations and return more data to the traversal shader. The gist is that instead of traversing deeper into the tree you can decide to stochastically sample just one of the triangles underneath the current node and use that as an approximation for the pixel covered by that node.

They use an OpenCL implementation to prove the concept but there are a few challenges aside from the need for new hardware. Naive random sampling of a single triangle within a node can lead to extreme error depending on the diversity of materials and geometry in that level of the BVH.
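To make that traversal idea concrete, here is a toy CPU-side sketch in C++ of the stochastic cut described above. The binary node layout, the ray/box test and the screen-size estimate are all placeholders I made up; it's not how the paper or any hardware implements it, just the naive single-triangle sampling.
[code]
#include <random>
#include <vector>

// Toy binary BVH node; the layout is invented purely for illustration.
struct Node {
    float boundsMin[3], boundsMax[3];
    int   left = -1, right = -1;   // child indices, -1 = none
    std::vector<int> triangles;    // indices of all triangles under this node
};

// Placeholders standing in for a real ray/box test and screen-size estimate.
static bool RayIntersectsBounds(const Node&) { return true; }
static float ProjectedSize(const Node& n) {
    float d = 0.f;  // crude stand-in: squared box diagonal
    for (int i = 0; i < 3; ++i) {
        float e = n.boundsMax[i] - n.boundsMin[i];
        d += e * e;
    }
    return d;
}

// Returns the index of a triangle chosen as the approximation, or -1 on miss.
int TraverseStochasticLod(const std::vector<Node>& bvh, int root,
                          float lodThreshold, std::mt19937& rng)
{
    std::vector<int> stack = { root };
    while (!stack.empty()) {
        const Node& n = bvh[stack.back()];
        stack.pop_back();
        if (!RayIntersectsBounds(n)) continue;

        // Stochastic LOD cut: the node looks small enough on screen, so
        // instead of descending further, pick one triangle beneath it at
        // random and use it as a proxy for the whole subtree.
        if (ProjectedSize(n) < lodThreshold && !n.triangles.empty()) {
            std::uniform_int_distribution<size_t> pick(0, n.triangles.size() - 1);
            return n.triangles[pick(rng)];
        }
        if (n.left  >= 0) stack.push_back(n.left);
        if (n.right >= 0) stack.push_back(n.right);
        // A full traversal would also do exact triangle tests for nodes that
        // stay above the threshold; omitted here for brevity.
    }
    return -1;
}
[/code]
That random pick is exactly where the error described above comes from: one triangle stands in for everything underneath the node, however varied its materials and geometry are.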
 
The fix is ideally a common interface to access the BVH data structure. Just Software.
If it's already too late for that (i don't think so), we need vendor extensions, handling each architecture individually. Just Software.
The data structures are custom and can't be changed. That isn't requested. We just need an interface or specifications for those data structures, so we can build them on our side, instead of the driver doing black magic.
Those who don't need the extra flexibility can still use the driver - nothing is lost. No downsides for anybody.

Do you have specific examples in mind of useful API extensions? The DXR interface is essentially “build an acceleration structure with this bag of triangles”. It does not mandate that the structure is a BVH or anything else. This gives maximum flexibility to the IHV and more room to innovate rapidly on the hardware side. The downside is that it’s completely opaque to developers.
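For reference, the sketch below (C++/D3D12) shows roughly what that “bag of triangles” hand-off looks like on the app side for a single bottom-level build. It's a minimal sketch under assumptions: the vertex, scratch and result buffers are created elsewhere (sized from the prebuild info), and barriers, error handling and the top-level build are all omitted.
[code]
#include <d3d12.h>

// Minimal DXR bottom-level acceleration structure build. Assumes the caller
// already created the buffers and handles synchronisation.
void BuildBlasSketch(ID3D12Device5* device,
                     ID3D12GraphicsCommandList4* cmdList,
                     ID3D12Resource* vertexBuffer, UINT vertexCount,
                     ID3D12Resource* scratchBuffer,
                     ID3D12Resource* resultBuffer)
{
    // "Here is my bag of triangles."
    D3D12_RAYTRACING_GEOMETRY_DESC geom = {};
    geom.Type  = D3D12_RAYTRACING_GEOMETRY_TYPE_TRIANGLES;
    geom.Flags = D3D12_RAYTRACING_GEOMETRY_FLAG_OPAQUE;
    geom.Triangles.VertexBuffer.StartAddress  = vertexBuffer->GetGPUVirtualAddress();
    geom.Triangles.VertexBuffer.StrideInBytes = sizeof(float) * 3;
    geom.Triangles.VertexFormat = DXGI_FORMAT_R32G32B32_FLOAT;
    geom.Triangles.VertexCount  = vertexCount;

    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS inputs = {};
    inputs.Type  = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
    inputs.Flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE;
    inputs.NumDescs       = 1;
    inputs.DescsLayout    = D3D12_ELEMENTS_LAYOUT_ARRAY;
    inputs.pGeometryDescs = &geom;

    // The driver reports how big its structure will be, but not what's in it.
    D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO prebuild = {};
    device->GetRaytracingAccelerationStructurePrebuildInfo(&inputs, &prebuild);

    // The build fills the result buffer with an opaque, vendor-specific
    // structure the app never gets to interpret.
    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC build = {};
    build.Inputs = inputs;
    build.ScratchAccelerationStructureData = scratchBuffer->GetGPUVirtualAddress();
    build.DestAccelerationStructureData    = resultBuffer->GetGPUVirtualAddress();
    cmdList->BuildRaytracingAccelerationStructure(&build, 0, nullptr);
}
[/code]
Everything past that call is vendor territory: the result buffer holds whatever layout the driver chose, which is the opacity being argued about in this thread.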

We’ve had this debate before. In order to provide developers with more access, Microsoft would have to dictate the data structure that all IHVs must use. This is not a guaranteed win. Just like DirectX evolved over time to gradually expose more flexibility, so will DXR. History suggests that giving developers free rein from day one is probably not a good idea anyway. DX12 is clear evidence of that.
 