I now believe that AMD's custom work is a lot more detailed than I first thought.
I always thought it was mainly big blocks: media blocks, ROPs, front end, cache sizes, then specific customer customizations.
Now I think it comes down to individual features as well.
VRS, INT4/INT8, etc.
So it's not about removing features from RDNA2; it's about paying to have them included from a list of available features.
So then you have to consider Sony's and MS's possibly different reasoning.
Take reduced precision as an example:
Sony
- What use will it have in games
- Cost of inclusion
- Spend that money on other features (cache scrubbers etc)
- Can be done without reduced precision for the odd situation where it would be nice to have.
MS
- Hardly any die footprint
- Light Azure ML workloads when not running xCloud
Back when these were being designed, DLSS wasn't a thing, etc.
Much like Nvidia, it's now up to MS to make use of features that may be more unique.
ML texture de/compression and ML upscaling, for example.
These are all things that can be done on all platforms but may have a decent performance benefit on Xbox. They can be shared across internal studios and also be included in PlayFab. Same for AVX libraries.
RDNA 1 has dot product hardware? How do you use it? There is no mention of it in the RDNA 1 ISA from what I can tell, and the RDNA 2 ISA lists added dot product instructions as a notable feature change for RDNA 2.
{"v_dot2_f32_f16", GCNENC_VOP3P, GCN_STDMODE, 19, ARCH_NAVI_DL },
{"v_dot2_i32_i16", GCNENC_VOP3P, GCN_STDMODE, 20, ARCH_NAVI_DL },
{"v_dot2_u32_u16", GCNENC_VOP3P, GCN_STDMODE, 21, ARCH_NAVI_DL },
{ "v_dot4_i32_i8", GCNENC_VOP3P, GCN_STDMODE, 22, ARCH_NAVI_DL },
{ "v_dot4_u32_u8", GCNENC_VOP3P, GCN_STDMODE, 23, ARCH_NAVI_DL },
{ "v_dot8_i32_i4", GCNENC_VOP3P, GCN_STDMODE, 24, ARCH_NAVI_DL },
{ "v_dot8_u32_u4", GCNENC_VOP3P, GCN_STDMODE, 25, ARCH_NAVI_DL
Agreed. It seems a bit much for AMD to go through the process of redesigning hardware blocks just to remove or add features, especially ones that cost very little in terms of silicon. Just disable them in the BIOS.
FidelityFX is just the latest name for AMD's set of optimization libraries; it's been around at least since mid-2019 and it's been used on dozens of games already.
Is there an ETA on this tech's implementation?
It doesn't need to be a redesign. Maybe AMD sells features like BMW: your hardware comes with all the bells and whistles, but AMD won't expose them through software unless you pay for it (you can actually buy features through your BMW dash, WTF). LOL.
It seems a bit much for AMD to go through the process of redesigning hardware blocks just to remove or add features, especially ones that cost very little in terms of silicon. Just disable them in the BIOS.
This is about the benefits of ML. Also, MS was talking about super resolution using DirectML back during spring of 2018, so it's not new to them.
https://devblogs.microsoft.com/directx/gaming-with-windows-ml/
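To make the ML angle concrete: the bulk of the work in something like an ML upscaler is multiply-accumulates, and once the model is quantized to INT8 the inner loop is exactly the int8 × int8 → int32 pattern the dot4/dot8 instructions accelerate. A rough, hand-rolled C++ sketch (illustrative only; the function names and layer shape are made up, and a real DirectML stack would generate this kind of code for you):

#include <cstdint>

// Illustrative INT8-quantized dot product: the kind of work an ML
// super-resolution pass spends most of its time on. Each step is an
// int8 * int8 multiply accumulated into an int32; four of these map
// onto a single v_dot4_i32_i8 when the hardware supports it.
int32_t quantized_dot(const int8_t* weights, const int8_t* activations, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; ++i)
        acc += int32_t{weights[i]} * int32_t{activations[i]};
    return acc;
}

// Hypothetical example: one output value of a 3x3 convolution over
// 32 input channels, requantized back to int8 with a per-layer scale.
int8_t conv_output_pixel(const int8_t* w, const int8_t* x, float scale)
{
    int32_t acc = quantized_dot(w, x, 3 * 3 * 32);
    float y = static_cast<float>(acc) * scale;                     // dequantize
    int32_t q = static_cast<int32_t>(y + (y >= 0 ? 0.5f : -0.5f)); // round
    if (q > 127) q = 127;                                          // clamp to int8 range
    if (q < -128) q = -128;
    return static_cast<int8_t>(q);
}

Whether a GPU can issue that as packed dot products or has to fall back to one multiply-accumulate per instruction is exactly the throughput difference being argued about here.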
This has been the strongest reason presented (for me) to believe it's just on every GPU.
But it makes no sense for Navi 14 - which is a medium-low range discrete GPU for gaming - to have dot4-INT8 + dot8-INT4 capabilities, as it's definitely not an important feature for its target market.
The ALUs themselves are hardware blocks, as are the components in the CUs.
This has been the strongest reason presented (for me) to believe it's just on every GPU.
But for semi-custom parts, if you don't think you need it and there's a cost, you may not take it.
I think I misrepresented my thoughts before, making it possibly sound like I was suggesting a totally different hardware design.
AMD must have many different features so they can sell custom parts at different price points.
Even if it just means not exposing it at that hardware level.
I'm also not saying Sony doesn't have it, just that I could see why they wouldn't.
I'm unsure why they couldn't just not expose the lower precision functions though.
The ALUs themselves are hardware blocks, as are the components in the CUs.
So I would disagree that it’s in every Navi.
Semi-custom is often just a selection of various component blocks: in this case, choosing mixed-precision dot products means selecting a different type of ALU block that supports them. It will run a higher silicon cost to support multiple pathways.
I’m pretty confident it’s a hardware pathway that requires silicon and a redesign of the ALU blocks to support it.
I'm unsure why they couldn't just not expose the lower precision functions though.
We've seen them disable features at the hardware level many times for many different reasons.
Can't say I disagree with your or @ToTTenTranz's reasoning. Just that either way I could see it not being exposed.
Yeah, I should have written Super Resolution.
FidelityFX is just the latest name for AMD's set of optimization libraries; it's been around at least since mid-2019 and it's been used on dozens of games already.
If you're talking about Super Resolution / FSR then there's no date but Scott Herkelman said they're planning to "release it this year".
The problem here is we don't know if FSR "will be shown this year" or the first title supporting FSR is releasing this year. They could be two very different things.
AMD is being awfully secretive about FSR, yes.
Yeah, I should have written Super Resolution.
AMD has a page on FidelityFX "supported games", including upcoming games. But none of the ones listed include FSR as a planned feature.
That would be quite stupid but wouldn't be surprising. There's no way AMD is going to match Nvidia in combined performance. If AMD can at least boost perf via FSR to make RT more viable for AMD customers then it's great, at least for this gen.
In the PC Gamer conversation I believe he said FSR would make Nvidia's RT performance advantage a "moot point" (though I don't know if he was taking DLSS into the equation).
In June 2019 (4 months before the release of RX5500 / Navi 14), these were added for the Navi architecture:
GFX1011 is Navi 12 (Apple's Radeon Pro 5600M) and GFX1012 is Navi 14 (RX/Pro 5500/M, RX/Pro 5300/M). In LLVM they're both listed as supporting dot4 INT8 and dot8 INT4 instructions with a 32-bit accumulator:
https://llvm.org/docs/AMDGPU/AMDGPUAsmGFX1011.html
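As a rough illustration of how that shows up at the source level, a HIP (C++) kernel can reach these operations through clang's AMDGPU builtins. This is a hedged sketch: it assumes __builtin_amdgcn_sdot4 is available for the chosen offload target (e.g. --offload-arch=gfx1012); the builtin name and its availability vary with the LLVM/ROCm version.

#include <hip/hip_runtime.h>

// Hypothetical sketch, not a tested sample: each 32-bit operand packs four
// signed 8-bit values, and one call performs four INT8 multiply-accumulates
// into a 32-bit accumulator (lowering to v_dot4_i32_i8 on supported targets).
__global__ void dot4_kernel(const int* a, const int* b, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = __builtin_amdgcn_sdot4(a[i], b[i], /*accumulator=*/0, /*clamp=*/false);
}

On a GPU without the dot instructions the same math still works, just as ordinary unpack-multiply-add code at a fraction of the per-instruction throughput.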
So far, Navi 10 is the only Navi GPU without support for higher throughput for INT8/INT4.
In fact, save for the 7nm APUs using older Vega iGPUs, Navi 10 is the only GPU chip released by AMD since 2019 that does not support dot4 INT8 / dot8 INT4.
Even Vega 20 already had support for this.
If I had to guess, there were never any plans to release any RDNA GPU without INT8/4 RPM, as it seems to be intrinsic to the base architecture. It was just broken on the latest Navi 10 tapeouts and AMD chose not to delay the RX5700 release any longer (with rumors of Navi 10 being a headache to AMD giving some credence to this).
The "it's there as an option" theory would make sense if these were only available for e.g. Navi 12 on Apple computers, with that company pushing for better ML inference performance. But it makes no sense for Navi 14 - which is a medium-low range discrete GPU for gaming - to have dot4-INT8 + dot8-INT4 capabilities, as it's definitely not an important feature for its target market.
Agreed.
Which is again another reason why it would be very odd if AMD had just decided to redesign the RDNA1 CUs for Navi 12 + Navi 14 to have the hardware for faster ML inference, and then take it off on RDNA2 very specifically for the PS5, and then put it on again for the Microsoft RDNA2 consoles, and leaving it there again for all RDNA2 discrete GPUs.
Link? Why does Navi 14 have dot4 INT8 / dot8 INT4, then?
Why would AMD decide to include an ML inference feature on a low/mid-end GPU focused on gaming?
If this isn't an inherent capability of the RDNA WGPs, why include it in a low-margin / low-cost part where die area is critical to achieving profitability?
Out of all RDNA GPUs that have launched so far and have their specifications publicly available (5 RDNA1/2 dGPUs + 2 Series SoCs), how many have this capability absent? There's Navi 10, which was rumored to be very problematic for AMD, and...?
That would be quite stupid but wouldn't be surprising. There's no way AMD is going to match Nvidia in combined performance. If AMD can at least boost perf via FSR to make RT more viable for AMD customers then it's great, at least for this gen.
(...)
So yes, we are at a deficit [against Nvidia]. There's no doubt about it. I think when we come out with FSR that point will be moot, and that's why we're working on that so significantly now.
The only Pro SKUs using Navi 12 and Navi 14 are the ones going into the 16" MacBook Pro, which have zero mention of ML inference on their website, as they're presented as GPUs for 3D content creation and video content.
Probably because the majority of the SKUs related to Navi 12 and Navi 14 are Pro cards. The fact that they are in 5300 and 5500 based cards may be due to a reality where AMD didn't want to create a separate Navi 13 (or whatever) to accommodate a lower end design meant for consumer graphics.
There is no proof whatsoever pointing to the PS5 not having what seems to be a basic feature of almost all the RDNA GPUs (all save one), and very little reason for Sony to just take it out.
The PS5 doesn't seem to have them, which would be weird if they were an intrinsic part of the RDNA design. How are they broken in the PS5 but not in the XSX, and if they aren't broken why would Sony choose not to expose them? They are basically free in that circumstance.
Link?
I'm not sure I've seen Dot4 and Dot8 mixed precision outputs on a standard Navi.
There's nothing in a whitepaper to suggest it does for standard Navis.
When asked about raytracing performance competitiveness:
Like I said, quite the bullish statement he made here, and you can see a somewhat proud smile when he says this.
I also find it hard to believe it can match Nvidia's RT+DLSS2, but it might level things up a bit.
I mean stupid to compare a possible RT+FSR solution against Nvidia RT only, without DLSS.
Stupid to say or stupid to do?
The statement as I recalled it does seem really farfetched, but I just revisited it here, at ~42m20s:
Yea, the Navi 10, just going off the white paper.
What's "standard Navis"?
- For Navi 12 + Navi 14 you can follow the links in this post;
- For Navi 21 and Navi 22 it's well documented and reported;
- For the Series X SoC it's well documented and reported (on Microsoft slides);
- For the Series S SoC it's well documented and reported (on Microsoft slides).
If by "standard Navis" you mean "exclusively Navi 10", then there is no dot4/dot8 mixed precision output in there. Not functional, at least.
I'm not sure what makes Navi 10 more standard than either of the other two RDNA1 dGPUs or even the other Navi 2x GPUs out there, though.
The only Pro SKUs using Navi 12 and Navi 14 are the ones going into the 16" MacBook Pro, which have zero mention of ML inference on their website, as they're presented as GPUs for 3D content creation and video content.
And if that capability is only interesting for Pro solutions, why include it in Navi 21 and Navi 22 that have no Pro counterparts at all?
Besides, is the 16" MacBook Pro really selling more than all the discrete cards + low-end system integrators?
There is no proof whatsoever pointing to the PS5 not having what seems to be a basic feature of almost all the RDNA GPUs (all save one), and very little reason for Sony to just take it out.
Sony not mentioning it isn't proof of anything, nor is one guy stating "there is no ML stuff" in a tweet he was quick to delete afterwards.