Yes, the Raster Unit is determining the peak geometry rate, but I am discussing the change in fragment output per triangle from 1-16 for RDNA1 to...
Are you using 'rasterisers' and 'scan converters' interchangeably? Given a Shader Engine, there are 2 scan converters in 2 Shader Arrays, and the...
There seems to be some vagueness in the definitions. We know that the Raster Unit spits out 1-32 fragments per triangle. There are 2 scan converters involved...
https://forum.beyond3d.com/threads/amd-radeon-rdna2-navi-rx-6800-6800-xt-6900-xt-2020-10-28.62091/page-53#post-2176773 A triangle can touch a...
PS5 has 64 ROPs (GitHub), so its pixel fillrate has increased from the previous gen and is plentiful: 64 x 2.23 GHz = 142.7 Gpix/s. Triangle throughput:...
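The fillrate figure above is simply ROP count times clock, assuming each ROP writes one pixel per clock. A minimal Python sketch of that arithmetic (the function name is mine, purely for illustration):

```python
def pixel_fillrate_gpix(rops: int, clock_ghz: float) -> float:
    """Peak pixel fillrate in Gpix/s, assuming 1 pixel per ROP per clock."""
    return rops * clock_ghz


# Figures quoted in the post: 64 ROPs at 2.23 GHz.
ps5 = pixel_fillrate_gpix(64, 2.23)
print(f"{ps5:.1f} Gpix/s")  # 142.7 Gpix/s
```

This is a peak number only; it says nothing about whether bandwidth or the front end can keep the ROPs fed.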
Yes, you can easily get lost in all the nomenclature. I've clarified this before and will reiterate. The driver leak, Hotchips and Navi21 block...
For Navi21, my explanation for this weirdness lies with the 2 Scan Converters working on 1 triangle. A coarse and a fine scan converter, which we...
No problem, thanks. https://forum.beyond3d.com/posts/2177723/ Above is discussion using 2 scan converters per raster unit, a coarse one and a...
Navi22 disagrees. It's a mid-range 40CU, highly clocked GPU with a narrower bus than the PS5. Console GPUs are comparable to mid-range GPUs at...
Yep, your thoughts here were pretty much my thoughts when I first saw the driver leaks last month; I didn't think much of them and brushed them...
Well, CUs do SIMD and Scalar computation and are fully programmable. The Ray Accelerator is a fixed function block and acts as the Intersection...
I posted details a few pages back. Here's my post, where I started discussing: https://forum.beyond3d.com/posts/2178977/ Poster, @Digidi...
No, I don't agree. This thread has "speculation" in its title. I haven't seen any discussion around the driver leak and relevant patents (mostly...
The final numbers are correct, however, the formula is incorrect. ROPs aren't used to calculate CU floating point capabilities as they are...
That is not to say that the location of the Raster Unit is better or worse, but the change from RDNA1 to RDNA2 is better. And by better, I mean RDNA2...
Agree to disagree.
Thank you!
I'm not ignoring it. We already discussed Southbridge IO and SSD IO. We already used XSX IO and its SSD IO as a basis because consoles strip out...
I would avoid using the IC term then, because it has a particular meaning with RDNA2 PC GPUs. With PS5, the only block that we know is in the...
I found these patents and posted them on Resetera a while ago:...
Yes, so one must judge whether one's personal time is worth the effort to 'fix' what is deemed unworthy. Or let it spiral.
Nonspecific commentary does not help anyone if you need clarification.
IC is just a marketing term for SRAM. We have not been discussing performance or cost, but rather whether a hypothetical die can hold x amount of SRAM....
Yes, I can agree that waves per SIMD alone doesn't show a direct link - that's why I mentioned multiple elements suggesting Packers are post scan...
No doubt, IC is a bandwidth saving feature. It will eat up die space, and its cost effectiveness with existing cache management on PS5 isn't...
The driver leak has changes to SIMD waves, which combined with the slide about RB+ and Packers connected to Scan Converters in the driver leak as...
I don't think Cerny cared for IC (a marketing term), rather he designed the PS5 for low latency and data throughput. The IO Complex is the heart...
What's the latest count now - 333 sq mm with 64MB IC? See below for IC SRAM density. If you remove 43 sq mm for 64MB IC, PS5 die is around 290 sq...
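To make the subtraction above explicit, here is a rough sketch using only the two figures quoted in the post (333 sq mm total, 43 sq mm for 64MB). The density it prints is just what those two numbers imply when divided; it is not necessarily the SRAM density figure referenced in the post, and the function names are hypothetical.

```python
def die_without_cache(total_mm2: float, cache_mm2: float) -> float:
    """Die area left after removing the cache block."""
    return total_mm2 - cache_mm2


def implied_density_mb_per_mm2(cache_mb: float, cache_mm2: float) -> float:
    """MB per sq mm implied by a cache size and the area budgeted for it."""
    return cache_mb / cache_mm2


remaining = die_without_cache(333, 43)        # figures from the post
density = implied_density_mb_per_mm2(64, 43)  # implied by 64MB in 43 sq mm
print(remaining, round(density, 2))  # 290 1.49
```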
I've already done a comprehensive breakdown using XSX as a baseline. It would be interesting to see whether the same conclusion is reached with an alternative...
If we say PS5's die is 305 sq mm, my estimates leave 15-18 sq mm for: - SSD IO Complex logic (additional from XSX SSD IO logic) - SSD IO SRAM -...
Is this 5 sq mm enough of a difference to cover SSD IO logic between XSX and PS5? Remember, PS5 IO Complex has: - SRAM, undisclosed amount - Kraken...
I asked a few times earlier in the thread, but I didn't get clarification. Recall that there are 4 Packers per Scan Converter, so Navi21 has 32...
Doesn't seem unbalanced to me. It looks like each Raster Unit has 2 scan converters. Each RDNA1 scan converter rasterised a triangle with 16...
I've seen the slide where they show 4 Prim Units for Navi21, 1 Prim Unit per Shader Engine rather than Shader Array. Which slide talks about...
This would be a good question. I would also ask if the culling rate is still 2 triangles per clock per Prim Unit, or is it higher now? BTW, we...
I see, you are looking to make adjustments. Do you have a high resolution die shot for Navi21 - ideally with its IO labelled like the XSX die shot...
See my edit. By my estimates, your 30 sq mm is enough to cover 4 PHYs and 4 MCs. Are these posters on Resetera? I did a huge breakdown there a...
MCs and PHYs are not the same. You still need to add 4 PHYs to your additional 30 sq mm. I'm referring to the PS5's SSD IO Complex, which is the...
It doesn't work by halving Navi21's die; you are not accounting for PS5's APU with these: - need to add back 4 memory controllers as 8 is still...
I'm thinking some algorithm with 2 scan converters that can be overlapped and their difference is a smaller triangle, and as small as 1 fragment...
6x6 is 36 pixels - you mean 32 pixels? Maybe 2 scan converters are involved per raster unit?
Well, @3dcgi mentioned Navi21 rasterises 4 triangles per clock, so how do you get that with 8 scan converters?
8 scan converters don't match 8 raster units, as there are 4 on the diagram, so 4 scan converters. Since there are 4 SEs for Navi21,...
What about RTX cores - they're MIMD? Makes me wonder why AMD didn't go with something similar - patent issues?
Thanks for the info. The driver leak and block diagram can look conflicting. This upgraded Raster Unit sounds really nice, and more efficient at...
I'm rather confused here... For Navi21, are you saying there are 4 Rasterisers (Scan Converters), and 1 Rasteriser per Shader Engine? Navi21...
From the driver breakdown, please can someone clarify these entries:
num_sc_per_sh 1 1 1 1 1 1 1 1
num_packer_per_sc 2 2 2 2 4 4 4 4
Is SC per...
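As a sanity check on how those two entries multiply out, here is a small sketch. It assumes 4 Shader Engines with 2 Shader Arrays each, as mentioned elsewhere in the thread for Navi21; which column of the driver table maps to which chip is not stated, so treat the pairing as a guess. The function name is hypothetical.

```python
def count_sc_and_packers(shader_engines: int, arrays_per_engine: int,
                         sc_per_sh: int, packers_per_sc: int) -> tuple[int, int]:
    """Return (total scan converters, total packers) for a given layout."""
    scan_converters = shader_engines * arrays_per_engine * sc_per_sh
    return scan_converters, scan_converters * packers_per_sc


# num_sc_per_sh is 1 in every column; num_packer_per_sc is either 2 or 4.
for packers_per_sc in (2, 4):
    scs, packers = count_sc_and_packers(4, 2, 1, packers_per_sc)  # assumed 4 SE x 2 SA
    print(f"packers/SC={packers_per_sc}: {scs} scan converters, {packers} packers")
```

Under those assumptions, the 4-packer columns give 8 scan converters and 32 packers, and the 2-packer columns give 8 and 16.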
There's nothing surprising in these results. Most of the needed information is out there in the public domain, and most didn't analyse it enough,...
This hardware IO is easier to implement in a fixed console design. But for PCs, which hardware block is going to pay for the transistor budget?...
What are the chances that the hardware I/O blocks get implemented into future SSDs, and GPU drivers leverage this?