Upcoming ATI Radeon GPUs (45/40nm)

It was amazingly quick for vertex processing, but had serious problems with some heavy-duty pixel shaders. It has led me to believe that those 320 SPs weren't all that amazing compared to NV's 128 SPs, even setting aside the differences in each company's way of counting (ATI actually has 64 SPs when you don't just count ALUs).
From the article you linked:
The first pixel shader test will be the Fur test. When used with the lowest settings, it uses 15-30 texture lookups from bump maps and two lookups from the main texture. The High Effect Detail mode increases the number of lookups to 40-80. When shader supersampling is enabled - the number of lookups grows to 60-120. And the High mode with SSAA is the heaviest mode - 160-320 lookups from a bump map.
Looks pretty heavily texture bound and the results agree.
 
From the article you linked:

Looks pretty heavily texture bound and the results agree.
Well, what happens then in this "PS4 Fire" test? The results here are the most extreme. Digit Life suggests it is a driver bug.
http://www.digit-life.com/articles3/video/rv670-part2-page1.html
The second shader test is called Fire, it's even harder for ALUs. It contains only a single texture lookup, while the number of sin/cos instructions is doubled to 130.
image1zw6.jpg


It's also interesting how well ATI GPUs perform for vertex processing. Must be that poor NV triangle setup rate showing up, huh? Regardless, RV670's "320 SPs" don't really show huge advantages over NV's "128 SPs". The apparent huge bottleneck they had from texture lookups was embarrassing, and results like this Fire test are just plain strange.

They haven't run this suite of synthetic tests on RV770 yet.
 
R420 did address R300's relative lack of shading compute power
Lack relative to what exactly? :oops:

R300 (8 PS2.0 pipes at 1:1 ALU:TEX) was up against NV25 (4 PS1.0 pipes at 1:2) and NV30/5 (8 PS1.1/4 PS2.0 pipes at 1:1?).
It was the shader compute beast of its time, with no peer.

-R420 took the same ratio & doubled output.
-R520 took the same ratio & improved efficiency + added SM3.0.
-R580 tripled ALU:TEX with everything else much the same as R520.
-R600 bumped up ALU:TEX & unified VS/PS + added DX10.
-RV670 took the same ratio & shrunk + added DX10.1.
-RV770 took the same ratio with 250% output + efficiency improvements.
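
To make the progression above easier to follow, here is a minimal Python sketch that tabulates ALU:TEX per chip. The unit counts are assumptions inferred from this thread (8 pipes for R300, "doubled", "tripled ALU:TEX", 64 SPs at 4:1 for R600, 250% for RV770), not figures taken from spec sheets, and `LINEAGE` and `alu_tex_ratio` are names made up for the illustration.

```python
from fractions import Fraction

# (chip, ALU units, TEX units). Counts are inferred from the posts in this
# thread and are illustrative only; R6xx/RV7xx ALUs are counted as 5-wide units.
LINEAGE = [
    ("R300", 8, 8),
    ("R420", 16, 16),
    ("R520", 16, 16),
    ("R580", 48, 16),
    ("R600", 64, 16),
    ("RV670", 64, 16),
    ("RV770", 160, 40),
]


def alu_tex_ratio(alu, tex):
    """Return the ALU:TEX unit ratio as an exact fraction."""
    return Fraction(alu, tex)


for chip, alu, tex in LINEAGE:
    r = alu_tex_ratio(alu, tex)
    print(f"{chip:6s} {alu:4d} ALU : {tex:3d} TEX  ->  {r.numerator}:{r.denominator}")
```

With these assumed counts, RV770 lands on the same 4:1 as R600 and RV670, which is the "same ratio with 250% output" point in the list.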

Arguably those RV770 efficiency improvements to the TEX have changed the ratio so that it is effectively lower than R6x0?

Certainly I think R600 was over-ambitious going for 4:1. It fit the ring-bus architecture nicely though, and I think that's why they went for it.
In GT200, NV finally moved up to 3:1 ALU:TEX having sat at 2:1 since NV40/G70.

I expect RV870 to be a rather smaller boost over RV770, keeping the 4:1 ratio & at most doubling output.
Die size will be as much smaller than RV770 as they can manage with a 256-bit bus.
(192-bit with 750MB of very high speed GDDR5?)

RV770 over-achieved in unit shrinking to the extent that they had to add 1/5th the final FLOPS/TEX just to fill space.
It came at the expense of a bigger die size/transistor count & seemingly power consumption, so I think ATI will want to bring those down a bit, relatively speaking, in the next generation.
 
It came at the expense of a bigger die size/transistor count & seemingly power consumption, so I think ATI will want to bring those down a bit, relatively speaking, in the next generation.
Well, hopefully, PowerPlay is just disabled. They touted improvements to it and one would hope that means it isn't as bad as it looks right now. ;)
 

I don't know myself, SH's argument seems silly. "But Wavey AMD increased their texture power by 250%!!!!111". Yeah, they increased their shading power by 250% too. Is SH suggesting the R6x0 was "severely short" on shading power?
 
I don't know myself, SH's argument seems silly. "But Wavey AMD increased their texture power by 250%!!!!111". Yeah, they increased their shading power by 250% too. Is SH suggesting the R6x0 was "severely short" on shading power?
DUH!
R600 was severely short on texturing capability, so short that it became the bottleneck, so the relatively good shader speed went unobserved.
By increasing TEX power the bottleneck was removed, so now shading power can be better utilized.
And they fixed z-speed & AA/AF.
That's what SH is implying, IMHO.
Anyone claiming R600 has "enough" TEX power should take a look at ATi's market and mind share.
 
Well, what happens then in this "PS4 Fire" test? The results here are the most extreme. Digit Life suggests it is a driver bug.

The apparent huge bottleneck they had from texture lookups was embarrassing, and results like this Fire test are just plain strange.
No. It was a driver bug and it was resolved in Catalyst 7.12:

On a quick note for RV670 performance update:

D3DRightMark 2.0 Fire PS4.0 Test

7.11 & older -- 3.2 fps;
7.12 -- 186 fps;

:D

Using this driver, the HD3870 ended up faster than the GF8800GT in this particular test.
 
For a while now, I've been quite convinced that the most important bottleneck of the RV670 (in gaming, at least) was not texturing but the utilisation of the ALUs, with shader code being too "linear" for good ILP extraction. Dave, could you comment on this one?
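
To illustrate the "too linear" point, here is a toy Python sketch of packing scalar ops into 5-wide bundles. It is a simplified model under stated assumptions, not how ATI's shader compiler works: `pack_vliw` and `utilisation` are hypothetical helpers, ops are placed greedily in program order, and every op is assumed to take one cycle with no slot restrictions.

```python
def pack_vliw(deps, width=5):
    """Greedily pack ops into bundles of `width` slots.

    deps[i] is the set of earlier op indices whose results op i reads.
    An op must sit in a later bundle than every op it depends on.
    """
    bundle_of = {}   # op index -> bundle index
    bundles = []     # list of bundles, each a list of op indices
    for op, reads in enumerate(deps):
        # Earliest legal bundle: one past the latest producer.
        b = max((bundle_of[r] + 1 for r in reads), default=0)
        # Skip bundles that are already full.
        while b < len(bundles) and len(bundles[b]) >= width:
            b += 1
        while len(bundles) <= b:
            bundles.append([])
        bundles[b].append(op)
        bundle_of[op] = b
    return bundles


def utilisation(deps, width=5):
    """Fraction of ALU slots that hold an op across all issued bundles."""
    bundles = pack_vliw(deps, width)
    return len(deps) / (len(bundles) * width)


# Ten ops where each reads the previous result (fully "linear" code).
chain = [set()] + [{i - 1} for i in range(1, 10)]
# Ten ops with no dependencies between them.
indep = [set() for _ in range(10)]

print("linear chain:", utilisation(chain))
print("independent: ", utilisation(indep))
```

In this toy model the dependent chain fills one slot in five per bundle while the independent ops fill every slot, which is the gap between peak and achieved ALU throughput being described.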
 
DUH!
R600 was severely short on texturing capability, so short that it became the bottleneck, so the relatively good shader speed went unobserved.
By increasing TEX power the bottleneck was removed, so now shading power can be better utilized.
And they fixed z-speed & AA/AF.
That's what SH is implying, IMHO.
Anyone claiming R600 has "enough" TEX power should take a look at ATi's market and mind share.

Is SH suggesting the R6x0 was "severely short" on shading power?

Regardless of how underpowered one aspect of it was, if everything is increased in a 1:1 fashion, nothing is actually being "fixed". Texturing power is now much more robust, but it was not singled out as the thing to fix, because both it and the shading power have increased at the same rate. I do not understand how that can so easily be construed as fixing.

It would be like saying that the R700 is somehow fixing something in the RV770 because it now has 100% more of both shading and texturing elements.
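
The scaling argument can be sketched with a very simple bottleneck model in Python. This is an assumption-laden illustration, not a measurement: `frame_time` is a hypothetical helper, the work and rate numbers are arbitrary, and real GPUs overlap ALU and TEX work far less cleanly than a plain `max`.

```python
def frame_time(alu_work, tex_work, alu_rate, tex_rate):
    """Return (time, limiting unit) when the slower of ALU and TEX sets the pace."""
    alu_t = alu_work / alu_rate
    tex_t = tex_work / tex_rate
    return max(alu_t, tex_t), ("ALU" if alu_t >= tex_t else "TEX")


# Arbitrary workload that is texture-bound on the baseline part.
ALU_WORK, TEX_WORK = 400, 200

base = frame_time(ALU_WORK, TEX_WORK, alu_rate=4.0, tex_rate=1.0)
both = frame_time(ALU_WORK, TEX_WORK, alu_rate=10.0, tex_rate=2.5)     # both x2.5
tex_only = frame_time(ALU_WORK, TEX_WORK, alu_rate=4.0, tex_rate=2.5)  # TEX x2.5 only

for name, (t, limit) in [("baseline", base), ("both x2.5", both), ("TEX x2.5", tex_only)]:
    print(f"{name:10s} time={t:6.1f}  bound by {limit}  speedup={base[0] / t:.2f}x")
```

With these made-up numbers, scaling both rates by 2.5 makes the workload 2.5x faster but leaves it texture-bound, whereas scaling only TEX moves the limit to the ALUs. Under this model, a uniform increase raises throughput without changing which unit is the bottleneck.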
 