AMD: R7xx Speculation

Status
Not open for further replies.
Not every shader needs texture data

True, but the industry is shifting towards more programmability certainly, and the relatively recent introduction of VTF/R2VB indicates (to me at least) the growing importance of the ability to access texture data in the shader units.

The trend in the industry as of late has been an increase in the amount of math needed per pixel, hence ATI's rather aggressive ALU:Tex ratio.

I disagree, at least somewhat. I know ATi has been saying this for years now, and it certainly paid off in the R5xx generation with the transition from the wholly ALU-bound R520 to the mostly fillrate-bound R580; however, the transition to R6xx and the resulting increase in shader processors (and theoretically shader performance) has not proved beneficial except in synthetic tests (3dmark 06 would be the primary culprit here).

Now, I'm not saying an increase in shader performance is not necessary with the introduction of each successive generation of hardware. What I am saying however is that ATi needs to rethink their ALU:TEX ratio targets and adjust texturing performance accordingly. They know this, and so does just about everyone that posts here :p Let's hope they have indeed reacted accordingly with RV770. The recently rumored specifications for RV770 that seem most plausible to me are 480 SPs and 32 TMUs. That would be a 50% increase in SPs and a 100% increase in TMUs over the previous-generation RV670/R600, which seems quite reasonable to me given ATi's currently over-ambitious ALU:TEX ratio.
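
For anyone who wants to check the percentages, here is a rough sketch of the arithmetic in Python. The 320 SP / 16 TMU baseline for RV670/R600 is an assumption on my part (it is what the +50% / +100% figures imply), and the RV770 numbers are only the rumour, not confirmed specs. The helper names are made up for illustration.

```python
# Rough ALU:TEX arithmetic for the rumoured RV770 specs.
# Assumption: RV670/R600 baseline of 320 SPs and 16 TMUs (implied by the
# +50% / +100% figures in the post); the RV770 numbers are rumour only.

def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0

def alu_tex_ratio(sps, tmus):
    """Stream processors per texture unit."""
    return sps / tmus

rv670 = {"sps": 320, "tmus": 16}
rv770_rumour = {"sps": 480, "tmus": 32}

sp_gain = pct_increase(rv670["sps"], rv770_rumour["sps"])
tmu_gain = pct_increase(rv670["tmus"], rv770_rumour["tmus"])
old_ratio = alu_tex_ratio(**rv670)
new_ratio = alu_tex_ratio(**rv770_rumour)

print(f"SPs +{sp_gain:.0f}%, TMUs +{tmu_gain:.0f}%")
print(f"SPs per TMU: {old_ratio:.0f} -> {new_ratio:.0f}")
```

So under those assumptions the SP-per-TMU figure would drop, which is the direction I'm arguing for. It says nothing about real-world performance, of course.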

Not every shader needs texture data
You can write shaders that don't need any texture data whatsoever, so in short an increase in ALU power does not necessitate an increase in texture filtering ability.

True again, but just because something can be done does not mean it should. Again I refer to VTF and R2VB, particularly the inclusion of both within DX spec.
 

Yeah I probably should have worded my post differently. I did not mean to suggest that ATI's decision to have a high ALU:Tex ratio was a good one, after all I think everyone agrees that R600 should have had more TMUs. I was just trying to address sylne's question.

Arun, could you elaborate here please? Are you saying as a result of this that enabling AF should cause less of a performance hit as resolution scales, or are you saying that end-users shouldn't be applying AF @ high resolution because it causes the GPU to do unnecessary work?

I was also a little bit confused by this post. Arun, could you explain in more detail? Trilinear/AF != bilinear, and I don't see how the texture filtering method is related to the screen resolution. I guess with higher resolutions you'll be filtering fewer texels per screen pixel... is this what you meant?
 
Napoleon @ Chiphell (who has made some good leaks in the past) says that the RV770 launch was moved from mid-May to early June so that it gets more attention at Computex, together with the GDDR5 XT version.

This only sort-of makes sense. I mean, what's the point in missing out on potential revenue if there's no problem preventing the launch of a product? I understand the reasoning behind this thought, but it doesn't quite give us the whole picture.

If the launch was indeed moved back 2-3 weeks, there has to be more to the story. AMD's shareholders wouldn't be too happy if they missed millions in revenue because someone thought it was a good idea to launch later than the product was ready, simply to capitalize on some "free" publicity.

Yeah I probably should have worded my post differently. I did not mean to suggest that ATI's decision to have a high ALU:Tex ratio was a good one, after all I think everyone agrees that R600 should have had more TMUs. I was just trying to address sylne's question.

Fair enough.

I was also a little bit confused by this post. Arun, could you explain in more detail? Trilinear/AF != bilinear, and I don't see how the texture filtering method is related to the screen resolution. I guess with higher resolutions you'll be filtering fewer texels per screen pixel... is this what you meant?

That sounds familiar, actually. I think I've heard others say as much here in the past.
 
True again, but just because something can be done does not mean it should. Again I refer to VTF and R2VB, particularly the inclusion of both within DX spec.
Bear in mind that these types of spec don't necessarily require filtered textures - in which case there are 2x the total texture units available on R6xx (i.e. there are 16 filtering units plus 16 point samplers on R600/RV670). Single-component textures can also be fed into the shader at 4x the filtered rate (Fetch4/Gather4).
 
I wasn't aware of this. Thanks for the info, Wavey. I don't suppose you could provide us any clue as to how much of an impact this has on performance? Best case scenario would obviously provide a doubling of effective texturing performance, but I wonder just how often this is beneficial in real-world scenarios.
 
Probably not very often given the performance numbers we're seeing!
 
I suspected as much, but it's still technically interesting :p

This has been known since the R600's release. The diagrams included in Richard Huddy's architecture presentation showed the extra 4 texture address processors per texture unit dedicated to non-filtered textures, and most reviews of the time mentioned it, as well as some more recent ones focused on the 3870 or the 3870x2.
 
Thanks all, the back and forth between you guys helped a lot with my question. And I learned new acronyms (can one ever get enough of those?)

I guess the answer I was looking for was given by Dave when he mentioned textures (in my case, a texture resulting from a shader) can be read using point-sampling units instead of filtering units.

To resume the thread, and regarding the Computex launch post, I'd venture that maybe the driver team can use a couple more weeks to polish the release driver? Free publicity + better perf/IQ might be good enough a reason if they know Nvidia is not going to steal the limelight. At the same time, it may also help them stockpile a bit, so that they don't experience an 8800GT introduction...


Edit: typo
 
I tend to agree with you too. To me, at this moment, it seems there are far more leaks on the NV side than AMD/ATi info about these new cards. Quite odd that the latest leaked mechanical drawing of GT200 made its way to the net, but nothing about the real piece of hardware :cool:

The battle this round should be rather interesting, even if RV770 may not be as fast as GT200. Hope we will hear or see something very soon, since this is already the 2nd week of May :devilish:
 
Why the sudden change to GDDR3 instead of GDDR4 (RV770 Pro)? Cost-cutting measure? Just doesn't make any sense to me. Latency? What else could be the reason?
 
=>Sunrise: Sudden? There've been rumours about the RV770 using GDDR3 (Pro) and GDDR5 (XT) for some time now. And yes, GDDR4 is slower than GDDR3 at the same frequency, and since 2,5GHz GDDR3 are available...
 
...and since 2,5GHz GDDR3 are available...
The fastest parts on sale from Samsung that I could find were specced at a latency of 0.8ns, which then results in 1100MHz (2200MHz). Can you link me to those parts that run at >=1200MHz (2400MHz)? Also, everything that is specced even higher should also be a lot more expensive.

I'm asking all of this because some things that normally would make sense do sound very fishy to me.

If the supposed RV770 Pro really has GDDR3 with a 256bit interface (512bit internal) and we take 1100MHz (2200MHz) GDDR3 for example, we get a memory bandwidth of 70.4GB/s.

If the supposed RV770 XT has GDDR5, the same interface width as the RV770 Pro, and "only" comes with relatively conservative 2000MHz (4000MHz) GDDR5, this would result in a whopping 128GB/s.
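
The two figures above can be reproduced with a quick Python sketch: peak bandwidth is just the bus width in bytes times the effective data rate. The clocks are the hypothetical ones used in this post, not confirmed specs, and `bandwidth_gb_s` is only an illustrative helper name.

```python
# Peak memory bandwidth = bus width in bytes * effective data rate.
# Clocks below are the hypothetical ones from the post, not confirmed specs.

def bandwidth_gb_s(bus_width_bits, effective_mhz):
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_mhz * 1e6 / 1e9

pro_gddr3 = bandwidth_gb_s(256, 2200)  # 1100MHz GDDR3, 2200MHz effective
xt_gddr5 = bandwidth_gb_s(256, 4000)   # 2000MHz GDDR5, 4000MHz effective

print(f"Pro (GDDR3): {pro_gddr3:.1f} GB/s")
print(f"XT  (GDDR5): {xt_gddr5:.1f} GB/s")
```

These are theoretical peaks only; what the chip can actually make use of is a different matter.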

Now, why the big difference? The answer may be pretty simple, e.g.:

a. the Pro is very bandwidth constrained or
b. the XT has excessive bandwidth that isn't really needed or
c. GDDR5 just runs cooler, so they have more headroom to clock the ASIC higher and still meet their TDP.

What's your take on that? Am I reading too much into it?

PS: This question (starting with the second paragraph) is not intended solely for Lukfi.
 
Nordic Hardware mentions the 11th of June as a possible launch date for the RV770... I'm hearing either the 16th or 18th for the Pro and one week later for the XT (with GDDR5).
 
There are certain rendering operations that consume much more bandwidth than others. ~70GB/s would be fine for normal rendering but would start lagging when AA/AF gets cranked up.
 