Upcoming ATI Radeon GPUs (45/40nm)

Highly doubtful. R8xx will be out next year. It would be miraculous if DX11 made an appearance prior to 2010.

It will make an appearance next week; I don't know when it will be released, though. R800 has always been slated for DX11, and the next NV part (G300 or whatever) will probably be DX11 too; the rumours are already heading in that direction.
 
When the ALU/Tex ratios stay the same how is there a "correction" there?

The issue here is not the ratio, but the raw capacity of those aspects.

Exactly.

It will make an appearance next week; I don't know when it will be released, though. R800 has always been slated for DX11, and the next NV part (G300 or whatever) will probably be DX11 too; the rumours are already heading in that direction.

I'm not saying you're wrong, but this is the first I've heard of that. Do you have any links to back this up?
 
So, by that logic, R420 was a correction on R300?!

You are saying that prior generations must have been "bad" because they don't offer as much raw performance!
 
So, by that logic, R420 was a correction on R300?!

You are saying that prior generations must have been "bad" because they don't offer as much raw performance!

I think you're oversimplifying this, Wavey. R420 did address R300's relative lack of shading compute power, so in a sense it was a "correction". A better example of a correction would be R520->R580, which addressed the same lack of shading power. Given that ATi has addressed a shortcoming in shading compute power twice, I can understand why they shifted to a high ALU:TEX philosophy. Unfortunately, having been burned twice, they over-corrected, which caused the relative lack of texturing ability in R6xx. R7xx is a correction of the failed R6xx design philosophy. I know you can't admit this as a representative of ATi, but there's no use denying it either.

RV770 is a fantastic GPU, no matter how you slice it.
 
RV770 has changes that could be construed as a significant rethinking of some of R600's big design decisions, such as the abandonment of the ring bus, changing how the TMUs relate to the ALUs, and a reworking of the cache hierarchy.

I wish there were an interview that covered some of the reasons for those changes.
 
No, that's not an oversimplification, that's exactly what you are saying.

This isn't a correction in design philosophy (wrt Tex/ALU ratios), because, like R300->R420, it is using the same philosophy; it is just taking advantage of new processes and a whole crapload of engineering work to optimize the design in order to fit more units in. That's not a correction, that's an extension.
 
ShaidarHaran: I can't accept your opinion. RV770 boosted texturing and arithmetic power equally. You can't say that RV770 addressed a lack of texturing power, because RV770 boosted the number of ALUs in the same way, and if there was one thing R600 was not lacking, it was pure arithmetic rate.

In relative terms, RV770 is weaker in texturing than R600, because the ALU:TEX ratio stayed the same and the TFUs were cut down.
 
Hold on now, one generation of products isn't enough to pronounce a trend shift. I don't believe so, anyway. ATi has long held to the notion that compute power should increase with successive generations relative to texture filtering/sampling abilities. Why change now?
They seemed pretty explicit about holding to 4:1 in their explanations of the architecture. Don't forget that with the abandonment of the ring-bus the SIMD<->TU relationship has changed.

It's worth pondering whether it's wise (or reasonably feasible) to go beyond 16-wide SIMDs. If not then the only scaling options are to introduce an ALU-specific clock and/or do limited sharing of TUs by SIMDs (e.g. 2 SIMDs share a TU). Of course these options are very much like NVidia's design.

I believe R7xx is a "correction" to the mistake that was R6xx and its horrible lack of texturing, z-fill, and AA sample rates. I'm sure you'd agree with me on this. Now that these mistakes have been corrected, there's no need to do so again. Thus, ATi can return to their preferred design philosophy with the R8xx generation of products if they are in a position to do so (and with the shrink to 40nm I can see no reason why they wouldn't).
ATI's primary mistake was the TEX:BW and Z:BW ratios. If you play with the configurations available (and try not to have a lot of SIMDs) then ATI had very little choice in the count of ALUs for each of the GPUs in the R6xx range.

Overall, though, I agree, with enough TEX and Z for the available bandwidth, lots of ALUs can do no harm. We still don't know how much bandwidth is going "spare" in HD4870 in current (or near term) games - i.e. we don't know how much extra TEX and Z would be welcome before swamping bandwidth. Of course GDDR5 is looking like it's going to ramp up to 5GHz or so over the next year.
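For a rough sense of what a GDDR5 ramp to 5GHz would mean, peak bandwidth is just bus width times per-pin data rate. The sketch below assumes HD4870's commonly cited 256-bit bus and 3.6Gbps effective data rate, neither of which is stated in this thread, and `peak_bandwidth_gb_s` is a name made up for illustration. It says nothing about how much of that bandwidth games actually use.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


# Assumed figures (not from the thread): 256-bit bus, 3.6 Gbps effective GDDR5.
hd4870_launch = peak_bandwidth_gb_s(256, 3.6)
# Same assumed bus width with GDDR5 at the 5 Gbps mentioned above.
hd4870_at_5 = peak_bandwidth_gb_s(256, 5.0)

print(f"launch: {hd4870_launch:.1f} GB/s")
print(f"at 5 Gbps: {hd4870_at_5:.1f} GB/s")
print(f"headroom: {hd4870_at_5 / hd4870_launch:.2f}x")
```

Under those assumptions the ramp is worth a bit under 1.4x, which gives a ceiling on how much extra TEX and Z could be fed without widening the bus.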

The other side of the coin is that RVx70 GPUs should presumably stay fairly small - otherwise they get too costly. What sort of upper limit are we likely to see, given how big RV570 is?

Jawed
 
RV770 has changes that could be construed as a significant rethinking of some of R600's big design decisions, such as the abandonment of the ring bus, changing how the TMUs relate to the ALUs, and a reworking of the cache hierarchy.

The TMUs' relation to the ALUs was changed so that we can keep the same ALU/Tex ratio and batch sizes while scaling up the entire texture and shader engine. With R600's design, increasing ALU capacity would mean either adding more SIMDs, which would retain the same number of texture units (thus biasing the ratio further towards ALUs), or adding more ALUs per SIMD (and similarly adding more texture units to the texture array), which would result in larger batch sizes that carry other penalties.

R600 was OK in this respect for its generation because it did allow configurable, 2D scaling of the number of units for the rest of the family, but these configurations all reduced the number of units rather than increasing it. To increase the units, the relationship between texture engines and SIMDs had to be changed to allow the same ratios and the same batch sizes.
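The trade-off described above can be put into a toy model. This is only a sketch: the R600-like starting point (4 SIMDs of 80 SPs sharing 16 texture units) and the 10-SIMD RV770-style layout are assumed from public specs rather than stated in this post, the `Engine` class and helper names are hypothetical, and "batch size" is reduced to a crude proxy that grows in proportion to SIMD width.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    simds: int         # number of SIMD arrays
    sps_per_simd: int  # stream processors per SIMD
    tex_units: int     # total texture units

    @property
    def alu_tex(self) -> float:
        """Stream processors per texture unit."""
        return self.simds * self.sps_per_simd / self.tex_units


# Assumed R600-like starting point: 4 SIMDs x 80 SPs sharing 16 texture units.
BASE = Engine(simds=4, sps_per_simd=80, tex_units=16)


def add_simds(e: Engine, factor: int) -> Engine:
    """More SIMDs, same shared texture array: ratio drifts towards ALUs."""
    return Engine(e.simds * factor, e.sps_per_simd, e.tex_units)


def widen_simds(e: Engine, factor: int) -> Engine:
    """Wider SIMDs and a wider texture array: ratio holds, batches grow."""
    return Engine(e.simds, e.sps_per_simd * factor, e.tex_units * factor)


def tex_per_simd(e: Engine, simds: int, tex_each: int) -> Engine:
    """RV770-style: texture units attached to each SIMD, so both scale together."""
    return Engine(simds, e.sps_per_simd, simds * tex_each)


def relative_batch(e: Engine, base: Engine = BASE) -> float:
    """Toy proxy: batch size taken as proportional to SIMD width."""
    return e.sps_per_simd / base.sps_per_simd


for name, e in [
    ("R600-like base", BASE),
    ("2x SIMDs", add_simds(BASE, 2)),
    ("2x SIMD width", widen_simds(BASE, 2)),
    ("10 SIMDs, 4 tex each", tex_per_simd(BASE, 10, 4)),
]:
    print(f"{name:22s} ALU:TEX {e.alu_tex:.0f}:1  batch x{relative_batch(e):.1f}")
```

In this model only the last layout grows the engine while holding both the 20:1 ratio and the batch proxy at their starting values, which is the point being made about why the relationship had to change.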

Caching hierarchy and memory went hand in hand with one another. We put a lot of work into texture cache modeling and changed to a fully tiled memory system that, for the primary task of 3D operation, alleviated the need for the memory channels to be passing data between one another.
 
No, that's not an oversimplification, that's exactly what you are saying.

This isn't a correction in design philosophy (wrt Tex/ALU ratios), because, like R300->R420, it is using the same philosophy; it is just taking advantage of new processes and a whole crapload of engineering work to optimize the design in order to fit more units in. That's not a correction, that's an extension.

Dave, you can't bs a bs'er ;) Every enthusiast and their grandma knows ATi was severely short on texturing power in the R6xx generation. RV770 has addressed this. Am I wrong? Why then did your engineers increase texture filtering/sampling performance by 250% this generation?
 
ShaidarHaran: I can't accept your opinion. RV770 boosted texturing and arithmetic power equally. You can't say that RV770 addressed a lack of texturing power, because RV770 boosted the number of ALUs in the same way, and if there was one thing R600 was not lacking, it was pure arithmetic rate.

In relative terms, RV770 is weaker in texturing than R600, because the ALU:TEX ratio stayed the same and the TFUs were cut down.

Again, this has nothing to do with the TEX:ALU ratio itself being out-of-balance. Base texturing capability needed to be increased over the previous generation because it was deficient given the existing shading capability. There's no denying this, benchmarks prove my point.
 
So, was RV670's SP lopsidedness ever an advantage? And will RV770's similar trait ever be an advantage? Seems that the texturing rate is what makes it faster for the most part. Are any games pushing the ratio more towards shader effects vs. texturing finally?

I always found some of the synthetic results strange for RV670, too. Such as:
http://www.digit-life.com/articles3/video/rv670-part2-page1.html
[Image: d3drmgps4mf8.png]

while

[Image: image1ul0.jpg]

It was amazingly quick for vertex processing, but had serious problems with some heavy duty pixel shaders. It has led me to believe that those 320 SPs weren't all that amazing compared to NV's 128 SPs. Aside from the differences in each company's way of counting (ATI actually having 64 SPs when you don't just count ALUs.)
 
Why then did your engineers increase texture filtering/sampling performance by 250% this generation?
Because we wanted to scale the overall engine up and did so by increasing shader power by 250% as well.

A change in direction / "correction" would be increasing textures by 250% (or greater) but not scaling shaders by a similar ratio; this hasn't happened here. It would have been very easy to scale the engine differently (i.e. 60 SPs and 4 textures per SIMD, as opposed to 80 SPs and 4 textures), and there could have been more of these SIMDs for the same area, but this hasn't happened.

What has happened is that the entire texture and shader engine has scaled up in performance, with equal ratios to the previous generation, much like previous generations have done before.
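The arithmetic behind that can be checked in a few lines. RV670's 320 SPs and the per-SIMD figures (80 SPs and 4 textures, or the hypothetical 60 and 4) come from this thread; the 16 texture units for RV670 and the 10 SIMDs for RV770 are assumed from public specs, so treat this as a sketch rather than anything official.

```python
# SP count for RV670 and the per-SIMD figures are quoted in the thread;
# the 16 texture units and 10 SIMDs are assumptions from public specs.
rv670_sps, rv670_tex = 320, 16
simds = 10
rv770_sps, rv770_tex = simds * 80, simds * 4

alu_scale = rv770_sps / rv670_sps   # shader scaling between generations
tex_scale = rv770_tex / rv670_tex   # texture scaling between generations
ratio_old = rv670_sps / rv670_tex   # ALU:TEX before
ratio_new = rv770_sps / rv770_tex   # ALU:TEX after
ratio_alt = 60 / 4                  # the hypothetical 60 SPs + 4 textures per SIMD

print(f"shader scale {alu_scale:.1f}x, texture scale {tex_scale:.1f}x")
print(f"ALU:TEX {ratio_old:.0f}:1 -> {ratio_new:.0f}:1 (alternative: {ratio_alt:.0f}:1)")
```

With these numbers both sides scale by the same 2.5x and the ratio stays at 20:1, whereas the 60-SP alternative would have moved it to 15:1, i.e. an actual shift towards texturing.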
 