AMD: R9xx Speculation

Of course - I don't know much about that stuff though, so the next question is, is there a fixed 0, or is it determined by the hardware?

With Nvidia you can always activate clamp, meaning fix the LOD at 0. Such a switch would be great in the Catalyst drivers; then we could finally see if this has something to do with the LOD or if it is some kind of hardware limitation/bug.
 
Agreed :smile:

Now I really hope AMD uses this increased texturing/filtering power 'headroom' to disable all those 'optimizations' on the High Quality setting. I'd really like to see the 6900's HQ AF match Nvidia's HQ AF.
Don't forget though that if the VLIW-4 simds are nearly as fast as the VLIW-5 ones, the tex/alu ratio doesn't really increase (while it certainly increases in theory). Though that would also mean "practical" alu throughput increases by 50%, hence bottlenecks could shift a bit away from alu/tmu entirely.
 
Uh...I don't see how LOD can have a preset zero point. Technically the only preset is the absolute resolution of the texture/object itself. Zero only exists to tell the user if they're asking for more or less.
If you can tweak the LOD of one to look just like the other it merely states they've chosen different, arbitrary zero points.
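
To make the "arbitrary zero point" idea concrete, here is a rough sketch of the textbook mip selection (log2 of the texel footprint, plus a bias, clamped to the mip chain). The function name, the footprint values and the per-vendor offsets are all made up for illustration; this is not how either driver actually computes LOD. The `clamp_negative_bias` flag is my assumption of what a "clamp" switch would do.

```python
import math

def mip_lod(texels_per_pixel, bias=0.0, min_lod=0.0, max_lod=10.0,
            clamp_negative_bias=False):
    """Textbook mip selection: log2(footprint) + bias, clamped to the chain.

    clamp_negative_bias models a driver "clamp" switch (assumption):
    any negative (sharpening) bias is forced back to 0.
    """
    if clamp_negative_bias:
        bias = max(bias, 0.0)
    lod = math.log2(texels_per_pixel) + bias
    return max(min_lod, min(max_lod, lod))

# Hypothetical defaults: vendor A sits at 0.0, vendor B at -0.5 (sharper).
vendor_a = mip_lod(4.0, bias=0.0)
vendor_b = mip_lod(4.0, bias=-0.5)

# A user-side bias of +0.5 on vendor B lands on the same mip level as A,
# which is all "different zero points" would mean.
vendor_b_matched = mip_lod(4.0, bias=-0.5 + 0.5)
```

If matching the bias this way also removed the shimmering, the difference would just be the chosen zero point; if it didn't, something else in the filtering is going on.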
 
Which is why we need a texture sharpness comparison between AMD and Nvidia now, too, to determine if the shimmering is indeed caused by sharper textures on AMD and could be fixed by adjusting the LOD to equal Nvidia's.
 
So you're saying it is the maximum the hardware is capable of?

Consider the difference between what the API specification requires and what a perfect implementation would deliver. Additionally, does the perfect implementation offer the quality you seek, or does even the perfect variant fall short of what you want?
 
Which is why we need a texture sharpness comparison between AMD and Nvidia now, too, to determine if the shimmering is indeed caused by sharper textures on AMD and could be fixed by adjusting the LOD to equal Nvidia's.

agreed.

If my gaming rig weren't essentially in storage (house guests) I'd swap my 8800 for my 5870 and take some screenshots.
 
Yeah but the current advantage in synthetics is far more than that. I think Tessmark is almost an order of magnitude. Of course, that's just synthetics.

I'd expect those to look quite a bit better (at least in certain cases) on Cayman than the touted 2x increase would suggest. Of course, I may very well be wrong.
 
The specs are nice to know, although we don't know all of them. However, it's hard to gauge how they translate into gaming performance for the 6970 vs. the:
- 5870
- 6870
 
Perhaps I'm reading too far into things, but the slides seem to show something else interesting as well. I imagine others thought this was the case before-hand as well, and also picked up on it, but I'll point it out if needed for Charlie-like reference later.

While the architecture diagram is misdirection, showing Cayman's setup with Evergreen's structure (à la the fauxdozer die shot giving an overview using a 'shopped mix of past dies), and Cayman is likely 2x15 SIMDs, I find the prominent 'inclusion' of the 'missing' SIMD interesting. It's as if they had to show, even with this mockup, "Yes, this was supposed to have 32 SIMDs on 32nm". The Barts diagram didn't show a visible hole in the SIMD structure insinuating 1280sp, although I wouldn't know if that's a difference between the primer press event slides and the final review kit. At any rate, the insinuation would be that the mid-range marvel on 28nm will use the complete structure and an ~975MHz core clock. Time for an R1000 (Southern Islands) speculation thread! I want to know how people think a 32 ROP/256-bit GF100 would stack up against an awfully similar 2048sp part.

AMD_L06.jpg


*Cue someone saying "It's too early..."
 
Although I think it's way too early for such a discussion, given the awfully high level of secrecy from AMD, I think that 20** SPs is too low an expectation for the next-gen top GPU.
 
I don't understand what you're saying. They're displaying vertical dots because they don't want to show exactly how many SIMDs are featured in Cayman.
 
The only thing that is a bit slow is 32-bit integer multiplication (previously done only in the t unit, now done by a combination of all 4 slots).
I guess I meant 'no fullspeed 32bit INT ops'.
The slide says:
4* 24bit MUL, ADD or MAD
2* 32bit ADD
1* 32bit MUL

I was hoping they would be doing fullrate 32bit rather than ganging the SPs.
If 32bit INT isn't used all that much this should be OK though.
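
For a feel of what those rates mean, here's a back-of-the-envelope sketch that just divides op counts by the per-clock rates quoted from the slide. Everything else is my simplification: it treats one VLIW4 unit in isolation, ignores packing, dependencies and anything the compiler does, and the op names and the example mix are invented.

```python
from fractions import Fraction

# Ops per clock for one VLIW4 unit, taken from the slide figures above.
# The dictionary keys are just labels chosen for this sketch.
ISSUE_RATE = {
    "mul24": 4, "add24": 4, "mad24": 4,  # 4x 24-bit MUL, ADD or MAD
    "add32": 2,                          # 2x 32-bit ADD
    "mul32": 1,                          # 1x 32-bit MUL (all 4 slots ganged)
}

def unit_clocks(op_counts):
    """Crude lower bound: clocks needed if each op type issues at full rate."""
    return sum(Fraction(count, ISSUE_RATE[op]) for op, count in op_counts.items())

# Invented mix: 8 24-bit MADs, 4 32-bit ADDs, 2 32-bit MULs.
mixed = unit_clocks({"mad24": 8, "add32": 4, "mul32": 2})

# The same 14 ops, if they could all be done as 24-bit ops.
all_24bit = unit_clocks({"mad24": 14})
```

With this mix the two 32-bit MULs cost as many clocks as the eight 24-bit MADs, so how much it hurts really does come down to how often 32-bit INT shows up.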
 
I know in compute, 32bit int is used often for indices into data structures.
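
A small sketch of why indices care about this: a 24-bit multiply only looks at the low 24 bits of each operand, so a byte offset computed from a large element index goes wrong, while the full 32-bit multiply does not. The helpers below are plain-Python emulations with unsigned masking (my assumption; real hardware or an API's mul24 may define out-of-range behaviour differently), and the index and stride are arbitrary.

```python
MASK24 = (1 << 24) - 1
MASK32 = (1 << 32) - 1

def mul24(a, b):
    """Emulated 24-bit multiply: only the low 24 bits of each operand count."""
    return ((a & MASK24) * (b & MASK24)) & MASK32

def mul32(a, b):
    """Emulated 32-bit multiply: full operands, result wrapped to 32 bits."""
    return (a * b) & MASK32

stride = 4  # bytes per element, e.g. a float

# Small index: both multiplies agree.
small_ok = mul24(1000, stride) == mul32(1000, stride)

# Index above 2**24: the 24-bit multiply silently drops the high bits.
big_index = 20_000_000
offset_24 = mul24(big_index, stride)
offset_32 = mul32(big_index, stride)
```

So the fast 24-bit path is fine for buffers under about 16.7M elements, and anything larger has to pay for the 32-bit MUL.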
 