AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

  • Within 1 or 2 weeks: 1 vote (0.6%)
  • Within a month: 5 votes (3.2%)
  • Within a couple of months: 28 votes (18.1%)
  • Very late this year: 52 votes (33.5%)
  • Not until next year: 69 votes (44.5%)

  Total voters: 155
  Poll closed.
When triangles were men, sprites were men, and pixel shaders had only just grown out of wearing register-combiner shorts and were still learning to tie a tie in less than 64 ALU and 32 TEX, did anyone care about small triangles and getting more than one of them into a thread?

Since those days, has ATI's basic architecture changed?

It seems to me the rasteriser is a one-triangle-at-a-time deal - it produces 16 fragments per cycle, as either 8x2 or 2x8. What chance that it can do that for two triangles simultaneously, in order to work at the proposed quad-level granularity?

The interpolators (R300->RV740) produce per-fragment vec4 attributes at some multiple of rasterisation rate (1x or 2x). What chance that the SPI unit can handle 2 triangles' barycentrics per clock?

It's logically possible, and even desirable. It just seems really unlikely to me, because the hardware's heritage isn't small triangles. And the really costly bits of pixel shading, the texturing and the back end, are both killed at the quad level anyway.

R700 ISA - 3.6.1 Valid and Active Masks said:
The valid mask is set for any pixel that is covered by the original primitive and has not been killed by an ALU KILL operation.
I don't know if it's reasonable to read that singular, "original primitive", as definitive, but it seems like a fair hint...
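To make the granularity point concrete, here's a toy model - my own sketch, not anything from the hardware docs; the footprint walk and the barycentric hand-off are illustrative assumptions - of why a one-triangle-per-cycle rasteriser wastes lanes on small triangles:

Code:
# Toy one-triangle-at-a-time rasteriser emitting 8x2 pixel footprints.
# Each "cycle" carries up to 16 fragments from the SAME triangle, so a
# sub-pixel triangle still burns a whole 16-wide slot.

def edge(a, b, p):
    """Signed edge function; >= 0 means p is on the interior side for this winding."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterise(tri, width=16, height=4):
    v0, v1, v2 = tri
    area = edge(v0, v1, v2)            # 2x triangle area, used for barycentrics
    cycles = []
    for ty in range(0, height, 2):     # walk the screen in 8x2 footprints
        for tx in range(0, width, 8):
            frags = []
            for dy in range(2):
                for dx in range(8):
                    p = (tx + dx + 0.5, ty + dy + 0.5)   # pixel centre
                    e0, e1, e2 = edge(v0, v1, p), edge(v1, v2, p), edge(v2, v0, p)
                    if e0 >= 0 and e1 >= 0 and e2 >= 0:
                        # barycentrics an SPI-like unit would forward per fragment
                        frags.append((p, (e1 / area, e2 / area, e0 / area)))
            if frags:
                cycles.append(frags)   # one cycle = one triangle's fragments
    return cycles

# A roughly one-pixel triangle still occupies a whole 16-wide slot:
tiny = ((3.0, 1.0), (4.2, 1.0), (3.0, 2.2))
for i, frags in enumerate(rasterise(tiny)):
    print(f"cycle {i}: {len(frags)}/16 lanes carry fragments")

Two such triangles back to back would take two cycles; packing them into one slot is exactly the quad-level granularity the hardware's heritage doesn't suggest.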

Jawed
 
The point is that these architectures have gone from near 500 GFLOPS (RV670) to near 2.7 TFLOPS, and neither the performance increase nor the graphics-quality jump is anywhere near that ratio. Something is not advancing at the same pace as FLOPS.
 
The point is that these architectures have gone from near 500 GFLOPS (RV670) to near 2.7 TFLOPS, and neither the performance increase nor the graphics-quality jump is anywhere near that ratio. Something is not advancing at the same pace as FLOPS.
Game engine?
 
Game engine?

Indeed. Although I'm sure there are some devs who would like a "snap-of-the-fingers auto-optimize" button in their compiler, so they don't have to spend years on coding but can just install the new card, push the button, and have the latest and greatest tech.
 
This is IMHO the right way to go. It bothers me when apps refuse to run because of a stupid check like this. With your workaround I was able to run the demo.

Indeed, I couldn't install Froblins because I didn't have the "correct" OS... obviously Win7 is capable, and a compatibility-mode switch worked fine. But a simple DX and GPU check would suffice, no?
 
Speaking of DirectX, does anyone know why my 5870 is classed as a DDI 10.1 card in Windows 7? It doesn't seem to affect being able to run DirectX 11 stuff, though.

[Attached screenshot: ddi.png]
 
The point is that these architectures have gone from near 500 GFLOPS (RV670) to near 2.7 TFLOPS, and neither the performance increase nor the graphics-quality jump is anywhere near that ratio. Something is not advancing at the same pace as FLOPS.

3870:

496 GFLOPS
12.4 GTexels/s
12.4 GPixels/s
72.0 GB/s

5870:

2720 GFLOPS
68.0 GTexels/s
27.2 GPixels/s
153.6 GB/s

----------------------------------------

GFLOPS = 5.48x
GTexels/s = 5.48x
GPixels/s = 2.19x
GB/s = 2.13x
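The ratios fall straight out of the paper specs; a trivial check, with the numbers exactly as listed above:

Code:
# Paper specs quoted above; each ratio is simply HD 5870 / HD 3870.
hd3870 = {"GFLOPS": 496.0, "GTexels/s": 12.4, "GPixels/s": 12.4, "GB/s": 72.0}
hd5870 = {"GFLOPS": 2720.0, "GTexels/s": 68.0, "GPixels/s": 27.2, "GB/s": 153.6}
for key in hd3870:
    print(f"{key:10} = {hd5870[key] / hd3870[key]:.2f}x")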

These are just raw paper-spec numbers, and the architectural improvements from one solution to the other render any quick conclusions drawn from these figures sterile.

One thing you can safely say is that FLOPS by themselves have never defined, and never will define, 3D performance on any GPU. The next thing to recognize is that there's no such thing as enough fillrate and/or enough bandwidth. And no, I'm not implying that in Cypress's case a gazillion more bandwidth would save the day.

IHVs try to build the most balanced solution under all given constraints for each generation.

As for applications not following any supposed FLOP-increase trend: by all means, let's have games from now on that live up to the theoretical arithmetic increases. Who needs stuff like texturing, filtering or even antialiasing these days anyway?
 
Indeed. Although I'm sure there are some devs who would like a "snap-of-the-fingers auto-optimize" button in their compiler, so they don't have to spend years on coding but can just install the new card, push the button, and have the latest and greatest tech.

But that's hardly something new. When the Pentium came out, a lot of developers stopped optimizing because that CPU was blazing fast compared to the previous ones.

The really good developers crafted their inner loops carefully to avoid AGIs and stalls, but a lot of people relied on Watcom doing its job properly.
 
But that's hardly something new. When the Pentium came out, a lot of developers stopped optimizing because that CPU was blazing fast compared to the previous ones.

I don't think so. As soon as we learned about the "amazing computational power" of Pentiums, we started moving all computations from integers to floats, for instance (retarded, yes, but hey, I was 16). If you're a developer, your heart starts beating faster when you hear something like that; you don't become instantly lazy.
 
Something is not advancing at the same pace as FLOPS.
At least some of that is being spent going from a static environment to a more dynamic one.

Also, going from shortcuts/hacks that look pretty good to more physically accurate algorithms can eat a bunch of cycles for a difference most people won't easily distinguish.
 
Hmm...
Hemlock =
"EX" HD 5870 X2 = HD 5970 =
2 x Cypress XT (HD 5870) @ Cypress PRO (HD 5850) clocks (725/1000)...

No cuts in ROPs or SPs, just slower clocks...
...

TDP < 300 W?

bye
 
Of course it is; it wouldn't get validated by the PCI-SIG if it were over 300 W.
Frankly, if real power consumption is anything like the HD 4870 X2's, it would be a lot more honest to just say the TDP is 375 W and use two 8-pin plugs, official configuration or not. Sure, the 4870 X2 needed only one 8-pin + one 6-pin since its TDP was only 300 W - on paper...
Though if it has lower clocks, I guess there's a good chance it really does only draw around 300 W. What is the HD 5950 then? Just fewer shader arrays?
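Back-of-the-envelope, clock scaling alone puts a dual down-clocked Cypress XT board in the right ballpark (a rough sketch: the 188 W figure is the published HD 5870 TDP, the ~5% voltage drop is purely my assumption):

Code:
# Rough CMOS rule of thumb: dynamic power ~ frequency * voltage^2.
single_tdp = 188.0            # W, published HD 5870 board power
clock_ratio = 725.0 / 850.0   # HD 5850-style engine clock vs HD 5870 clock
volt_ratio = 0.95             # ASSUMED ~5% voltage reduction at the lower clock
dual_estimate = 2 * single_tdp * clock_ratio * volt_ratio ** 2
print(f"~{dual_estimate:.0f} W for two down-clocked Cypress XT chips")

That lands around 290 W, which would just fit the 75 W slot + 75 W 6-pin + 150 W 8-pin envelope - on paper, at least.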
 