AMD: R8xx Speculation

Jawed · Oct 25, 2009

nAo said:
Well, if they don't do this tessellation + 64 pixel batches (ATI) perf is going to suck.

Hmm, well, it'd be nice to find out for sure about all this stuff.

Jawed

nAo · Oct 25, 2009

Jawed said:
Hmm, well, it'd be nice to find out for sure about all this stuff.

Sure, but it's pretty much a given they pack work for more than a single primitive per pixel batch.

Jawed · Oct 25, 2009

When triangles were men, sprites were men and pixel shaders had only just grown out of wearing register-combining shorts and were still learning to tie a tie in less than 64 ALU and 32 TEX, did anyone care about small triangles and getting more than one of them into a thread?

Since those days, has the basic architecture in ATI changed?

It seems to me the rasteriser is a one-triangle at a time deal - it produces 16 fragments per cycle as either 8x2 or 2x8. What chance that it can do that for 2 triangles simultaneously, in order to work at the proposed quad-level granularity?

The interpolators (R300->RV740) produce per-fragment vec4 attributes at some multiple of rasterisation rate (1x or 2x). What chance that the SPI unit can handle 2 triangles' barycentrics per clock?

It's logically possible, and even desirable. It just seems really unlikely to me, because the hardware's heritage isn't small triangles. And the really costly bits of pixel shading, the texturing and the back end, are both killed at the quad level anyway.

R700 ISA - 3.6.1 Valid and Active Masks said:
The valid mask is set for any pixel that is covered by the original primitive and has not been killed by an ALU KILL operation.

I don't know if it's reasonable to read that singular, "original primitive", as definitive, but it seems like a fair hint...

Jawed

Love_In_Rio · Oct 26, 2009

The subject is that these architectures have gone from near 400 gflops -RV670- to near 2,7 gflops and the performance increase or the graphic quality jump is nowhere near that ratio. Something is not advancing at the same pace as flops.

liolio · Oct 26, 2009

Love_In_Rio said:
The subject is that these architectures have gone from near 400 gflops -RV670- to near 2,7 gflops and the performance increase or the graphic quality jump is nowhere near that ratio. Something is not advancing at the same pace as flops.

Game engine?

------------------------------>[]

neliz · Oct 26, 2009

liolio said:
Game engine?

Indeed. Although I'm sure there are some devs that would like to have a "snap of the finger auto-optimize" button in their compiler so they don't have to spend years on coding but just install the new card, push-button and have latest and greatest tech.

Malo · Oct 26, 2009

Humus said:
This is IMHO the right way to go. It bothers me when apps refuse to run because of a stupid check like this. I was able to run the demo now with your workaround.

Indeed, I couldn't install Froblins because I didn't have the correct OS... obviously Win7 is capable and a compatibility-mode switch worked fine. But a simple DX and GPU check would suffice no?

Broken Hope · Oct 26, 2009

Speaking of Direct X, does anyone know why my 5870 is classed as a DDI 10.1 card in Windows 7? It doesn't seem to affect being able to run Direct X 11 stuff though.

Ailuros · Oct 27, 2009

Love_In_Rio said:
The subject is that these architectures have gone from near 400 gflops -RV670- to near 2,7 gflops and the performance increase or the graphic quality jump is nowhere near that ratio. Something is not advancing at the same pace as flops.

3870:

496 GFLOPs/s
12.4 GTexels/s
12.4 GPixels/s
72.0 GB/s

5870:

2720 GFLOPs/s
68.0 GTexels/s
27.2 GPixels/s
153.6 GB/s

----------------------------------------

GFLOPs = 5.48x
GTexels/s = 5.48x
GPixels/s = 2.19x
GB/s = 2.13x

These are just raw paperspec numbers and considering the architectural improvements from one to the other solution render any quick conclusions drawn from these figures quickly into the sterile realm.

One thing that you can safely say is that FLOPs by themselves have never and will not define 3D performance by themselves on any GPU. The next best thing to recognize is that there's no such thing as there's enough fillrate and/or enough bandwidth. And no I'm not implying that in the case of Cypress a gazillion more bandwidth would save the day.

IHVs try to build the most balanced solution under all given constraints for each generation.

As for the applications not following any supposed FLOP increase trends, well let's have from now on games that live up to the theoretical arithmetic increases by all means. Who needs stuff like texturing, filtering or even antialiasing these days anyway?

flynn · Oct 27, 2009

neliz said:
Indeed. Although I'm sure there are some devs that would like to have a "snap of the finger auto-optimize" button in their compiler so they don't have to spend years on coding but just install the new card, push-button and have latest and greatest tech.

But that's hardly something new. When the Pentium came out a lot of developers stopped optimizing because that CPU was blazing fast compared to the previous ones.

The really good developers crafted their inner loops carefully to avoid AGIs and stalls, but a lot of people relied on Watcom doing its job properly.

neliz · Oct 27, 2009

mmendez said:
But that's hardly something new. When the Pentium came out a lot of developers stopped optimizing because that CPU was blazing fast compared to the previous ones.

I don't think so, as soon as we learned about the "amazing computational power" of Pentium's we started moving all computations from integers to float for instance (retarded, yes, bu hey, I was 16.) If you're a developer, your heart stars beating faster when you hear something like that, you don't become instant-lazy.

hoom · Oct 27, 2009

Something is not advancing at the same pace as flops.

At least some of that is being spent going from a static environment to a more dynamic one.

Also going from short-cut/hacks that look pretty good to more physically accurate algorithms can eat a bunch of cycles that won't be easily distinguished by most people.

Broken Hope · Oct 27, 2009

Found this on another forum.

FrameBuffer · Oct 27, 2009

Broken Hope said:
Found this on another forum.

damn thats gotta be pushing 14"+, thank god they moved the power plugs or it could be nearing 1.5 feet.

Kaotik · Oct 27, 2009

FrameBuffer said:
damn thats gotta be pushing 14"+, thank god they moved the power plugs or it could be nearing 1.5 feet.

The whole card is IIRC 1 feet, it fits ATX specs

neliz · Oct 27, 2009

FrameBuffer said:
damn thats gotta be pushing 14"+, thank god they moved the power plugs or it could be nearing 1.5 feet.

The doofus that cut off the name also cut off the measurement and it shows a nice <13"

I hate it when people do that

Topman · Oct 27, 2009

Hmm...
Hemlock =
"EX" HD5870 X2 = HD 5970 =
2 X Cypress XT(HD5870) @ Cypress PRO(HD5850) clocks(725/1000)...

no cuts in ROPs, SPs, but slower clocks...
...

TDP < 300 W?

bye

Kaotik · Oct 28, 2009

Topman said:
TDP < 300 W?

Of course it is, it wouldn't get validated with PCI SIG if it was over 300W

mczak · Oct 28, 2009

Kaotik said:
Of course it is, it wouldn't get validated with PCI SIG if it was over 300W

Frankly, if real power consumption is anything like HD 4870X2 it would be a lot more honest to just say TDP is 375W and use 2 8 pin plugs, official configuration or not. Sure the 4870X2 needed only 1 8 pin + 6 pin since its TDP was only 300W - on paper...
Though if it has lower clocks I guess there's a good chance it actually really only draws around 300W. What is HD5950 then? Only less shader arrays?

Silent_Buddha · Oct 28, 2009

And you saw greater than 300 watt power draw from 4870x2 in what real world application? None?

Regards,
SB

AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to upcoming ATI-RV870 lineup GPUs

Within 1 or 2 weeks

Within a month

Within couple months

Very late this year

Not until next year

Jawed

nAo

Nutella Nutellae

Jawed

Love_In_Rio

liolio

Aquoiboniste

neliz

GIGABYTE Man

Malo

Yak Mechanicum

Broken Hope

Ailuros

Epsilon plus three

flynn

neliz

GIGABYTE Man

hoom

Broken Hope

FrameBuffer

Kaotik

Drunk Member

neliz

GIGABYTE Man

Topman

Kaotik

Drunk Member

mczak

Silent_Buddha

Similar threads