AMD: R9xx Speculation

Mintmaster · Nov 23, 2010

UniversalTruth said:
That's a very interesting claim, 'cos people like me don't upgrade ev'ry generation. So, this "power" will be helpful for the upcoming game releases.

We went ~8 years with one tri per clock (I think R300 was the first with that), so I don't think 2 tris per clock is going to be a problem anytime soon. Considering setup scaling issues and lower core clock speed of GF100/110, Cayman's real world disadvantage should be minimal.

EduardoS · Nov 23, 2010

hoom said:
Perhaps for a few but when you have 1920 of them?

Still cheap, maybe 2 million transistors or so; compare that to the 2 billion transistors those chips have.

mczak · Nov 23, 2010

RecessionCone said:
Actually, Fermi has full-rate 32-bit int add operations. I just wrote a CUDA kernel to test it out on my GTX 480, and got 644 Giga integer adds/second. The full-rate peak would be 1.4 GHz * 480 CUDA cores = 672 Giga integer adds/second.

Trying the same kernel out with 32-bit int mul operations gave 331 Giga integer muls/second, which does appear to be half rate.

That's pretty interesting (though not really thread-relevant I admit), contradicting these results http://www.beyond3d.com/content/reviews/55/11 - would also mean the conclusion how int math is implemented would be wrong.
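The arithmetic behind the quoted measurement can be checked quickly. The sketch below only reuses the figures from the quoted post (1.4 GHz, 480 lanes, 644 and 331 Giga ops/second); the function and variable names are made up for illustration, and it does not reproduce the CUDA kernel itself.

```python
# Figures are taken from the quoted post; names are illustrative only.
SHADER_CLOCK_GHZ = 1.4   # GTX 480 shader clock as quoted
LANES = 480              # ALU lanes on a GTX 480 as quoted


def peak_gops(clock_ghz, lanes, ops_per_lane_per_clock=1.0):
    """Theoretical peak in giga-operations per second."""
    return clock_ghz * lanes * ops_per_lane_per_clock


full_rate = peak_gops(SHADER_CLOCK_GHZ, LANES)

measured_add = 644.0     # Giga integer adds/second, as reported
measured_mul = 331.0     # Giga integer muls/second, as reported

add_fraction = measured_add / full_rate
mul_fraction = measured_mul / full_rate

print(f"full-rate peak: {full_rate:.0f} Gops/s")
print(f"int add reaches {add_fraction:.1%} of full rate")
print(f"int mul reaches {mul_fraction:.1%} of full rate")
```

On these numbers the add result sits within a few percent of the full-rate peak and the mul result sits just under half of it, which is what the quoted post concludes.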

hoom · Nov 23, 2010

2 million * 1920 = 3.84 billion

EduardoS · Nov 23, 2010

2 million for the 1920 adders, not for one.

hoom · Nov 23, 2010

1000 transistors each?

I don't think so.
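For reference, the two readings of the "2 million" figure work out as follows. This is only the arithmetic on the numbers posted above (2 million, 1920, 2 billion); the variable names are illustrative.

```python
# Numbers as posted in the thread; names are illustrative.
TOTAL_EXTRA_TRANSISTORS = 2_000_000    # rough estimate for all the adders together
ADDER_COUNT = 1920                     # number of ALU lanes being discussed
CHIP_TRANSISTORS = 2_000_000_000       # "2 billion", as quoted

# Reading 1: 2 million is the total, so each adder gets a share of it.
per_adder = TOTAL_EXTRA_TRANSISTORS / ADDER_COUNT

# Reading 2: 2 million per adder, multiplied across all lanes.
misread_total = TOTAL_EXTRA_TRANSISTORS * ADDER_COUNT

# Share of the whole chip under reading 1.
share_of_chip = TOTAL_EXTRA_TRANSISTORS / CHIP_TRANSISTORS

print(f"per adder: about {per_adder:.0f} transistors")
print(f"misread total: {misread_total:,} transistors")
print(f"share of chip: {share_of_chip:.2%}")
```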

neliz · Nov 23, 2010

caveman-jim said:
Are those different?

Yes, very.

Kaotik · Nov 23, 2010

nevermind

Alexko · Nov 23, 2010

caveman-jim said:
Are those different?

I don't know that for a fact, but since the presentation doesn't give detailed specs (no exact shader count, BW, clocks, power) I'm thinking there might be two distinct dates, one for an architecture reveal and one for benchmarks and precise specs.

It could be just AMD trying to avoid leaks, though.

Edit: oh, just saw Neliz's post.

ZerazaX · Nov 23, 2010

It appears pretty simple actually:

the leaked presentation is dated October 2010 at the bottom, with NDA date of 11/22

We know that things were delayed/pushed back, and the NDA was apparently pushed as well. If we look at the leaked 6990 slide, if true, they held another presentation November 18, 2010... where the specs were probably revealed.

So the TBD specs were likely meant to be disclosed far later (11/18) near launch, but since the cards were pushed back a couple of weeks, the official specs and changes remain under NDA

Gipsel · Nov 23, 2010

hoom said:
How and why would they make such a mistake?

Because the slides were put together in a hurry?
Why do they contain the same error (actually it is open to interpretation, it's just not precise) as the Cypress launch presentation slides (implying it can do 2 DP muls, while it can really do 2 adds or 1 mul in DP)? That was only corrected in later slides, but obviously they used some cut'n'paste from older ones.

hoom said:
If they use only the mantissa part of the FP unit, they can only do 24-bit INT, unless they have 48-bit FP capability.

FMA?

Jawed · Nov 23, 2010

Remember that single precision FMA only exists where double precision exists (which always has FMA).

Separately, my theory is that 32-bit int ADD per lane is possible because exponent processing requires addition, so the combination of mantissa and exponent processing delivers the requisite 32-bit capability for integer ADD (with a bit carried from mantissa into exponent, i.e. the exponent handles the upper 8 bits).
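A minimal sketch of that theory, modelling only the arithmetic and not any real hardware: the low 24 bits go through a "mantissa" adder, the carry out is fed into an 8-bit "exponent" adder that handles the upper bits. The function name and constants are hypothetical.

```python
MANTISSA_BITS = 24   # width of the FP32 mantissa datapath (including hidden bit)
EXPONENT_BITS = 8    # width of the FP32 exponent datapath


def split_add32(a, b):
    """32-bit wrapping add built from a 24-bit adder and an 8-bit adder."""
    mant_mask = (1 << MANTISSA_BITS) - 1
    exp_mask = (1 << EXPONENT_BITS) - 1

    # Low 24 bits: the "mantissa" adder.
    low = (a & mant_mask) + (b & mant_mask)
    carry = low >> MANTISSA_BITS          # single bit carried out of bit 23

    # High 8 bits: the "exponent" adder, taking the carry as its carry-in.
    high = ((a >> MANTISSA_BITS) & exp_mask) \
        + ((b >> MANTISSA_BITS) & exp_mask) + carry

    # Recombine; the carry out of bit 31 is dropped, as in a wrapping add.
    return ((high & exp_mask) << MANTISSA_BITS) | (low & mant_mask)


# The result matches an ordinary 32-bit wrapping add.
for a, b in [(0x00FFFFFF, 1), (0xFFFFFFFF, 1), (0x12345678, 0x0FEDCBA9)]:
    assert split_add32(a, b) == (a + b) & 0xFFFFFFFF
```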

Mintmaster · Nov 23, 2010

hoom said:
1000 transistors each? I don't think so.

When all the big space consumers (data routing, flow control, registers, pipelining, etc.) are already in place, and you have 24-bit adders, yeah, marginal cost should be less than that. 8 more full adders will need under 150 transistors, and since it's got as much time to finish as an FP32 MAD you don't need any carry lookahead.

Raw math is a lot cheaper than you think.
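To make the "8 more full adders, no carry lookahead" point concrete, here is a bit-level sketch of a ripple-carry chain plus a rough transistor tally. The per-full-adder transistor count is an assumption, not a figure from the thread: 18 is a compact design chosen so that 8 of them land under the 150 mentioned above, while a conventional static CMOS full adder is usually cited at 28. All names are illustrative.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(a, b, width=8, carry_in=0):
    """Chain `width` full adders; the carry ripples bit by bit (no lookahead)."""
    result = 0
    carry = carry_in
    for i in range(width):
        bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= bit << i
    return result, carry


def extra_transistors(extra_bits=8, transistors_per_full_adder=18):
    """Rough marginal cost; the per-adder count is an assumed parameter."""
    return extra_bits * transistors_per_full_adder


print(ripple_add(0xFF, 0x01))                          # upper-byte add with wraparound
print(extra_transistors())                             # compact full adders (assumed 18 each)
print(extra_transistors(transistors_per_full_adder=28))  # conventional static CMOS
```

Either assumption stays well below the roughly 1000 transistors per lane that the 2 million estimate allows.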

Megadrive1988 · Nov 23, 2010

Based on what's known about Cayman now, is it disappointing, good, or better than expected?

fellix · Nov 23, 2010

Well, the doubled triangle setup rate was more or less expected -- probably nothing less than that, in the face of Fermi's geometry showcase. But I guess the 32nm cancellation broke a lot of the more optimistic expectations across the board.

Pressure · Nov 23, 2010

Megadrive1988 said:
Based on what's known about Cayman now, is it disappointing, good, or better than expected?

I'd say 1920 shaders (4D arch) is exactly what people expected, although the memory bandwidth seems a bit low if the computational power went up by the amounts we hope.

UniversalTruth · Nov 23, 2010

I don't think AMD and nVidia can rely forever on pouring in more and more raw power and on new process technology. So they have to improve their architectures in order to offer more performance. In my opinion Cayman is what was expected paper-specification-wise, but I honestly hope that it will not show disappointing* real world performance...

*something similar to the one-year-old Hemlock.

no-X · Nov 23, 2010

Pressure: considering the 6990 slide is a fake, it isn't publicly known how fast the GDDR5 modules used will be...

Megadrive1988: Improved ROPs, CSAA and power management at this level are all unexpected things...
"Not-decoupled" TMUs means that the GPU doesn't have 5:1 ALU:TEX, so it will have more texturing power than expected. I'd say it's quite positive - not breathtaking, but slightly more positive than expected.

UniversalTruth · Nov 23, 2010

Slightly off-topic, but do we expect Cayman to be able to run 3DMark 11 smoothly? And one more thing: WTH is that stupid physics test doing in it? Is it strictly nVidia-oriented, or is this physics something different?

tannat · Nov 23, 2010

UniversalTruth said:
Slightly off-topic, but do we expect Cayman to be able to run 3DMark 11 smoothly? And one more thing: WTH is that stupid physics test doing in it? Is it strictly nVidia-oriented, or is this physics something different?

The physics test will be Bullet-based, I think.
