How did you arrive at that conclusion?
So if the RV770XT won't have a separate shader clock, or a core clock at 1050, it will have much less than 1 TFLOP?
750 x 800 x 2 = 1.2 TFlop
edit - You might have done 750 x 480 x 2 = 720 GFlops
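For anyone who wants to check the two figures above, here is a minimal sketch of the arithmetic. It assumes the usual peak formula (core clock x ALU count x FLOPs per ALU per clock, with a MAD counted as 2 FLOPs); the function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(core_mhz, alus, flops_per_alu_per_clock=2):
    """Theoretical peak in GFLOPS: clock (MHz) x ALUs x FLOPs per ALU per clock."""
    # MHz x FLOPs gives MFLOPS; divide by 1000 to get GFLOPS.
    return core_mhz * alus * flops_per_alu_per_clock / 1000.0


print(peak_gflops(750, 800))  # 800 ALUs at 750 MHz -> 1200.0 GFLOPS (1.2 TFLOP)
print(peak_gflops(750, 480))  # 480 ALUs at 750 MHz -> 720.0 GFLOPS
```

This is a theoretical peak only; it says nothing about how often all ALUs can actually be kept busy.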
For GPGPU: ~1TFLOP in ATI form costs $200 and in NVidia form costs $600.
Comparing double-precision: 300+ versus ~125 GFLOPs.
For $600 you can have 1 TFLOP of ATI's double-precision or 125 GFLOPs of NVidia's.
Hmm.
Jawed
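The price comparison above can be sketched roughly as follows. The prices and GFLOPS figures are the ones quoted in the post ("300+" is taken as 300, so the ATI result is a lower bound); the dictionary layout and function name are hypothetical, and it assumes throughput adds up perfectly across several cards, which real GPGPU workloads may not achieve.

```python
# Figures as quoted in the post; "300+" double-precision is rounded down to 300.
cards = {
    "ATI": {"price_usd": 200, "sp_gflops": 1000, "dp_gflops": 300},
    "NVidia": {"price_usd": 600, "sp_gflops": 1000, "dp_gflops": 125},
}


def dp_gflops_for_budget(card, budget_usd):
    """Double-precision GFLOPS bought with a budget, assuming ideal multi-card scaling."""
    n_cards = budget_usd // card["price_usd"]  # whole cards only
    return n_cards * card["dp_gflops"]


for name, card in cards.items():
    print(name, dp_gflops_for_budget(card, 600))
```

With a $600 budget this gives three ATI cards at 900 GFLOPS or more (the "1 TFLOP" in the post is a round-up) against a single NVidia card at about 125 GFLOPS.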
Which makes it a bit of a shame that ATI didn't also release a high-level programming language for GPGPU à la CUDA, and instead focused on a lower-level approach to programmability.
Are you talking about the legendary ATI chip, which will be revealed in approximately 6 months?
From R3D.w0mbat said: Why would you introduce a "terascale-engine" to a new GPU series? This only makes sense if both RV770 GPUs can reach at least 1 TFLOPs of processing power.
So it is pretty much confirmed that it will have 800 ALUs then? Sounds very nice! 1.2 TFLOPs is incredible, especially when compared to the 933 GFLOPs of the G200 high end.
4 or 5 seems common, so let's go out of the box and say 6.
Well, you guys can always start from scratch and try to guess how many clusters RV770 has, which would be a good starting point.
5 FLOPs/ALU in both cases. How about a bit more creative math?
Excuse the typo; it should have read 10 FLOPs/ALU for both hypothetical cases (96 for the first and 160 for the second). Both 480 and 800 SPs could theoretically be arranged in 5 clusters.
Besides the point that the most important thing about it is that RV700 truly should yield a theoretical peak of ~1 TFLOP/s, that scenario above sounds a bit complicated to my layman's eyes. Assume you'd arrange those 480 SPs in 5 clusters; you'd end up with 15 FLOPs per ALU. And no, it doesn't have to be 4 or 5 clusters at any price, but the whole number-crunching exercise doesn't lead anywhere either.
By the way, ATI never used to be that "fond" of MUL calls, from what I recall from the past. If they'd add any single FLOP anywhere, ADD would be the most likely candidate.
I knew at some point that the whole processor thing would backslap eventually LOL. If I'd start as a layman I'd say that R6x0/RV6x0 has 4 very "phat" clusters and G8x/9x 8 quite "thin" clusters. So far we've somewhat verified that GT200 contains 10 clusters.
19 days to go... and we are still in the dark
That would work, other than your math of doing 2 FLOPs per shader, but putting 3 FLOPs in definitely makes it better.
I'm going to play devil's advocate here and go with the simple, boring and realistic:
750 MHz core
96 5D processors
720 Gflops
32 TMUs
Should be more than enough for +50% or more on RV670.
Speculating that we will get a 150% increase in shaders and 100% increase in TMUs for a 25% increase in transistors is too far beyond the realm of common sense for me. That would mean RV670 was made mostly of vanilla pudding and they decided to swap it for transistors this time around.
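As a rough check on the percentages in the post above, here is a small sketch. The RV670 baseline of 320 shaders and 16 TMUs is not stated in this thread and is assumed here; the RV770 candidates (800 or 480 shaders, 32 TMUs) are the ones being speculated about. The helper name `pct_increase` is made up.

```python
def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


# Assumed RV670 baseline (not from this thread): 320 shaders, 16 TMUs.
print(pct_increase(320, 800))  # rumoured 800 shaders -> 150.0
print(pct_increase(16, 32))    # 32 TMUs -> 100.0
print(pct_increase(320, 480))  # the "boring" 96 x 5D case -> 50.0
```

So the 150% and 100% figures match the 800-shader, 32-TMU rumour, while the 96 x 5D option would be a 50% increase in shader count.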
But that goes against the TFlop target.
I'm going to play devil's advocate here and go with the simple, boring and realistic:
750 MHz core
96 5D processors
720 Gflops
32 TMUs.
Should be more than enough for +50% or more on RV670.
Maybe they got rid of something?
Speculating that we will get a 150% increase in shaders and 100% increase in TMUs for a 25% increase in transistors is too far beyond the realm of common sense for me. That would mean RV670 was made mostly of vanilla pudding and they decided to swap it for transistors this time around.
I think the key is to work with the 4850's clock (625 MHz) to reach the TFlop mark.
That would work, other than your math of doing 2 FLOPs per shader, but putting 3 FLOPs in definitely makes it better.
96*5*3*750=1.08TFlop
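To compare the two routes to 1 TFLOP discussed here (more FLOPs per ALU versus more ALUs at the 4850's clock), here is a minimal sketch. It assumes the usual peak formula of clock x ALUs x FLOPs per ALU per clock; both function names are made up for illustration.

```python
def peak_gflops(core_mhz, alus, flops_per_alu_per_clock=2):
    """Theoretical peak in GFLOPS: clock (MHz) x ALUs x FLOPs per ALU per clock."""
    return core_mhz * alus * flops_per_alu_per_clock / 1000.0


def alus_needed(target_gflops, core_mhz, flops_per_alu_per_clock=2):
    """ALU count needed to hit a target peak at a given clock."""
    return target_gflops * 1000.0 / (core_mhz * flops_per_alu_per_clock)


print(peak_gflops(750, 96 * 5, 3))  # 96 5D units, 3 FLOPs each -> 1080.0
print(alus_needed(1000, 625))       # 1 TFLOP at 625 MHz, 2 FLOPs each -> 800.0
```

At 2 FLOPs per ALU, a 625 MHz part needs exactly 800 ALUs to reach 1 TFLOP, which is why the 4850's clock points towards the 800-ALU rumour rather than 480.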