420XT at the inq

The Baron said:
There's still something to be said for brute force, and ATI seems unwilling to use aggressive clock speeds even though its architectures are very efficient to begin with. Considering how competitive a 9800XT would be if it had 450MHz RAM (an 85MHz bump, and competitive in the sense that it would win just about everything), I'm not a big fan of ATI's conservative nature in that department. Sure, a lot of it is to appease OEMs, but the idea of an R420 clocked at 550/1200 or something crazy like that makes my mouth water, as I'm sure it does for a lot of other people in the enthusiast crowd.
Micron's GDDR3 memory chip is rated CL5 @ 450MHz or CL8 @ 600MHz, so some of the advantage gained by increasing the clock rate is spent on increased latencies. Note the 256MB 9800 with DDR2, or the fact that 9700s could be clocked higher with some BIOSes while the actual performance went down (compared with other BIOSes that used lower latencies and therefore did not overclock as high).
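
To put rough numbers on that trade-off, here is a minimal sketch that converts the two quoted ratings into nanoseconds. It assumes the CL figure counts cycles of the quoted clock (450MHz and 600MHz), and the function name is just for illustration.

```python
def cas_latency_ns(cl_cycles: int, clock_mhz: float) -> float:
    """Convert a CAS latency given in clock cycles to nanoseconds."""
    # One cycle lasts 1000 / clock_mhz nanoseconds.
    return cl_cycles * 1000.0 / clock_mhz

# Assumption: CL is counted in cycles of the quoted clock.
latency_450 = cas_latency_ns(5, 450)   # CL5 @ 450MHz
latency_600 = cas_latency_ns(8, 600)   # CL8 @ 600MHz

print(f"CL5 @ 450MHz: {latency_450:.2f} ns")
print(f"CL8 @ 600MHz: {latency_600:.2f} ns")
print(f"clock gain: {600 / 450:.2f}x, latency growth: {latency_600 / latency_450:.2f}x")
```

Under that assumption the absolute CAS latency goes from about 11.1 ns to about 13.3 ns, so the faster rating is slower in wall-clock terms for a single isolated access, even though the clock is a third higher.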
 
Aivansama said:
Micron's GDDR3 memory chip is rated CL5 @ 450MHz or CL8 @ 600MHz, so some of the advantage gained by increasing the clock rate is spent on increased latencies. Note the 256MB 9800 with DDR2, or the fact that 9700s could be clocked higher with some BIOSes while the actual performance went down (compared with other BIOSes that used lower latencies and therefore did not overclock as high).
AFAIK, GDDR3 chips are highly pipelined, so even if the latency is high, the GPU can just pipeline one memory request after another while waiting for the first request to complete (very much unlike a CPU, which on a cache miss/DRAM request just stalls and looks stupid), and that way get arbitrarily close to 100% performance scaling with increasing clock speeds even at such high latencies.
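
A back-of-the-envelope way to see what that pipelining costs: count how many reads must be outstanding to keep the data bus busy. This is only a sketch. It assumes a burst length of 4 on a double-data-rate bus (so one burst occupies 2 clock cycles) and ignores everything except CAS latency; the helper name is made up for illustration.

```python
import math

def requests_in_flight(cas_latency_cycles: int, burst_cycles: int = 2) -> int:
    """Reads that must be outstanding to keep the data bus busy.

    A new burst has to start every `burst_cycles` clocks, while each read
    only returns its first data `cas_latency_cycles` after it was issued.
    Assumption: burst_cycles=2 models a burst of 4 on a DDR bus.
    """
    return math.ceil(cas_latency_cycles / burst_cycles)

at_cl5 = requests_in_flight(5)   # the 450MHz rating
at_cl8 = requests_in_flight(8)   # the 600MHz rating

print(f"CL5: {at_cl5} reads in flight, CL8: {at_cl8} reads in flight")
```

With these assumptions the higher latency only means queuing one more read (4 instead of 3), which is cheap for a GPU that already has many independent requests pending.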
 
arjan:
One thing I've not seen stated explicitly enough is whether you can issue an ACTIVE command to one bank while reading or writing to another on the same chip. There's enough empty time in the command interface to do it. But I wonder if you've got to "keep quiet" on the other banks for some part of the activation time, just like with tRR.

The reason why it's relevant here? tRCD is constant (in ns) when increasing the clock. And if you have to keep quiet during that time, you can't make up for that bandwidth loss by pipelining. Thus you won't get perfect performance scaling with clock.
(You could reduce the effect of it by making sure that you use each page a lot before swapping, but that's a different thing.)
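
To make the scaling argument concrete, here is a toy model of the worst case described above, where nothing else can happen on the chip during tRCD. The 15 ns tRCD, the 2-cycle burst and the function names are all hypothetical values chosen for illustration, not figures from a datasheet.

```python
def page_visit_ns(clock_mhz: float, bursts_per_page: int,
                  trcd_ns: float = 15.0, burst_cycles: int = 2) -> float:
    """Time to open a page and stream `bursts_per_page` bursts from it.

    Assumption: the tRCD stall is fixed in nanoseconds and cannot be
    overlapped with transfers on other banks (the pessimistic case).
    """
    cycle_ns = 1000.0 / clock_mhz
    return trcd_ns + bursts_per_page * burst_cycles * cycle_ns

def speedup(bursts_per_page: int, slow_mhz: float = 450.0,
            fast_mhz: float = 600.0) -> float:
    """Bandwidth gain from raising the clock, for a given page reuse."""
    return (page_visit_ns(slow_mhz, bursts_per_page)
            / page_visit_ns(fast_mhz, bursts_per_page))

ideal = 600.0 / 450.0   # perfect scaling with clock
few = speedup(8)        # page swapped after 8 bursts
many = speedup(64)      # page used heavily before swapping

print(f"ideal {ideal:.2f}x, 8 bursts/page {few:.2f}x, 64 bursts/page {many:.2f}x")
```

With these made-up numbers the 33% clock increase yields roughly 1.21x when a page is swapped after 8 bursts and roughly 1.31x after 64 bursts, which matches the point that heavier page reuse shrinks the loss but never removes it.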
 