Core clock speeds. Please enlighten me?

I'm guessing that GPUs aren't 100% standard-cell designs. GPUs are datapath-intensive, with many identical functional units (adders, subtractors, multipliers, etc.). These circuit blocks are 'relatively standard' in that maybe a small # of unique components spans >70-80% of the used instances in the GPU's datapath. Even just a 'semi-custom' relayout of those highly repetitive blocks could result in huge area/power savings (which conversely could be put toward upping the clock frequency.)
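Rough numbers to show why that pays off (totally made-up figures, just to illustrate the "small # of cells covers most instances" point): if a handful of hand-tuned cells cover most of the placed instances, their per-cell savings apply to most of the datapath.

/* Back-of-the-envelope sketch, illustrative numbers only: if ~75% of the
 * placed instances come from a few repeated datapath cells, shrinking just
 * those cells by 30% in a semi-custom relayout shrinks the whole datapath
 * by roughly 0.75 * 0.30 = 22.5%. */
#include <stdio.h>

int main(void)
{
    double coverage     = 0.75;  /* fraction of instances that are repeated cells (assumed) */
    double cell_savings = 0.30;  /* area/power saved per hand-tuned cell (assumed)          */

    double total_savings = coverage * cell_savings;
    printf("overall datapath savings: %.1f%%\n", total_savings * 100.0);
    return 0;
}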

Not saying this is what they do (because I don't know any engineers who work there), but I wouldn't be surprised.

I didn't mean that they don't customize some of the chip, lol. I guess that's what I made it sound like, sorry.
 
Fuz said:
Thanks ppl.

Pascal,
you say that GPUs' MHz are limited by the AGP spec, how so? Do you mean the power available through AGP isn't enough? If that's the case, why not just add an external power source, using one of the many fan headers on motherboards?

IMHO the AGP spec for normal desktops limits the GPU in many ways:
- AGP bandwidth
- AGP power
- AGP space/position for the card and cooling solutions.

Specifically about fan headers, they may not supply enough power, but I really don't know.

My guess is there are other problems too, like the shorter design cycle time for GPUs, the fabrication process, etc...

What I dream of is some kind of low-cost XBox/PC hybrid, with a powerful CPU/GPU/UMA and a lifetime of 3 years. 8)
 
There's no reason a GPU has to contain 100 million+ transistors - if it could run at 3GHz, then (in theory) it could get the same amount of work done with 1/10 as many transistors by reducing pipe counts etc., so reducing cost (but by less than you might think; faster operation costs more in silicon area because you need to separate transistors by more and shield signal connections more, and then there are the onboard caches which probably need to stay the same size, and maybe you need more pipe stages to handle the higher speed...)

Which would you rather have; four pixels/clock at 300MHz or one pixel/clock at 3GHz?
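Putting illustrative numbers on that (nothing here is a real chip's spec): peak fill rate is just pipes x clock, and matching a single 3GHz pipe at 300MHz takes about ten pipes, which is where the "same work with 1/10 as many transistors" intuition comes from (ignoring the caches and control logic mentioned above).

/* Illustrative fill-rate arithmetic only: peak pixels/s = pipes * clock. */
#include <stdio.h>

int main(void)
{
    double a = 4 * 300e6;   /* four pixels/clock at 300 MHz */
    double b = 1 * 3e9;     /* one pixel/clock at 3 GHz     */

    printf("4 pipes @ 300 MHz: %.1f Gpixels/s\n", a / 1e9);
    printf("1 pipe  @ 3 GHz:   %.1f Gpixels/s\n", b / 1e9);

    /* pipes needed at 300 MHz to match the 3 GHz part's raw throughput */
    printf("pipes needed at 300 MHz to match: %.0f\n", b / 300e6);
    return 0;
}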

In the CPU market those cost reductions are everything because the chips have been using hand-layouts rather than library cells since the start, while the GPU market doesn't have the same issue, as everyone continues to be equally 'lazy' and use the libraries.

Given the required time-to-market for a GPU, that's pretty much essential. If you have to go from nothing to a product in 18-24 months you can't go full custom on anything that complex. Intel / AMD have an extra year or two of design cycle.

Also, GPUs have one big advantage; they can go parallel really easily. Going from 1 pixel/clock to 2, 4, 8 etc. is easy because the adjacent pixels are in a 'similar' state. In contrast, on the x86 architecture you have to sweat blood to get over 1 instruction/clock. That is the whole point of the Itanium-type VLIW architecture - using VLIW, large register windows and clever compilers they (are supposed to) easily achieve 3-10 'instructions'/clock.
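A toy C sketch of the difference (the function names and the 'shading' op are made up): the pixel loop has no dependency between iterations, so you can replicate the pipe and split the loop across units; the second loop carries a serial chain, so extra hardware can't push it much past one result per clock.

#include <stdint.h>
#include <stddef.h>

/* Embarrassingly parallel: pixel i depends only on input i, so iterations
 * can run side by side on as many pipes as you care to build. */
void shade_pixels(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] ^ 0x00FFFFFF;    /* stand-in for per-pixel shading */
}

/* Serial: every step needs the previous result before it can start,
 * so the loop-carried dependency caps throughput near 1 result/clock. */
uint32_t chained(const uint32_t *src, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = (acc * 33) + src[i];
    return acc;
}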
 
I don't think that CPUs are really aiming for "minimal" transistor budgets for MHz reasons - just that throwing more transistors at a CPU has less of an effect on performance than on price. I am under the impression that Intel/AMD/IBM process tech, as well as their R&D budgets and # of engineers, blow away the corresponding TSMC/UMC/NVidia/ATI figures.

Most CPUs (Cyrix, P3, Alpha, Power4) are hovering around or somewhat over the 1GHz mark, with the K7 and P4 cores being the only ones significantly over that. These cores have been in development and undergone tweaking for a long time, are produced on state-of-the-art processes, with entire fabs dedicated to making sure that one particular core has good yields...

I don't think GPUs are going to hit 1GHz for a while, honestly...
 
psurge said:
I am under the impression that Intel/AMD/IBM process tech, as well as their R&D budgets and # of engineers, blow away the corresponding TSMC/UMC/NVidia/ATI figures.
I'd say that is a safe bet. Intel increased its R&D budget this past year, and will spend a bit over 12 billion dollars for this and next year, primarily for process tech.
 
Dio said:
Which would you rather have; four pixels/clock at 300MHz or one pixel/clock at 3GHz?

But it's easier to design hardware to be more parallel and operate at a lower clock speed. Part of the reason is that there are fewer experts in physical design and it seems to be easier to develop front-end design talent. Graphics companies are improving in this area, though. For example, ATI has said they achieved a 325 MHz clock rate with the R300 by doing a CPU-like layout process.
 
3dcgi said:
But it's easier to design hardware to be more parallel and operate at a lower clock speed. Part of the reason is that there are fewer experts in physical design and it seems to be easier to develop front-end design talent. Graphics companies are improving in this area, though. For example, ATI has said they achieved a 325 MHz clock rate with the R300 by doing a CPU-like layout process.

That's a very good point. Doing a 'CPU-like' layout requires the hardware's logic representation (the abstract 'gates' view) to be known well in advance. For example, if I wanted to write an assembly-optimized driver, ideally I'd want to start with a *working* C-language implementation. I'm impressed ATI pulled off what they did, because that accomplishment implies the R300's front-end hardware design database was 'frozen' far in advance of the actual tape-out date. (I.e., ATI's engineers had a stable database 'snapshot' to begin hand-tuning.)
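To make the analogy concrete, here's the kind of 'frozen reference' I mean (a made-up example, not anything from a real driver): a plain C routine that the hand-optimized assembly - or, in ATI's case, the hand-tuned layout - would be verified against bit-for-bit.

#include <stdint.h>

/* Hypothetical frozen reference: a plain C alpha-blend for one channel.
 * The hand-tuned version must match this output exactly, so this code
 * has to stop changing before the tuning work starts. */
uint8_t blend_channel_ref(uint8_t src, uint8_t dst, uint8_t alpha)
{
    /* dst' = (src*alpha + dst*(255 - alpha)) / 255, rounded down */
    return (uint8_t)((src * alpha + dst * (255 - alpha)) / 255);
}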
 
3dcgi said:
Dio said:
Which would you rather have; four pixels/clock at 300MHz or one pixel/clock at 3GHz?

But it's easier to design hardware to be more parallel and operate at a lower clock speed.

I believe that was the point of the rest of what I said. :D

Dio said:
Also, GPU's have one big advantage; they can go parallel really easily. Going from 1 pixel/clock to 2, 4, 8 etc. is easy because the adjacent pixels are in a 'similar' state.
 