CPU vs. GPU: The Chip War


Techno+
09-Nov-2006, 18:29
hey guys look

http://www.mercurynews.com/mld/mercurynews/business/technology/15969900.htm

It is amazing how general purpose the G80 is, and I guess we will soon see more apps making use of this. I guess soon we will only need 2-4 CPU cores for general-purpose code, while the GPU takes over the rest of the applications. With CPUs becoming more parallel through the use of more cores, it will be interesting to see which chip wins the massively parallel apps war.

Let the war begin.

3dilettante
09-Nov-2006, 19:53
It is amazing how general purpose the G80 is, and I guess we will soon see more apps making use of this. I guess soon we will only need 2-4 CPU cores for general-purpose code, while the GPU takes over the rest of the applications. With CPUs becoming more parallel through the use of more cores, it will be interesting to see which chip wins the massively parallel apps war.

Let the war begin.

Okay, I'll bite. General purpose compared to what? Can I have your definition of what that term means?

How does one type of chip "win"?

Arun
09-Nov-2006, 20:06
How does one type of chip "win"?

By significantly reducing the ASPs (or volumes) of the other type of chip in the long term? :) I know that's a very arguable definition, but if you think about it, if you got a cluster of GPUs to run CUDA-based stuff instead of a cluster of CPUs running an equivalent algorithm, you'd basically have NVIDIA (or ATI) winning volume and selling relatively high-ASP products, while Intel/AMD would lose volume in very high-ASP markets.

I don't think we're quite at that stage yet, though. And calling it general purpose is still a massive overstatement, but it's getting there as a very usable, very fast piece of silicon for more and more workloads, definitely.


Uttar

3dilettante
09-Nov-2006, 20:29
But there's some fuzzy line somewhere, that if crossed, would make that cluster of GPUs a cluster of stream processors or CPUs that just happens to be good at graphics.

If we stripped away the interpolators, the perspective correction MULs, the special function hardware, and the filtering capabilities to make room for execution units that weren't application specific, would the G80 really be a GPU?

In such a case, neither the CPU nor the GPU wins, just a chip that for historical reasons has an inaccurate name.

pascal
09-Nov-2006, 20:43
How does one type of chip "win"?

The chip with the highest Dot Product of Applications x Capabilities :smile2:

Evotistic
09-Nov-2006, 21:50
Neither type of chip can truly "win". There's a fundamental rift between high-latency-tolerant parallel data processing and no-latency-tolerance CPU-style single-thread processing. Architectures simply can't be good at both, so there will always be a balance of compute load between the two.

Massively parallel architectures will never be able to run your OS, for example, or handle input events, networking stacks, etc. On the other hand, GPUs will become a big help for high-compute-load tasks such as video and image filters, audio processing, database search, sorting, simulations, and financial and scientific apps. Most people don't run those apps, however, and so most people will stick with GPUs for "media processing" and CPUs for "system processing." Of course, that's not to say GPUs aren't going to play a much bigger role in the future; that will definitely happen. However, CPUs will never really be "commoditized" tech.

3dilettante
09-Nov-2006, 21:52
The chip with the highest Dot Product of Applications x Capabilities :smile2:

But first we'd need to multiply the Applications vector by a Significance Matrix, in order to take into account how widespread an application is and whether it is really that important. (There are a whole lot of copies of Minesweeper out there.)

Then the Capabilities vector would need to be transformed by a matrix constructed with respect to axes of time, cost, power, versatility, and usability.

Then the results for each chip would need to be run through some kind of Apples/Oranges orthogonalization.

Then we'd have a number that tells us that it corresponds to the output of some string of calculations.

;)

archie4oz
09-Nov-2006, 22:09
It will be interesting to see which chip wins, with CPUs becoming more parallel through the use of more cores, we will see who wins the massively parallel apps war.

Let the war begin.

Who says there's going to be a winner or even a war... Both are just racing towards each other and simply fulfilling Sutherland's wheel of reincarnation...

Killer-Kris
10-Nov-2006, 00:37
Neither type of chip can truly "win". There's a fundamental rift between high-latency-tolerant parallel data processing and no-latency-tolerance CPU-style single-thread processing. Architectures simply can't be good at both, so there will always be a balance of compute load between the two.

I couldn't agree more!

Sadly, everything I hear is that traditional ILP-focused CPUs are going to be seriously neglected and that G80 is a glimpse into the future of CPUs as well.


Massively parallel architectures will never be able to run your OS, for example, or handle input events, networking stacks, etc.

So Niagara/Niagara2 can't run the OS, or do any of the other things you listed? :grin:

pascal
10-Nov-2006, 02:28
But first we'd need to multiply the Applications vector by a Significance Matrix, in order to take into account how widespread an application is and whether it is really that important. (There are a whole lot of copies of Minesweeper out there.)

Then the Capabilities vector would need to be transformed by a matrix constructed with respect to axes of time, cost, power, versatility, and usability.

Then the results for each chip would need to be run through some kind of Apples/Oranges orthogonalization.

Then we'd have a number that tells us that it corresponds to the output of some string of calculations.

;)

Why not simplify?
A dot product of application significance (weight) x application performance.
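
To make the simplified score concrete, here is a minimal Python sketch of a dot product of application weights and per-application performance. The function name, the application list, and every number are made-up placeholders for illustration, not measurements of any real chip.

```python
# Hypothetical illustration: all names and numbers below are placeholders.

def weighted_score(significance, performance):
    """Dot product of per-application weights and per-application performance."""
    if significance.keys() != performance.keys():
        raise ValueError("both tables must cover the same applications")
    return sum(significance[app] * performance[app] for app in significance)


# Made-up weights expressing how much each application matters (they sum to 1).
significance = {"office": 0.5, "games": 0.3, "simulation": 0.2}

# Made-up relative performance of two imaginary chips (higher is better).
chip_a = {"office": 1.0, "games": 1.0, "simulation": 1.0}
chip_b = {"office": 0.5, "games": 2.0, "simulation": 4.0}

if __name__ == "__main__":
    print("chip A:", weighted_score(significance, chip_a))
    print("chip B:", weighted_score(significance, chip_b))
```

The ranking depends entirely on the chosen weights, which is the point of the objection raised earlier in the thread: the arithmetic is trivial, and agreeing on the significance vector is the hard part.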

My guess is that all this market movement fuss is just an adjustment/balance of the ratio of MFLOPS/MIPS for general purpose CPUs, which should be increased by a factor of ~8 (with a corresponding bandwidth/latency adjustment), plus multicore in the coming years. Amdahl's Law will take care of the rest :smile2:
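
For reference, Amdahl's Law says that if a fraction p of a job can be parallelized across n units, the overall speedup is 1 / ((1 - p) + p / n). A minimal Python sketch follows; the function name and the sample fractions are illustrative choices, not figures from the thread.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Overall speedup when `parallel_fraction` of the work scales across `n_units`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    # Illustrative fractions only: the serial part caps the gain
    # no matter how many parallel units are added.
    for p in (0.5, 0.9, 0.99):
        for n in (8, 128):
            print(f"p={p}, n={n}: speedup {amdahl_speedup(p, n):.2f}x")
```

Because the serial fraction bounds the speedup at 1 / (1 - p), adding parallel units helps only as far as the workload itself is parallel.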

Techno+
10-Nov-2006, 07:09
Okay, I'll bite. General purpose compared to what? Can I have your definition of what that term means?

How does one type of chip "win"?

Sorry, I made a mistake: general purpose in terms of running massively parallel jobs, like physics, scientific apps, etc.

nutball
10-Nov-2006, 08:47
Sorry, I made a mistake: general purpose in terms of running massively parallel jobs, like physics, scientific apps, etc.

Google for vector processor, Fujitsu, Cray, 1970s... and Clearspeed, FPGAs, Torrenza, Project Ultraviolet, ... there are some lessons to be learned from the past before making wild predictions about the future.

GPUs are one of a number of technologies coming along which are without doubt very attractive for use in scientific computation if a) you turn your brain off and presume that what the PR says is the peak GFLOPS is guaranteed to be what it's going to do for you on your code (problem), and if b) the "low, low cost" doesn't include paying the programmer to re-write your code.

The "winner" in the scientific computing market will be determined basically by how easy it is to extract significant performance gains on current codes, and on the size of those gains in real-world use. It's not at all clear which way the market is going right now.

As of the last generation, GPUs were missing some pretty major pieces of functionality required for implementing many of the currently favoured algorithms in my line of work (specifically: performant random writes). That notwithstanding, making use of GPUs would require some pretty major re-architecting of codes which are ~100k+ lines of FORTRAN and which have been developed and proven over 10+ years. Not a change to be undertaken lightly.
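
For readers unfamiliar with the term, "random writes" is usually called scatter: each work item chooses its own output address. The opposite pattern, gather, has each work item choose where it reads from. The sketch below is sequential Python that shows only the two access patterns, assuming that is what is meant here; the function names are hypothetical and it says nothing about how any particular GPU implements either one.

```python
def gather(src, indices):
    """Random reads: out[i] = src[indices[i]]."""
    return [src[j] for j in indices]


def scatter_add(size, indices, values):
    """Random writes: out[indices[i]] += values[i]."""
    out = [0] * size
    for j, v in zip(indices, values):
        # Two items may target the same slot, which is what makes
        # this pattern awkward to run in parallel.
        out[j] += v
    return out


if __name__ == "__main__":
    print(gather([10, 20, 30], [2, 0, 0]))
    print(scatter_add(3, [2, 0, 2], [1, 5, 7]))
```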

Arwin
10-Nov-2006, 09:01
Regardless of its long-term implications, how much of this trend is down to the low bandwidth between the CPU and the GPU in current PC configurations? I think a lot of it. For anything that involves a lot of processing that is closely related to graphics, the bandwidth between the CPU and GPU is too limiting. So it makes sense that they move more and more of the functionality onto the GPU board, which in effect is similar to what AMD looks to be doing with ATI, which is to bring the CPU and GPU close enough (maybe even on one chip) to work together efficiently, and in a more flexible manner.
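
The bandwidth argument can be put as a rough break-even check: offloading pays off only if the time to move the data plus the GPU compute time is less than the CPU compute time. The Python sketch below is a deliberately crude model; the function names, parameters, and sample numbers are hypothetical, and it ignores latency and any overlap of transfer with compute.

```python
def offload_seconds(bytes_moved, bus_bytes_per_second, gpu_seconds):
    """Total time for an offloaded job: bus transfer plus GPU compute."""
    if bus_bytes_per_second <= 0:
        raise ValueError("bus_bytes_per_second must be positive")
    return bytes_moved / bus_bytes_per_second + gpu_seconds


def offload_pays_off(bytes_moved, bus_bytes_per_second, cpu_seconds, gpu_seconds):
    """True if transfer + GPU compute beats doing the work on the CPU."""
    return offload_seconds(bytes_moved, bus_bytes_per_second, gpu_seconds) < cpu_seconds


if __name__ == "__main__":
    # Placeholder numbers, not measurements: 1 GB moved over a 1 GB/s link.
    print(offload_pays_off(1e9, 1e9, cpu_seconds=2.0, gpu_seconds=0.5))
    print(offload_pays_off(1e9, 1e9, cpu_seconds=1.2, gpu_seconds=0.5))
```

In this model, a faster link between the two chips shrinks the transfer term, which widens the set of jobs worth offloading.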

So in my view, this is not a war, but an impending merger of technologies, exemplified by the company merger of AMD and ATI.

nAo
10-Nov-2006, 09:29
Why is everyone looking at the hw architecture when we should be looking at the sw architecture? It's the programming model that dictates how the hw should work, not the other way around.
That's why GPUs are going to lead the revolution..

nutball
10-Nov-2006, 11:26
Well, this is true: programmers shouldn't have to care, or want to care, whether the FP accelerator is plugged in to the PCI-E bus, the Hypertransport link, or is co-resident on silicon with the CPU.

The slight worry being that we sleep-walk into a situation where the software solutions which have emerged from the development of GPUs become the de facto solution for time immemorial, and become impossible to shift.

It would be very nice if, from the programmers' point of view, programming a GPU was very similar to programming a shared-memory cluster of CPUs, which was similar to programming a Clearspeed, ... and so on. That way the high-level application can remain as agnostic as possible of the underlying hardware giving its performance a kick in the pants. Superficially one can already achieve this, e.g. if your code is based heavily around FFTs or BLAS, as both Clearspeed and CUDA purport to support these (I don't know from personal experience to what extent this is true ... can anybody give first-hand info?). But if your code doesn't use these libraries, it seems to me you're SOL if you want to write code that is in any way future-proof.

I doubt this is going to happen though.
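
The hardware-agnostic layering described above might look like the following Python sketch: application code is written against a small interface, and each piece of hardware supplies its own implementation. `Backend`, `PurePythonBackend`, and `squared` are hypothetical names invented for this example. They do not correspond to any real CUDA, Clearspeed, or BLAS API, and only a plain CPU reference backend is shown.

```python
from typing import List, Protocol

Matrix = List[List[float]]


class Backend(Protocol):
    """Hypothetical interface an accelerator library would implement."""

    def matmul(self, a: Matrix, b: Matrix) -> Matrix: ...


class PurePythonBackend:
    """Reference backend that runs on the CPU with no acceleration."""

    def matmul(self, a: Matrix, b: Matrix) -> Matrix:
        if any(len(row) != len(b) for row in a):
            raise ValueError("inner dimensions do not match")
        columns = list(zip(*b))
        return [[sum(x * y for x, y in zip(row, col)) for col in columns]
                for row in a]


def squared(backend: Backend, m: Matrix) -> Matrix:
    """Application code: depends only on the interface, not the hardware."""
    return backend.matmul(m, m)


if __name__ == "__main__":
    print(squared(PurePythonBackend(), [[1.0, 2.0], [3.0, 4.0]]))
```

Swapping in an accelerated backend would leave `squared` untouched, which is the portability that library-based codes get and everything else has to build for itself.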