Follow-up: GPUs as general processors

Following up on the "Photoshop Filters on GPU" topic, the consensus is that this is feasible, and with PS 3.0 it would be a pretty good idea too. This would mean the GPU pretty much makes an excellent media processor: for anything SIMD-based and floating-point heavy, the GPU should have a huge advantage over a CPU.
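To put 'SIMD-based and floating-point heavy' in concrete terms, here's a rough CPU-side sketch (the function name is made up, not from any real API) of what a typical Photoshop-style point filter boils down to - the same few float operations on every pixel, with no pixel depending on any other, which is exactly the shape of work a many-pipeline GPU is built for:

[code]
/* Hypothetical sketch: a brightness/contrast point filter in plain C.
 * Every pixel component gets the same short float computation and no
 * pixel depends on its neighbours, so the loop body could run on as
 * many pixel pipelines in parallel as the hardware provides. */
#include <stddef.h>

void adjust_image(float *rgb, size_t num_components,
                  float contrast, float brightness)
{
    for (size_t i = 0; i < num_components; i++) {
        float v = rgb[i] * contrast + brightness;          /* same op everywhere */
        rgb[i] = v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);  /* clamp to [0,1]     */
    }
}
[/code]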

Does this really show that NVIDIA's vision (make CPUs redundant) of a GPU + network processor is a better PC model than CPU + GPU + etc.?

Currently, the GPU is said to be 'Turing complete.' That is, by the computer science definition, it can 'compute' anything that current CPUs can. It may take a ridiculous number of cycles/passes for certain calculations, but it's possible. Once full branching is added and loop limits are removed, the GPU should be able to do _everything_.
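As a hedged illustration of what 'cycles/passes' means here (plain C standing in for the render-to-texture ping-ponging a real GPU would use; the function is my own sketch, not anyone's published code), even something as un-graphical as summing an array breaks down into a series of short, branch-free per-element passes:

[code]
/* Sketch only: each iteration of the while loop plays the role of one
 * rendering pass.  Within a pass, every "pixel" (array element) does the
 * same fixed work - a single pairwise add - and after log2(N) passes the
 * total ends up in element 0. */
#include <stdio.h>

float reduce_sum(float *data, int n)
{
    while (n > 1) {                      /* one iteration == one pass        */
        int half = (n + 1) / 2;
        for (int i = 0; i < half; i++) { /* per-"pixel" work: one add        */
            float upper = (i + half < n) ? data[i + half] : 0.0f;
            data[i] = data[i] + upper;   /* written to the "output texture"  */
        }
        n = half;
    }
    return data[0];
}

int main(void)
{
    float v[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    printf("sum = %g\n", reduce_sum(v, 8));   /* prints 36 */
    return 0;
}
[/code]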

I'd imagine that an NV40+/R400+ would have the raw horsepower to run Windows XP plus applications. It most likely won't happen, but it should be possible. What kind of case does this present for the GPU as a general processor?

At the current rate of convergence, Intel is really losing the battle in trying to make the GPU obsolete. The problem is that the demand for more processing power in current PCs comes from media-intensive tasks, and in this area CPUs are improving slowly compared to GPUs. Since GPUs are already at the 500MHz level, soon they will be able to DO everything the CPU can, albeit not as fast. But when the CPU tries to do everything the GPU can, it falls short not by a little - it's way below real time. Put more directly, a GPU could compute current CPU tasks at slow but 'acceptable' speeds, while a CPU is years away from doing GPU tasks (Doom 3) in real time.

This appears to me to be a compelling case for GPUs eventually making the PC less CPU-centric, since there's very little 'GENERAL' computing anyway if you look at it carefully. Even office applications (e.g. Excel) are data-centric. GPU parallelism should shine in every aspect. I can think of very few things that a CPU could do better than a well-tuned, GPU-compiled program in the next generation.

Some misc. data that may interest you:
- A group is currently building a farm of 256 GeForce FXs to create a supercomputer for floating-point calculations.
- NVIDIA is working with the SETI@home group on many projects that will tap into the distributed power of desktop GPUs (presumably NV30+) for data analysis. They expect the total computational power of the installed base of NV30+ chips to eventually exceed that of CPUs.
 
I think there are some practical issues that stop this:
  • The GPU would be good for SIMD operations but only certain tasks fall into that category.
  • The branching/decision making on modern CPUs is very fast - I suspect that the branching in, say, the PS only looks fast because it can work on another pixel while the branch is performed.
  • How big can the programs be in the PS? I don't imagine they could be megabytes in size, so it'd only be "Turing complete" up to a smallish program size.
 
I think the GPU is only suitable for highly parallel jobs, and it should be designed for those. It is unwise to put a sophisticated branch predictor on a GPU.
 
Well, the main reason Intel isn't just selling massively parallel vector processors now is that vector processing doesn't really fit 'casual' computing tasks, for which total flexibility is the key.

The math capabilities of the video card are phenomenal - but they achieve this by being massively parallel and deeply pipelined.

Some people criticise the P4 because it's got a 20-cycle pipeline and therefore when it has to flush, it takes a long time to do so. Now think about how long it takes between the 'draw this triangle' command, and the last pixel ending up on the screen....

I always remember Jeff Minter calling the 68000 on the Jaguar a 'glorified joystick processor'. Of course, on a console, the specialised chips are the first-class travellers...

But generally I think both CPU and video card have their roles. There will be more handoff to the card, but the CPU is better suited for some tasks.
 
Does this really show that NVIDIA's vision (make CPUs redundant) of a GPU + network processor is a better PC model than CPU + GPU + etc.?
Huh? Nvidia's vision?????

Microsoft is the one that pushed DX9 all the way out to include PS 3.0, with input from several companies. PS 2.0 and 2.0+ are not that much different.

There are a lot more apps out there to be run on a PC than just graphics design programs and games.
- A group is currently building a farm of 256 GeForce FXs to create a supercomputer for floating-point calculations.
- NVIDIA is working with the SETI@home group on many projects that will tap into the distributed power of desktop GPUs (presumably NV30+) for data analysis. They expect the total computational power of the installed base of NV30+ chips to eventually exceed that of CPUs.
PR mumbo jumbo.
 
Simon:
The NV30 is said to store PS programs in video memory. If that's the case, programs can be pretty big - up to 128MB. Also, when the next-gen OS arrives, it is likely that virtualised memory will be supported for everything handed to the GPU. That is, the GPU could pull triangle meshes, texture data etc. from main memory - not just virtual textures as Carmack is wanting. When (if) this happens, there should be no limits on program size for GPUs.

Dio and pcchen
Yes, the CPU is much faster for casual computing, but how much faster do you want your 'casual' computing to go? The key to building a faster system is accelerating the worst bottleneck, not making the fast bits even faster. I don't see any advantage in running Winword or IE on a 5GHz Prescott CPU. But I do see how GPUs could accelerate all those interesting tasks like MP3, DivX, DVD, SETI etc.

Put another way: let's suppose there's an NV50 with full branching and unlimited loops. It also has a basic branch predictor, and virtualised memory is implemented by the OS and GPU. Suppose it's clocked around 1GHz. Couple this with a low-end CPU that hands it all the tasks to do. Would this not make a much more compelling machine for the consumer than a 6GHz Pentium 6 with an integrated 'EXTREME' graphics chip?

Does the consumer care more about Winword starting in 0.01 seconds instead of 1 second, or about sight and sound being many times more engrossing?
 
JF_Aidan_Pryde said:
Does this really show that NVIDIA's vision (make CPUs redundant) of a GPU + network processor is a better PC model than CPU + GPU + etc.?
nVidia's vision has never been to make the CPU redundant. Video chips are, and will always be, processors dedicated to graphics rendering; they are poorly suited to most other tasks. Moving tasks such as T&L onto the video card, and making video cards more flexible, just allows games to offload more and more work onto the video card, freeing the CPU for much more complex calculations.
 
Chalnoth:
Are we talking about the same NVIDIA corp here? The one that claimed that in 10 years they will be bigger than Intel? With a line such as "The only way the CPU can beat a GPU is by becoming a GPU," I hardly think their goal is anything less than making the CPU redundant.

A cool model would be for the CPU to throw all SIMD calculations at the GPU and handle the housekeeping for whatever is left. With PCs becoming media machines in the future, it seems the GPU will take more or less all the heavy lifting away from the CPU.
 
JF_Aidan_Pryde said:
Simon:
The NV30 is said to store PS programs in video memory. If that's the case, programs can be pretty big - up to 128MB.
But DX9 does specify limits. Besides, even on the NV30 there may be performance limitations on large programs. <shrug>

I think it's more a case of "using the right tool for the job".
 
JF_Aidan_Pryde said:
Chalnoth:
Are we talking about the same NVIDIA corp here? The one that claimed that in 10 years they will be bigger than Intel? With a line such as "The only way the CPU can beat a GPU is by becoming a GPU," I hardly think their goal is anything less than making the CPU redundant.

I think that what Nvidia was talking about was "for graphics calculations, the only way for a CPU to beat a GPU is to become a GPU". Turn that around and you'll get "for typical CPU calculations, the only way for a GPU to beat a CPU is to become a CPU" :)
 
The main limitations of pixel shaders that currently prevent them from being usable as general-purpose processors are:
  • Data-dependent branching (will make its debut in PS 3.0; my guess is that the architectures that implement this - NV40, R400? - won't do branch prediction and will just hide the branch latency by swapping execution between active pixels; a toy model of this appears at the end of this post)
  • Recursive function calls (useful for raytracing; requires a per-pixel stack in addition to just registers)
  • Limited program length (which already for NV30 seems to be an API artifact rather than a true hardware limitation)
  • Memory writes (difficult: poses interesting coherency problems)
Combining antialiasing with the points above will be ... interesting.
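To show what that pixel-swapping idea could look like, here's a toy scheduling model in plain C (entirely hypothetical numbers and logic, not any vendor's design): one issue slot per tick, and every fourth instruction a pixel issues is a 'branch' that leaves that pixel waiting a few ticks. With one pixel in flight the slot stalls; with several, another pixel can almost always be swapped in, so the latency is hidden without any prediction:

[code]
#include <stdio.h>

#define LATENCY 3      /* ticks a pixel waits after issuing a "branch"   */
#define TICKS   100    /* length of the simulation                       */

static int simulate(int pixels_in_flight)
{
    int ready_at[16] = { 0 };   /* first tick each pixel may issue again  */
    int issued[16]   = { 0 };   /* instructions issued per pixel          */
    int total        = 0;

    for (int tick = 0; tick < TICKS; tick++) {
        for (int p = 0; p < pixels_in_flight; p++) {
            if (tick < ready_at[p])
                continue;                        /* waiting on its branch   */
            issued[p]++;
            total++;
            if (issued[p] % 4 == 0)              /* that one was a branch   */
                ready_at[p] = tick + 1 + LATENCY;
            break;                               /* one issue slot per tick */
        }
    }
    return total;
}

int main(void)
{
    printf("1 pixel in flight : %d instructions in %d ticks\n", simulate(1), TICKS);
    printf("4 pixels in flight: %d instructions in %d ticks\n", simulate(4), TICKS);
    return 0;
}
[/code]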
 
Chalnoth said:
JF_Aidan_Pryde said:
Does this really show that NVIDIA's vision (make CPUs redundant) of a GPU + network processor is a better PC model than CPU + GPU + etc.?
nVidia's vision has never been to make the CPU redundant. Video chips are, and will always be, processors dedicated to graphics rendering; they are poorly suited to most other tasks. Moving tasks such as T&L onto the video card, and making video cards more flexible, just allows games to offload more and more work onto the video card, freeing the CPU for much more complex calculations.
According to a Wired magazine article, Nvidia wants to overthrow the CPU. Here is the title of the article.
Meet Nvidia CEO Jen-Hsun Huang, the man who plans to make the CPU obsolete.
http://www.wired.com/wired/archive/10.07/Nvidia.html
 
JF_Aidan_Pryde wrote:
Simon:
The NV30 is said to store PS programs in video memory. If that's the case, programs can be pretty big - up to 128MB.

The fact that PS code can be stored in video memory doesn't mean that it can execute large programs. The published info for the NV30 states that the largest a single shader program can be is 1024 instructions. Further to this, it supports no flow control instructions (static or dynamic) in the pixel shader (predicates only).
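Roughly, 'predicates only' means there is no jump at all: both sides of a conditional get evaluated for every pixel, and a predicate just selects which result is written. In plain C terms (an illustration of the idea, not actual shader assembly):

[code]
#include <stdio.h>

static float lit_result(float n_dot_l) { return n_dot_l * 0.9f + 0.1f; }
static float shadow_result(void)       { return 0.05f; }

/* With real flow control, only one side of the conditional runs. */
float shade_branching(float n_dot_l)
{
    if (n_dot_l > 0.0f)
        return lit_result(n_dot_l);
    return shadow_result();
}

/* "Predicates only": both sides always run; the predicate merely
 * selects which result is written to the output register. */
float shade_predicated(float n_dot_l)
{
    float a = lit_result(n_dot_l);              /* always executed */
    float b = shadow_result();                  /* always executed */
    float p = (n_dot_l > 0.0f) ? 1.0f : 0.0f;   /* predicate       */
    return p * a + (1.0f - p) * b;
}

int main(void)
{
    printf("%g %g\n", shade_branching(0.5f), shade_predicated(-0.5f));  /* 0.55 0.05 */
    return 0;
}
[/code]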
 
JohnH said:
JF_Aidan_Pryde wrote:
Simon:
The NV30 is said to store PS programs in video memory. If that's the case, programs can be pretty big - up to 128MB.

The fact that PS code can be stored in video memory doesn't mean that it can execute large programs. The published info for the NV30 states that the largest a single shader program can be is 1024 instructions. Further to this, it supports no flow control instructions (static or dynamic) in the pixel shader (predicates only).

Thanks for clearing that up. I'm not really putting the NV30 forward as the candidate, just future GPUs with these limits removed and enhancements made.
 
JohnH said:
The fact that PS code can be stored in video memory doesn't mean that it can execute large programs. The published info for the NV30 states that the largest a single shader program can be is 1024 instructions. Further to this, it supports no flow control instructions (static or dynamic) in the pixel shader (predicates only).
The Quadro FX does up to 2048 instructions, but we can only speculate whether that is a hardware limit or not.
 
Hellbinder[CE] said:
Does this really show that NVIDIA's vision (make CPUs redundant) of a GPU + network processor is a better PC model than CPU + GPU + etc.?
Huh? Nvidia's vision?????

Microsoft is the one that pushed DX9 all the way out to include PS 3.0, with input from several companies. PS 2.0 and 2.0+ are not that much different.

There are a lot more apps out there to be run on a PC than just graphics design programs and games.
- A group is currently building a farm of 256 GeForce FXs to create a supercomputer for floating-point calculations.
- NVIDIA is working with the SETI@home group on many projects that will tap into the distributed power of desktop GPUs (presumably NV30+) for data analysis. They expect the total computational power of the installed base of NV30+ chips to eventually exceed that of CPUs.
PR mumbo jumbo.

I think the topic starter might be referring to some comments that Nvidia's CEO made this past summer in Wired magazine. He all but admitted that they (Nvidia) were going to surpass Intel and make the CPU obsolete in the modern PC.

edit: Ok I'll read the whole thread before I post :)
 
JF_Aidan_Pryde said:
Dio and pcchen
Yes, the CPU is much faster for casual computing, but how much faster do you want your 'casual' computing to go?
Hence my 'glorified joystick processor' quote - at some point the CPU does virtually nothing and the custom chips do it all.

I was meaning that you should use the best chip for the job (if only for power consumption reasons).

I think CPUs (and more powerful ones) still have a place, but there are increasingly diminishing returns. Games can eat it up, but for desktop work....
 
3dcgi said:
According to a Wired magazine article, Nvidia wants to overthrow the CPU. Here is the title of the article.
Meet Nvidia CEO Jen-Hsun Huang, the man who plans to make the CPU obsolete.
http://www.wired.com/wired/archive/10.07/Nvidia.html

Might I suggest that you read the article before making any statements about it.

What they're saying is:

"The Xbox is how the computer will be built in the next 20 years. More semiconductor capacity will go to the user experience," he says. "The microprocessor will be dedicated to other things like artificial intelligence. That trend is helpful to us. It's a trend that's inevitable."
 
Bjorn said:
3dcgi said:
According to a Wired magazine article, Nvidia wants to overthrow the CPU. Here is the title of the article.
Meet Nvidia CEO Jen-Hsun Huang, the man who plans to make the CPU obsolete.
http://www.wired.com/wired/archive/10.07/Nvidia.html

Might I suggest that you read the article before making any statements about it.

What they're saying is:

"The Xbox is how the computer will be built in the next 20 years. More semiconductor capacity will go to the user experience," he says. "The microprocessor will be dedicated to other things like artificial intelligence. That trend is helpful to us. It's a trend that's inevitable."

I read the entire article at the time and I got the feeling that Jen-Hsun's ultimate goal is as the title states.
 
NVidia beating Intel? That's just PR talk. Just remember the problems NVidia had with TSMC's 0.13 process. As The Inquirer likes to say, 'real men have fabs' (quoting ex-AMD CEO Sanders). NVidia is fabless; it just designs the chips but doesn't produce them. It relies on a third party to produce them.

Intel, on the other hand, isn't just CPU and chip design: they have fabs and their own lithographic process, which currently seems to beat anyone's except perhaps IBM's (already ready for 0.09 with Prescott). And they spend a lot of money keeping ahead in lithographic technology and researching alternatives to current methods (much like IBM). There is no way NVidia can fight against that.

And about the GPU beating the CPU: it could very well be the other way around. What do you do when you have 1 billion transistors per chip? Cache? That seems to be the way of the future Itaniums (up to 12 MB L3 caches and half a billion transistors). But what happens when you have a 10 billion transistor budget, or 100 billion, or a trillion? If you have enough transistors to spend, you can put specialized hardware there, much like the separate FP coprocessor chip that was eventually moved onto the die. Early caches were also external SRAM chips. There will come a day when a chip is so 'large' in transistors that it will engulf almost everything in the computer.

GPUs are specialized hardware that rely on massive parallelism, deep pipelining and a large number of FP units. Even though it isn't oriented toward graphics, you could take a look at the Tarantula research on an on-die vector coprocessor for the now dead-before-birth EV8. The research papers show that it was an FP beast (the chip would also have had a large L2 cache). Maybe something along those lines, or adapted for graphics (I'm not sure a vector architecture fits well with something like triangle setup and rasterization, but it could be very good for shaders), could bring death to GPUs.

And maybe that's the reason Intel is staying outside the GPU business, just releasing chipsets with crap 3D support. Perhaps they think that someday technology will allow CPUs (general-purpose hardware with some limited special-purpose capabilities) to catch up with GPUs (specialized hardware), much the way that server and old vector CPU architectures got killed by commodity CPUs.

And adding branching to the PS or VS isn't anywhere near what a CPU does. A modern CPU has branch prediction, memory virtualization, complex and fast caches, out-of-order execution, multithreading, and a lot of other stuff (for example, cache coherency protocols for SMP) that makes it anything but easy to design. And CPUs are not designed using standard cells or other 'automatic' methodologies, but with close to single-transistor hand tuning (Alpha seemed to do that). Maybe that explains why Intel and other companies tend to just reuse existing designs, adding larger caches or dual cores (POWER4): designing fast new CPUs is that much harder.
 