What is the difference between a Stream Processor and a GPGPU?


geekcomputing
26-Jan-2007, 01:24
Could anyone tell me the difference between a Stream Processor and a GPGPU in detail?

Thank you,

Anarchist4000
26-Jan-2007, 02:31
I think you're mixing those terms up a bit. GPGPU is a style of programming where you try to exploit massively parallel processors (GPUs) for general-purpose work. A GPU is one form of stream processor. For instance, the "Stream Processor" AMD has listed on their site is simply an X1900 XT with additional RAM.

geekcomputing
26-Jan-2007, 02:57
Ah, OK.

So what is the definition of a stream processor? Can you explain this subject more, please?

Thank you,

mhouston
26-Jan-2007, 06:39
GPUs can act like stream processors, but they can also go beyond a pure stream processor. The best working definition I can give is that a streaming processor must name all the elements it will access for each kernel invocation, and it executes the same kernel across a stream of elements, often in a data-parallel fashion (although a streaming processor can be serial). Others in the field might give you a slightly different variation, and if you ask a DSP guy, you will likely get yet another definition. Wikipedia has a version of this definition under Stream processing (http://en.wikipedia.org/wiki/Stream_processing) and GPGPU (http://en.wikipedia.org/wiki/GPGPU). Other people, like the DSP folks, view 'stream processing' as SIMD (much like a variant of vector processing, but without the 1-in, 1-out restriction), whereas I tend to take the more relaxed view that it is SPMD.

When I say 'name elements' I mean that all the data can be localized before each kernel invocation, or that the data can be prefetched in time for execution. Where GPUs differ is that during execution on an element you can calculate an address to gather from, i.e. they can change what they access based on the data they are processing. Where things get really blurry is that for GPUs to run at all, you already have to localize the data into the GPU's memory or into system-addressable memory.
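
A minimal sketch of that distinction in plain Python, not tied to any real GPU API. The names `run_stream_kernel`, `scale`, and `run_gather_kernel` are hypothetical and exist only for illustration. In the first case the kernel sees nothing but the element it is handed, so every input is named before the kernel runs; in the second, the address that gets read is computed from the data being processed.

```python
# Illustrative only: plain Python standing in for hardware behavior.

def run_stream_kernel(kernel, stream):
    """Pure streaming: the kernel is handed each element and nothing else."""
    return [kernel(x) for x in stream]

def scale(x):
    # Touches only its own input element.
    return 2.0 * x

def run_gather_kernel(indices, table):
    """GPU-style gather: the location read depends on the data being processed."""
    out = []
    for i in indices:
        addr = i % len(table)   # address computed inside the "kernel"
        out.append(table[addr])
    return out

streamed = run_stream_kernel(scale, [1.0, 2.0, 3.0])
gathered = run_gather_kernel([4, 0, 7], [10, 20, 30])
```

In the streaming case a scheduler could prefetch every input ahead of time; in the gather case it cannot know which entries of `table` will be needed until the kernel has looked at its data.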

Geo
26-Jan-2007, 22:08
geekcomputing wrote: "ah ok, so what is the definition of a stream processor? Can you explain this subject more please."

Is what you're really asking "why the hell does Nvidia talk about 'stream processors' when discussing G80?"

Arun
27-Jan-2007, 08:00
And in answer to that (all AFAIK): the first reason is that it's good marketing. The second is that it partially conveys the idea of a scalar programming model to the mass market. And the third is that CUDA kinda exposes the hardware in such a way.


Uttar

Geo
27-Jan-2007, 21:51
Well, I was going to wait for his answer first. :) But I think a goodly bit of it is to convey the discontinuity between what came before and where we are now. We spent the better part of the last two years arguing over "what's a pipeline these days?"; by talking about "stream processors" instead they are, IMHO, in part making an effort to abandon that old language debate as useless and move on.

geekcomputing
29-Jan-2007, 02:59
I appreciate everyone trying to answer, but I think some of the answers are above my head, as I am not a programmer.

I'm still a bit confused about what a stream processor is.

I have a good grasp of how a CPU works in A+ tech terms.
I have a good grasp of how pre-GeForce FX cards worked (pipelines etc.).

I understand that the separate vertex and pixel shaders are gone and have been replaced by unified shaders, but I'm just very confused about stream processors. Is that simply a fancy marketing term for unified shaders in a GPU, or is a stream processor a real thing that GPUs are trying to evolve into or emulate?

(So confused. Sorry if this is dumb, but you have to ask dumb questions to learn when you do not know.)

Anarchist4000
29-Jan-2007, 03:54
A stream processor is just a "CPU", if you will, that receives a packet containing an action and something to perform that action on. It does whatever it's told and kicks the result out the back side. It's a fairly broad term, and more a style of processing than a specific device. Generally a stream processor will only be working on one task/program at a time, whereas a CPU could potentially be working on every process running on a computer.

The way Nvidia is using it is more for marketing, but it's still a fairly accurate description of how they perform some actions at times. For instance, a true stream processor shouldn't have to look up or fetch any data. GPUs, on the other hand, typically rely on texture lookups to get certain values. If you were using the GPU to accelerate physics or perform a GPGPU-style operation, it might not need the lookup and would then fit within the definition of a true stream processor.

geekcomputing
29-Jan-2007, 15:29
Ah, I think I have it now. Thanks to everyone for the info.

It's appreciated.

JeffK
24-Feb-2007, 07:02
mhouston wrote: "When I say 'name elements' I mean that all the data can be localized before each kernel invocation, or that the data can be prefetched in time for execution. Where GPUs differ is that during execution on an element you can calculate an address to gather from, i.e. they can change what they access based on the data they are processing. Where things get really blurry is that for GPUs to run at all, you already have to localize the data into the GPU's memory or into system-addressable memory."

So taking the Imagine processor as a typical stream processor, to bridge the gap in making this processor possess basic 'GPU shader functionality', I would need to add:

a) kernel instructions to load from a data dependent computed address
b) a large shared memory store (shared between SIMD units) where this data can be read from in a random access manner

It seems adding this would really throw a wrench into the performance of a system like the Imagine processor. I guess GPUs combat this by actively working on a larger pool of stream records? For instance, if I wanted to extend the Imagine hardware to support this kind of functionality and still get good performance, I would need to grow the local register files to hold more stream record contexts, then swap a context out on a random-access memory load and work on ALU instructions from other stream records?

Are there any other tricks GPUs use to tackle this memory performance problem? GPUs seem to be extremely good at hiding memory latency.
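
The context-swapping idea can be sketched with a toy cycle-by-cycle model in Python. This is a rough illustration under stated assumptions, not a model of Imagine or of any real GPU: there is a single ALU, each context alternates a fixed burst of ALU work with one load, and the function name `alu_utilization` and all the numbers are hypothetical.

```python
def alu_utilization(num_contexts, alu_cycles, mem_latency, rounds):
    """Fraction of cycles in which the single ALU issues an instruction.

    Each context repeats `rounds` times: `alu_cycles` of ALU work, then one
    load that takes `mem_latency` cycles to return. Time stops when the last
    ALU instruction has issued, so the final load's latency is not counted.
    """
    ready_at = [0] * num_contexts           # cycle at which a context may run again
    work_left = [alu_cycles] * num_contexts
    rounds_left = [rounds] * num_contexts
    cycle = 0
    busy = 0
    while any(r > 0 for r in rounds_left):
        runnable = [c for c in range(num_contexts)
                    if rounds_left[c] > 0 and ready_at[c] <= cycle]
        if runnable:
            c = runnable[0]                 # run the first context that is not waiting
            busy += 1
            work_left[c] -= 1
            if work_left[c] == 0:
                # The context issues its load and is swapped out until it returns.
                ready_at[c] = cycle + 1 + mem_latency
                work_left[c] = alu_cycles
                rounds_left[c] -= 1
        cycle += 1
    return busy / cycle

one_context = alu_utilization(1, 4, 12, 10)
four_contexts = alu_utilization(4, 4, 12, 10)
```

With these toy numbers, a single context keeps the ALU busy for only about a quarter of the cycles, while four resident contexts are enough to cover the 12-cycle load completely. The cost is exactly the one raised above: every resident context needs its own register state.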

mhouston
26-Feb-2007, 04:43
You are on the right track, although Merrimac might be a better match since it has a more advanced register file design. A simplified view of a GPU is multiple floating-point-capable Imagine processors with more traditional caches and multi-threading. To do multi-threading, besides adding support for scoreboarding so it runs efficiently, you would now need much larger register files and register assignment/renaming support. You could either fix the number of threads and the resources per thread (easy), or assign a certain number of threads per processor based on their resource requirements (harder, at least to do dynamically and not at program bind time).
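
The two assignment schemes come down to simple arithmetic, sketched here in Python. The function names and every number (register file size, thread limits) are made up for illustration and do not describe any particular chip.

```python
def threads_fixed(register_file_size, fixed_threads):
    """Fixed scheme: a set thread count, each with an equal slice of registers.

    Returns (threads, registers available to each thread).
    """
    return fixed_threads, register_file_size // fixed_threads

def threads_by_demand(register_file_size, regs_per_thread, max_threads):
    """Resource-based scheme: fit as many threads as the program's register
    demand allows, up to a hardware limit on resident threads."""
    return min(max_threads, register_file_size // regs_per_thread)

fixed = threads_fixed(8192, 256)
heavy_kernel = threads_by_demand(8192, 16, 768)
light_kernel = threads_by_demand(8192, 8, 768)
```

Under the resource-based scheme a kernel that needs fewer registers per thread gets more threads in flight, and so more latency hiding, which is why it is attractive despite being harder to build.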

As I said, GPUs really aren't purely streaming processors in my view. As soon as you allow gathers, scatters, and synchronization, as G80 and others do, GPUs are a heck of a lot more like data-parallel, multi-threaded processors, meaning that the processors all run in SPMD, with the threads in a group (warp) running in SIMD. But they retain many of the restrictions of stream processors, such as no recursion. "Unified architectures" can really make things interesting by, at least in theory, switching the processors from SIMD-like loads such as fragment shading to what have often been viewed as MIMD-like modes such as vertex processing. Things like this lead to headaches if you start from a pure streaming architecture.
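
A toy Python illustration of threads in a warp running in SIMD, assuming the common approach of executing each side of a branch under a lane mask. The function `run_warp` and the branch it evaluates are hypothetical; real hardware details differ.

```python
def run_warp(values):
    """Evaluate `x * 2 if x is even else x + 1` for every lane of one warp.

    All lanes step through the same instruction together, so each side of the
    branch is executed as a separate pass with the other lanes masked off.
    Returns (results, number of passes executed).
    """
    mask = [v % 2 == 0 for v in values]    # True for lanes taking the "even" side
    out = list(values)
    passes = 0
    if any(mask):                           # at least one lane takes the "even" side
        out = [v * 2 if m else o for v, m, o in zip(values, mask, out)]
        passes += 1
    if not all(mask):                       # at least one lane takes the "odd" side
        out = [v + 1 if not m else o for v, m, o in zip(values, mask, out)]
        passes += 1
    return out, passes

uniform = run_warp([2, 4, 6, 8])
divergent = run_warp([1, 2, 3, 4])
```

When every lane agrees on the branch, the warp pays for one pass; when the lanes diverge, it pays for both. Each thread still looks like an independent program (SPMD), but the grouping into warps is what makes the execution SIMD underneath.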

geekcomputing
04-Mar-2007, 22:34
What do you think about Acceleware (http://www.acceleware.com)? Nvidia just bought 15% of that company.

The products look interesting.

mhouston
05-Mar-2007, 05:43
I don't know a whole lot about it, and they don't really have any info up. It looks like they have a simulation API that can be mapped to GPUs and clusters/SMPs. There are several simulation systems that seem to use their stuff, so they seem to have something. They have also built up a fair amount of capital.

rwolf
07-Mar-2007, 01:05
Arun wrote: "And in answer to that, all AFAIK - the first reason is that it's good marketing."

Excellent answer :)