Why have Vertex Shaders on the CPU instead of the VPU?

Brimstone

Why would Microsoft design the Xbox 2 with three dual-core CPUs that handle the vertex shaders?

Does this mean the VPU will just have pixel shaders? The VPU is rumored to have eDRAM, so is the idea to build a chip with lots of pixel-shading power combined with lots of bandwidth? The trade-off with eDRAM is clock speed, which might not be good for vertex shaders. So is the design goal to leave the vertex shading to the CPUs, which can be clocked higher and don't gain any performance from the bandwidth of eDRAM?
 
I haven't heard of this idea before, but I would agree: putting the vertex shaders on the CPU wouldn't be a wise move.

The only advantage I could even think of would be higher flexibility, i.e. longer shaders and more functions, but with VS 3.0 out now and the R500 potentially using it as well, the uses would be very limited and the speed trade-off wouldn't be worth it.

Let's just hope it stays an idea and doesn't become a reality.
 
I haven't read anything that would indicate the Xbox 2 VPU isn't going to have vertex shaders. I think MS is just making sure they have robust CPU processing power rather than relying totally on the VPU for everything.
 
Jabjabs said:
I haven't heard of this idea before, but I would agree: putting the vertex shaders on the CPU wouldn't be a wise move.

Just to speculate and nothing more: very abstractly, I can see a somewhat powerful argument (IMHO) for this type of architecture, an argument which is a subset of the one I made for a unified Cell architecture.

There is an upper bound on the amount of logic you can physically fit in a given area; we all know this. If, as many such as Dave Baumann have suggested/hinted, you don't need such huge computational resources in the CPU for its traditional tasks (I think he made a comment about what you'd use a TFLOP for in a CPU), then that area is wasted. If you can move the vertex shading to the CPU and implement an architecture which is flexible enough, say with [geometry] shaders as a subset of or within the vector front-end, you could have a powerful solution. I'd assume you can maximize the utility coming from the "CPU" logic while expanding the traditional back-end potential, and maximize bandwidth by keeping most transfers unidirectional rather than sending data back. But please correct me if I'm wrong.
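To make the idea a bit more concrete, here is a minimal sketch of what "vertex shading on a CPU core" boils down to. The types and names are made up for illustration only; a real implementation would presumably use the CPUs' vector units rather than a scalar loop, and would stream the results forward to the graphics chip.

```cpp
#include <cstddef>

// Made-up vertex layout and math types; purely illustrative.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };               // row-major, row-vector convention
struct VertexIn  { Vec3 pos; Vec3 normal; };
struct VertexOut { Vec4 clipPos; float diffuse; };

// Treat v as (x, y, z, 1) and multiply by the 4x4 matrix.
static Vec4 transform(const Mat4& m, const Vec3& v)
{
    return {
        v.x * m.m[0][0] + v.y * m.m[1][0] + v.z * m.m[2][0] + m.m[3][0],
        v.x * m.m[0][1] + v.y * m.m[1][1] + v.z * m.m[2][1] + m.m[3][1],
        v.x * m.m[0][2] + v.y * m.m[1][2] + v.z * m.m[2][2] + m.m[3][2],
        v.x * m.m[0][3] + v.y * m.m[1][3] + v.z * m.m[2][3] + m.m[3][3],
    };
}

// "Vertex shading" on a CPU core: one pass over the vertex stream, producing
// clip-space positions plus a simple N.L diffuse term. The output would then
// flow forward to the pixel-shading hardware and never come back -- the
// unidirectional transfers mentioned above.
void shadeVerticesOnCpu(const VertexIn* in, VertexOut* out, std::size_t count,
                        const Mat4& worldViewProj, const Vec3& lightDir)
{
    for (std::size_t i = 0; i < count; ++i) {
        out[i].clipPos = transform(worldViewProj, in[i].pos);
        float ndotl = in[i].normal.x * lightDir.x
                    + in[i].normal.y * lightDir.y
                    + in[i].normal.z * lightDir.z;
        out[i].diffuse = ndotl > 0.0f ? ndotl : 0.0f;
    }
}
```

The point is simply that the per-vertex work is just math over a stream, so in principle it can live wherever the spare FLOPs are.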

I'd still think that an architecture like what many (myself included) believe Cell to be, and its implementation in the PS3, is much more elegant and powerful. But that's beside the point and for a different thread at a different time.

PS. I'm not saying you're wrong or anything, Dave. I just remember you remarking something to the effect of, 'What would you use a TFLOP for anyway?'
 
Huddy makes some hints at the direction being taken with Shader Model 4.0, and even 3.0, in the presentation that was pulled. They note that the shader instruction level is unified, which paves the way for a unification of VS and PS in hardware, and this goes hand in hand with the CPU power.

Developers may choose to dedicate quite a lot of time to high-quality pixels, in which case, with a unified shader architecture and a CPU configuration conducive to it, all of the graphics ALUs could be dedicated to pixel shading while the CPU does the geometry processing. Alternatively, if there is some vertex processing functionality that may not make sense on the CPU, such as displacement mapping for example, you still have the opportunity to utilise some of the ALUs on the graphics chip to do that processing.
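Purely as an illustration of that kind of partitioning (none of this reflects any real API or hardware), a hypothetical per-frame plan might look like the sketch below, with all names and thresholds invented:

```cpp
// Illustrative only: a made-up description of how a title might split work
// between a unified-shader graphics chip and the CPU cores each frame.
struct FramePlan {
    int  gpuAlusForVertex;   // unified ALUs left on vertex work (e.g. displacement mapping)
    int  gpuAlusForPixel;    // unified ALUs dedicated to pixel shading
    bool cpuDoesGeometry;    // bulk geometry processing moved onto the CPU cores
};

// Hypothetical policy: if the frame is pixel-bound and the CPU has headroom,
// push the geometry onto the CPU and hand nearly every graphics ALU to pixel
// shading, keeping only what GPU-side vertex work (displacement mapping) needs.
FramePlan planFrame(int totalGpuAlus, bool pixelBound, bool cpuHasHeadroom,
                    bool usesDisplacementMapping)
{
    FramePlan plan;
    plan.cpuDoesGeometry  = pixelBound && cpuHasHeadroom;
    plan.gpuAlusForVertex = plan.cpuDoesGeometry
                          ? (usesDisplacementMapping ? totalGpuAlus / 8 : 0)
                          : totalGpuAlus / 3;
    plan.gpuAlusForPixel  = totalGpuAlus - plan.gpuAlusForVertex;
    return plan;
}
```

The design choice being illustrated is simply that the split becomes a per-frame, per-title decision rather than something baked into the silicon.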
 
I was under the impression that GPUs scheduled for the Xbox 2 timeframe are supposed to have configurable shader units, so all the arguing about vertex this, fragment that, would become pretty much irrelevant.
You'll decide on the configuration that best suits your application's needs, so if you want to move all your vertex processing to the CPU and configure all the shaders on the GPU for fragment processing, that's OK too.

the uses would be very limited and the speed trade-off wouldn't be worth it.
That's a rather silly thing to say. Depending on how the architecture works, there need not be any speed trade-offs at all; on the contrary, the CPUs could very well do certain things faster than the graphics part will.
Mind you, I don't think the Xenon CPUs will ever carry the brunt of graphics processing, but having a large pool of general-purpose computational resources is never a bad thing.


Semi off-topic: Vince, while having a unified set of elements would be nice, what happens if the Visualizer turns out not to be Cell based?
 
zurich said:
Bohdy said:
Well, if this is the way things are going, then I guess Nintendo was thinking ahead this gen!

More like Sony was (think PS2 and VUs).

No, the topic is about Vertex Shaders being on the CPU, not vector processing in general.

The EE does all of the T&L as well as special vector processing, while the Gekko was designed to do just the latter, i.e. the role of the Vertex Shaders.
 
I'd still think that an architecture like what many (myself included) believe Cell to be, and its implementation in the PS3, is much more elegant and powerful. But that's beside the point and for a different thread at a different time.

:rolleyes:

To me elegant is a hdr not a hugely based software renderer.


Anyway, I fully agree with what Dave has said. It's a great direction to go in, having both a robust VPU and a very powerful CPU setup.
 
Bohdy said:
zurich said:
Bohdy said:
Well, if this is the way things are going, then I guess Nintendo was thinking ahead this gen!

More like Sony was (think PS2 and VUs).

No, the topic is about Vertex Shaders being on the CPU, not vector processing in general.

The EE does all of the T&L as well as special vector processing, while the Gekko was designed to do just the latter, i.e. the role of the Vertex Shaders.

I guess one could look at it that way, if they really wanted to.
 
jvd said:
To me elegant is a hdr not a hugely based software renderer

HDR? High Dynamic Range? I don't know of this; can you explain it? Thanks.

Anyway, I fully agree with what Dave has said. It's a great direction to go in, having both a robust VPU and a very powerful CPU setup.

So, it's a great direction as long as your unified computational resources are made by ATI and IBM and will most likely be less programmable and flexible. That's alright, but as soon as you put STI in the equation and take the concept to the next step... it all goes sour. lol!

As for Faf, then I guess you'd be left with exactly what you said: a less elegant solution, more akin to the PS2 or Xbox 2 than anything we've been hearing about Cell. We'll just have to see...
 
Well Vince, if Sony has a robust VPU and a very powerful CPU then it will still be the best direction.

If they don't have a robust VPU but only a powerful CPU, like they do with the PS2, then that is not the way to go.


I really don't know why I bother responding to you. You rank as low on my list as deadmeat and chap do.

Except they make me laugh and you make me feel bad for you.
 
No, the topic is about Vertex Shaders being on the CPU, not vector processing in general.

Huh? "Vertex Shaders" is a pretty generic term to begin with. A vertex shader could be seen as a logical stage in a 3D pipeline, a discreet hardware execution unit (itself composed of one or more vector processors or an array of scalar processors), or a piece code...

To me elegant is a hdr not a hugely based software renderer.

I'm going to have to go with Vince on the response to this... WTF does HDR have to do with anything? And where do you draw the line on what you consider a "software renderer"? Do you consider the typical title on the PS2 to be running a software renderer? What about a rendering loop that runs on a second processor? What about today's VPUs? They're executing "programs", so to speak...

Well Vince, if Sony has a robust VPU and a very powerful CPU then it will still be the best direction.

With regard to what? What is "best" about it? What advantages does it have over the alternatives? What disadvantages does it have versus the alternatives? What are your criteria for a "powerful" CPU and a "robust" VPU?

If they don't have a robust VPU but only a powerful CPU, like they do with the PS2, then that is not the way to go.

Regardless of the implementation?
 
Bohdy said:
The EE does all of the T&L as well as special vector processing, while the Gekko was designed to do just the latter, i.e. the role of the Vertex Shaders.
Actually, the vertex shader role is doing the entire T&L. The VUs on the EE do more than that, though (clipping and culling among other things).
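To put a rough shape on that division of labour, here is a toy C++ sketch with invented types (not EE or VU code): the vertex-shader role ends at transformed, lit vertices, whereas VU microcode would typically also decide, per triangle, what is worth sending on at all.

```cpp
// Toy types; nothing here is EE or VU code.
struct Vec4 { float x, y, z, w; };

// Vertex-shader-style output: already transformed and lit (the "T&L" part).
struct ShadedVertex { Vec4 clipPos; float diffuse; };

// What a vertex shader does NOT do, but VU microcode typically would: decide
// per triangle whether it is worth sending to the rasterizer at all.
bool outsideFrustum(const ShadedVertex& a, const ShadedVertex& b, const ShadedVertex& c)
{
    // Trivial reject: all three vertices beyond the same clip plane (here just
    // the +x plane, where visibility requires x <= w). A real kernel tests all
    // six planes and clips partially visible triangles.
    return a.clipPos.x > a.clipPos.w &&
           b.clipPos.x > b.clipPos.w &&
           c.clipPos.x > c.clipPos.w;
}

bool backFacing(const ShadedVertex& a, const ShadedVertex& b, const ShadedVertex& c)
{
    // Signed area of the projected triangle; negative means it faces away.
    // Assumes all three vertices are in front of the camera (w > 0).
    float ax = a.clipPos.x / a.clipPos.w, ay = a.clipPos.y / a.clipPos.w;
    float bx = b.clipPos.x / b.clipPos.w, by = b.clipPos.y / b.clipPos.w;
    float cx = c.clipPos.x / c.clipPos.w, cy = c.clipPos.y / c.clipPos.w;
    return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax) < 0.0f;
}
```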

Vince said:
As for Faf, then I guess you'd be left with exactly what you said: a less elegant solution, more akin to the PS2 or Xbox 2 than anything we've been hearing about Cell. We'll just have to see...
Ah, but let's hypothesize a bit more from there.
While obviously a lot of people on this board think the requirement for elegance is to have a CPU(*n) & GPU(*n) combo, splitting things up into as many logical units as possible :p, what if we moved away from that instead?

Say the Visualizer is just a relatively simple but extraordinarily fast triangle renderer built for the sole purpose of pushing micropolygons, and the BE provides the entire pool of unified shader resources.
Or, if you prefer, say we have hardware that is built specifically to render a Reyes pipeline, not catering to the commonly used realtime pipelines at all.
Frankly, I'd find that more elegant than juggling APUs that may have differing feature sets across multiple chips, and also preferable to the usual GPU-CPU split across differing architectures.

Granted, it's probably too far-fetched to consider something like that, but then we're only hypothesizing here (plus I've been wrong about what's too far-fetched before :p ).
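For anyone unfamiliar with what "built for the sole purpose of pushing micropolygons" implies, here is a crude sketch of the Reyes-style dice step, with made-up types and a bilinear patch standing in for real higher-order surfaces; shading, displacement and sampling are left out entirely.

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical patch: evaluate a surface point at (u, v) in [0,1]^2.
// A bilinear patch keeps the example short; a real Reyes renderer dices
// bicubic patches, subdivision surfaces, displaced geometry, and so on.
struct Patch {
    Vec3 corner[4];   // ordered (0,0), (1,0), (0,1), (1,1) in parameter space

    static Vec3 lerp(const Vec3& p, const Vec3& q, float t) {
        return { p.x + (q.x - p.x) * t, p.y + (q.y - p.y) * t, p.z + (q.z - p.z) * t };
    }
    Vec3 eval(float u, float v) const {
        Vec3 a = lerp(corner[0], corner[1], u);
        Vec3 b = lerp(corner[2], corner[3], u);
        return lerp(a, b, v);
    }
};

// Dice the patch into an (n+1) x (n+1) grid of shading points (n >= 1, chosen
// so each quad covers roughly a pixel on screen). The renderer would then
// shade every point and bust the grid into micropolygons for sampling.
std::vector<Vec3> dice(const Patch& p, int n)
{
    std::vector<Vec3> grid;
    grid.reserve(static_cast<std::size_t>(n + 1) * static_cast<std::size_t>(n + 1));
    for (int j = 0; j <= n; ++j)
        for (int i = 0; i <= n; ++i)
            grid.push_back(p.eval(float(i) / float(n), float(j) / float(n)));
    return grid;
}
```

The hypothetical appeal is that the back-end only ever has to be fast at one thing: turning grids of pixel-sized quads into samples.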
 
Or, if you prefer, say we have hardware that is built specifically to render a Reyes pipeline, not catering to the commonly used realtime pipelines at all.

Even if the Reyes pipeline is better compared to the typical SGI pipeline, will next-generation hardware have enough grunt to take advantage of what the Reyes pipeline offers? Enough grunt to offer something better than can be achieved with the typical pipeline?

That was my last argument regarding that.
 
Often different games (and even different parts of the same game) have very different load balances, so having a CPU do vertex work may be right in certain situations, which is why an architecture that can reuse processing elements is so handy.

A real situation might be a game featuring an army. The main portion of the game needs to render thousands of relatively simple people every frame; the VPU's vertex units are good at this kind of brute-force work, the pixel shaders are simple, but the CPU is busy doing all the AI and game code etc. A perfect place for a 'hardware' vertex shader. Now cut to a close-up cutscene: we want awesome detail, skin shaders, hair, cool lighting etc. This has heavy pixel work and vertex work but needs little CPU power, so use a CPU core or two to do the vertex work and devote all of the GPU's power to pixel work.
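As a toy sketch of that kind of per-batch decision (everything here is invented: the names, the thresholds, and the idea that an engine would expose it in exactly this way):

```cpp
// Toy per-batch decision of where the vertex work runs, along the lines of the
// army / cutscene example above. All names and thresholds are invented.
enum class VertexPath { GpuVertexUnits, CpuCore };

struct BatchStats {
    int   vertexCount;       // how much raw geometry the batch pushes
    float pixelShaderCost;   // relative cost of the pixel work (skin, hair, ...)
    float cpuFrameLoad;      // 0..1, how busy the CPU cores are with AI/game code
};

// Crowd scenes: huge vertex counts, simple pixels, busy CPU -> GPU vertex units.
// Cutscene close-ups: modest geometry, expensive pixels, idle CPU -> CPU cores,
// freeing every GPU cycle for pixel shading.
VertexPath chooseVertexPath(const BatchStats& b)
{
    const bool cpuHasHeadroom = b.cpuFrameLoad < 0.5f;     // arbitrary threshold
    const bool pixelBound     = b.pixelShaderCost > 1.0f;  // arbitrary threshold
    const bool bruteForceGeo  = b.vertexCount > 100000;    // arbitrary threshold

    if (bruteForceGeo || !cpuHasHeadroom)
        return VertexPath::GpuVertexUnits;
    if (pixelBound)
        return VertexPath::CpuCore;
    return VertexPath::GpuVertexUnits;
}
```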


Of course this is even easier if you have things like HLSL, where the same code can be switched from CPU to GPU with relatively little work.

Designing an architecture to allow load balancing is a good thing. Even if 90% of games just use it in the 'standard' configuration (CPU doing light graphics work), there will be some that achieve better results, especially when you consider ports where it may not be possible to redo the entire game to soak up CPU power; at least you might get some better graphics by using a spare core for something (procedural textures, cool lighting, raytracing, etc.).
 