Xbox 360 GPU explained... or so

I also guess the PS3 is much more powerful than the Xbox 360, but in the end, games are what matter. They'll decide the winner.
 
N00b said:
I don't think it has so much to do with "power". I really think it is all about the money.
Of course it is. After all, how many of these developers would have dropped the PS2 and worked only on the Xbox or GameCube, given that both were effectively "more powerful" than the PS2? Developers primarily flock to Sony because they know the PlayStation market will be the biggest and because Sony courts them with financial incentives. Nothing wrong with that, but it doesn't really prove much about the respective consoles' power.

Until we actually see a wide range of real games (i.e. ones that you can actually purchase and play) running on real consoles (i.e. final release hardware that you can buy), it's unwise to make snap judgements on power; the recent history of both the console and PC markets should tell you this.
 
N00b said:
I don't think it has so much to do with "power". I really think it is all about the money.

That's not what was said in that statement. Lemme quote again.

"I was shocked by how powerful the new consoles are," said Julian Eggebrecht, president of Factor 5. "They should really free our development." Apparently, the greater processing power of Sony's much-hyped new CPU 'the Cell', will allow Factor 5 to better simulate the real world making for more realistic games.

To me it seems like Factor 5 is going after power... but of course, maybe money is talking and they're going after that instead.

US
 
Jawed said:
rwolf said:
Here is a nasty limitation.

All 48 of the ALUs are able to perform operations on either pixel or vertex data. All 48 have to be doing the same thing during the same clock cycle (pixel or vertex operations), but this can alternate from clock to clock. One cycle, all 48 ALUs can be crunching vertex data, the next, they can all be doing pixel ops, but they cannot be split in the same clock cycle.

Yep, it sounds shit to me. It makes me wonder if dynamic branching is ever going to bring improved performance. Seems unlikely.

Jawed

WTF?? Is this the great innovation of unifying shaders??
 
Even if all 48 of those ALUs are working on the same task in any given cycle, it doesn't seem like a nasty limitation to me.
The ALUs can switch thread type (vertex, pixel, etc.) each clock cycle (I read this somewhere...) and they run at 500 MHz, so where is the problem?
The clock is so high that you have all the granularity and work distribution you could desire.
 
nAo said:
Even if all 48 of those ALUs are working on the same task in any given cycle, it doesn't seem like a nasty limitation to me.
The ALUs can switch thread type (vertex, pixel, etc.) each clock cycle (I read this somewhere...) and they run at 500 MHz, so where is the problem?
The clock is so high that you have all the granularity and work distribution you could desire.

You mean like, say, we have a 25% pixel shading versus 75% vertex shading workload in a scene, and so the GPU would do n cycles of pixel shading and 3n cycles of vertex shading?

Hard to imagine a viable way to do that, especially because the distribution of the workload is too unpredictable. Maybe I just don't get it, though. If anyone can explain this in simple words, that would be welcome.
 
If that's really it, then I hope ATI won't use this tech for a desktop GPU. I'd hate to see nV without competition out there, 'cause even just a beefed-up NV40 would run circles around this thing as far as I understand.

If that should indeed be the way it works, it'd have hitches all the time. Or not?
 
I don't see any reason why it should be slow or choppy at all.

EXCEPT in code with dynamic branches.

We're still waiting to find out the details of this thing. There are some seriously puzzling differences from any of the scenarios discussed before.
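
To make the branching worry concrete, here's a toy model. The lockstep batch size, the per-path instruction counts and the 10% branch rate are all my own guesses for illustration; nothing here is a confirmed detail of this chip.

```python
# Toy illustration of dynamic branching on lockstep SIMD hardware: if a wide
# group of pixels shares one instruction stream, a divergent batch has to pay
# for both sides of the branch. All numbers below are invented.

import random

BATCH_SIZE = 48      # hypothetical: one pixel per ALU, all in lockstep
THEN_COST = 20       # cycles for the expensive path (made-up)
ELSE_COST = 4        # cycles for the cheap path (made-up)

def batch_cycles(takes_branch):
    """A lockstep batch pays for every path at least one of its pixels needs."""
    cycles = 0
    if any(takes_branch):
        cycles += THEN_COST
    if not all(takes_branch):
        cycles += ELSE_COST
    return cycles

random.seed(0)
pixels = [random.random() < 0.1 for _ in range(48_000)]   # 10% take the costly path

lockstep = sum(batch_cycles(pixels[i:i + BATCH_SIZE])
               for i in range(0, len(pixels), BATCH_SIZE))
ideal = sum(THEN_COST if p else ELSE_COST for p in pixels) / BATCH_SIZE

print(f"lockstep batches: {lockstep} cycles, per-pixel branching: {ideal:.0f} cycles")
```

Under those made-up numbers, almost every batch contains at least one pixel that takes the expensive path, so the saving you'd hope to get from the branch largely evaporates.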

Jawed
 
It'll be slow because you have to sort everything and buffer it before you send it into the "pipelines". Buffering == bad.
 
_xxx_ said:
It'll be slow because you have to sort everything and buffer it before you send it into the "pipelines". Buffering == bad.

What? No sorting required.
 
_xxx_ said:
You mean like, say, we have a 25% pixel shading versus 75% vertex shading workload in a scene, and so the GPU would do n cycles of pixel shading and 3n cycles of vertex shading?
Yes
Hard to imagine a viable way to do that, especially because the distribution of the workload is too unpredictable
Umm... to me it's not hard at all; you don't have to fix the work distribution a priori.
A stupid arbiter would just transform vertices until an internal buffer that stores transformed vertices is full, then switch the ALUs over to processing pixels, and so on, auto-balancing pixel and vertex throughput.
I hope the hw is a bit smarter than that ;)

It'll be slow because you have to sort everything and buffer it before you send it into the "pipelines". Buffering == bad.
I think you're not completely grasping what the unified ALUs are doing; you don't need to sort or buffer anything.
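
Just to make the dumb-arbiter idea concrete, here's a toy model of it. Every figure (buffer size, per-vertex and per-pixel cost, pixels produced per vertex) is invented, and the whole ALU array is treated as a single unit doing one work item per step, so it only illustrates the switching policy, not anything about the real hardware.

```python
# Toy model of a "fill the vertex buffer, then switch" arbiter. All numbers are
# invented for illustration; the ALU array is modelled as one unit.

VERTEX_BUFFER_CAP = 64    # transformed vertices the internal buffer can hold
CYCLES_PER_VERTEX = 4     # cost to transform one vertex (made-up)
CYCLES_PER_PIXEL = 2      # cost to shade one pixel (made-up)
PIXELS_PER_VERTEX = 12    # pixels each transformed vertex eventually produces (made-up)

def run(total_vertices):
    mode = "vertex"
    cycles = switches = 0
    vertices_left, buffered, pending_pixels = total_vertices, 0, 0
    while vertices_left or buffered or pending_pixels:
        if mode == "vertex":
            if vertices_left and buffered < VERTEX_BUFFER_CAP:
                cycles += CYCLES_PER_VERTEX
                vertices_left -= 1
                buffered += 1
            else:                      # buffer full (or geometry exhausted): flip to pixels
                mode, switches = "pixel", switches + 1
        else:
            if pending_pixels:
                cycles += CYCLES_PER_PIXEL
                pending_pixels -= 1
            elif buffered:             # pull one buffered vertex's worth of pixel work
                buffered -= 1
                pending_pixels += PIXELS_PER_VERTEX
            else:                      # nothing left to shade: flip back to vertices
                mode, switches = "vertex", switches + 1
    return cycles, switches

cycles, switches = run(10_000)
print(f"{cycles} cycles, {switches} mode switches")
```

The point is just that the split between vertex and pixel work falls out of the buffer occupancy on its own; nobody has to decide a fixed ratio up front.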
 
I really think Futuremark should develop a benchmark to measure the next-generation consoles' 3D performance. I don't know if they can sell it to anybody, though. :)
 
I still think the "switch when your buffer is full" approach might inject more latency, whereas an approach that tries to distribute threads more evenly and keep all buffers moving might be better. For example, what happens if all the pixel-pipeline-oriented buffers are empty and then you switch from vertex to pixel processing? Won't the downstream pipeline stages stall for a few cycles waiting for work to trickle in?
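
Back-of-the-envelope version of that concern, with a pipeline depth I've simply made up:

```python
# If the pixel-side buffer is empty at the moment the arbiter flips from vertex
# to pixel work, the downstream stages idle for roughly one pipeline depth's
# worth of cycles before results start flowing. The depth is a made-up figure.

PIPELINE_DEPTH = 30   # hypothetical stages between ALU issue and the back end

def bubble(buffered_items, depth=PIPELINE_DEPTH):
    """Idle cycles at the back end right after a switch."""
    return max(0, depth - buffered_items)

print(bubble(0))    # empty buffer at the switch -> a full-depth bubble
print(bubble(30))   # buffer at least pipeline-deep -> no bubble at all
```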
 
Well, the vertex/pixel change should have no issues except in the case where you are processing the last of the pixel data or the last of the vertex data. The ALUs would starve.


Still, on a typical GPU I would bet that starvation happens all the time, and to a much larger extent. I would imagine that even if the RSX could generate more shader operations per clock, it would be the starved ALUs that allow the R500 to smoke it.
 
Why would the ALUs be starved? I hardly think the PS3 will have any problem supplying ample amounts of geometry. The RSX could remove all of its pixel shaders and still not be starved for primitives. The only question is how fast triangle setup runs, but then again, the R500's triangle setup presumably has limits as well, so there's no point doing more vertex shading if your triangle-setup engine is already blocking you.
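
Rough numbers, all of them assumptions rather than confirmed specs for either chip: if setup peaks at one triangle per clock, then beyond a certain shader length the setup unit, not the vertex ALUs, is the wall.

```python
# Back-of-the-envelope: how long a vertex shader can be before triangle setup,
# rather than the vertex ALUs, becomes the bottleneck. Every number here is an
# assumption for illustration, not a confirmed spec.

CLOCK_HZ = 500e6          # assumed GPU clock
SETUP_TRIS_PER_CLOCK = 1  # assumed peak setup rate
VERTS_PER_TRI = 1.0       # assumed vertex reuse in a well-indexed mesh
ALU_COUNT = 48            # the unified ALU count under discussion

setup_rate = CLOCK_HZ * SETUP_TRIS_PER_CLOCK        # triangles per second
vertex_rate_needed = setup_rate * VERTS_PER_TRI     # vertices/s setup can consume
budget = ALU_COUNT * CLOCK_HZ / vertex_rate_needed  # ALU cycles available per vertex

print(f"setup consumes ~{vertex_rate_needed / 1e6:.0f}M vertices/s")
print(f"so a vertex shader shorter than ~{budget:.0f} ALU cycles leaves the ALUs "
      "waiting on setup, not the other way around")
```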
 
DemoCoder said:
Won't the downstream pipeline stages stall for a few cycles waiting for work to trickle in?
It would; in fact, I hope the real hw is not as dumb as I envisioned in my previous post.
 
DemoCoder said:
Why would the ALUs be starved? I hardly think the PS3 will have any problem supplying ample amounts of geometry. The RSX could remove all of its pixel shaders and still not be starved for primitives. The only question is how fast triangle setup runs, but then again, the R500's triangle setup presumably has limits as well, so there's no point doing more vertex shading if your triangle-setup engine is already blocking you.

It is not about feeding the GPU, but about utilizing the resources in the GPU effectively. The GeForce FX had terrible performance because it couldn't utilize all its processing power. The ALUs in the GPU were starved, and it wasn't until Nvidia wrote a decent shader compiler that the performance improved.

ATI has developed a chip that dynamically does what the shader compiler was doing and keeps the ALUs busy all the time.
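
To sketch what "keeps the ALUs busy" could mean in practice (this is my own toy model with invented latencies, not a description of ATI's actual scheduler): keep a pool of threads in flight and swap one in whenever another is waiting on a texture fetch.

```python
# Toy model of hardware latency hiding: many threads in flight, and whenever one
# stalls on a texture fetch another ready thread is issued, so the ALUs stay
# busy. Latencies, op counts and thread counts are all invented.

from collections import deque

TEXTURE_LATENCY = 100     # cycles a texture fetch takes (made-up)
OPS_BETWEEN_FETCHES = 10  # ALU ops per fetch in this toy shader (made-up)
FETCHES_PER_THREAD = 4

def busy_fraction(threads_in_flight, total_cycles=100_000):
    ready = deque((OPS_BETWEEN_FETCHES, FETCHES_PER_THREAD)
                  for _ in range(threads_in_flight))
    waiting = []                      # (cycle_ready_again, ops_left, fetches_left)
    busy = 0
    for cycle in range(total_cycles):
        # wake threads whose texture results have come back
        woken = [t for t in waiting if cycle >= t[0]]
        waiting = [t for t in waiting if cycle < t[0]]
        ready.extend((ops, fetches) for _, ops, fetches in woken)
        if not ready:
            continue                  # every thread is stalled: the ALUs idle
        ops, fetches = ready.popleft()
        busy += 1                     # one cycle of useful ALU work
        ops -= 1
        if ops > 0:
            ready.appendleft((ops, fetches))          # keep running this thread
        elif fetches > 0:
            waiting.append((cycle + TEXTURE_LATENCY,  # stall on a texture fetch
                            OPS_BETWEEN_FETCHES, fetches - 1))
        else:
            ready.append((OPS_BETWEEN_FETCHES, FETCHES_PER_THREAD))  # fresh thread
    return busy / total_cycles

for n in (1, 4, 16):
    print(f"{n:2d} threads in flight -> ALUs busy {busy_fraction(n):.0%} of the time")
```

With only one thread in flight the ALUs sit idle most of the time; with enough threads in flight the fetch latency disappears behind other work, which is the effect being described.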
 