PS3 GPU not fast enough.. yet?

Mintmaster said:
Sure it might be, but its role has pretty much stayed the same for a while now. I don't think its complexity has grown anywhere near as fast as everything else in a GPU.
Perhaps not, but it certainly has changed since GF3 days.

Well, not to that extent. I expect the GDDR3 bus to average maybe 70% of its peak rate because you'll have the biggest and most constant BW load (colour and z) passing through there all the time.
Yes, but in the end, some datapaths are built for redundancy - eDRAM buses are most typical of that - they will be woefully underutilized most of the time, but that doesn't mean they aren't necessary.
Likewise, full duplex is unlikely to be fully utilized in both directions, but it's nice the option is there to push data fast in either direction.

You mean by transforming the camera position to object space and using Cell?
If there's no matrix blending, sure I guess. Does it always work out faster on SPEs than on RSX? How about frustum culling?
It was sort of an inside joke actually. But as I mentioned earlier, there are certainly many ways you can help out the GPU if you have fast "geometry shading" capabilities.
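As a purely illustrative aside on what that kind of Cell-side geometry work looks like, here's a minimal C sketch of object-space backface culling along the lines described above: transform the camera position into object space once per object, then reject triangles whose face normal points away from it. The struct and function names are made up for the sketch; a real SPE version would chew through whole index batches with SIMD and DMA rather than scalar code.

Code:
#include <stdio.h>

typedef struct { float x, y, z; } vec3;

static float dot3(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static vec3  sub3(vec3 a, vec3 b) { vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }

/* Returns nonzero if a triangle with vertex v0 and face normal n faces the
 * camera. cam_obj is the camera position already transformed into object
 * space, so there is no per-vertex matrix work here. */
static int faces_camera(vec3 v0, vec3 n, vec3 cam_obj)
{
    return dot3(n, sub3(cam_obj, v0)) > 0.0f;
}

int main(void)
{
    vec3 cam_obj = { 0.0f, 0.0f, 5.0f };   /* camera in object space   */
    vec3 v0      = { 0.0f, 0.0f, 0.0f };   /* a vertex of the triangle */
    vec3 n_front = { 0.0f, 0.0f,  1.0f };  /* normal toward the camera */
    vec3 n_back  = { 0.0f, 0.0f, -1.0f };  /* normal away from camera  */

    printf("front-facing: %d\n", faces_camera(v0, n_front, cam_obj)); /* prints 1 */
    printf("back-facing:  %d\n", faces_camera(v0, n_back,  cam_obj)); /* prints 0 */
    return 0;
}

Only the surviving indices would then be handed to RSX, which is the sort of GPU offloading being talked about here.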

Laa Yosh said:
...development times... budgets...etc...
Hmm... the notion that multiple memory pools make the PS3 an "unnecessarily" complex console is rather on the silly side.
Memory segmentation has been second nature in console design since - well, just about always - it boils down to cost efficiency, and a single pool of very fast memory just isn't it.
XBox1 is pretty much the only deviation I can remember - and well - it also happened to be a terribly cost-inefficient design.
Heck - the DS has some 30 memory pools (that you get to manage manually) of different sizes and speeds, asymmetric multicore CPUs and GPUs and god knows what else, and I don't see people complaining that DS development is "too complex/expensive".
 
chachi said:
All devkits for any console have more RAM than the consumer version because they have to run code that isn't optimized and contains debugging information that the consumer code doesn't have.
Not all devkits do. It's common for them to have more and it's certainly helpful for development but there are currently next gen devkits that have the same amount of RAM as the retail boxes.
 
Urian said:
One correction for Mintmaster.

The 116.5 million figure for Xbox is vertices, not triangles.

The 275 million from RSX is triangles.
I don't think anyone answered this, but in the world of marketing 1 vertex = 1 triangle. Triangle strips are nearly 1:1.
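For the arithmetic behind that "nearly 1:1" claim: a single triangle strip of N vertices produces N-2 triangles, so the vertex-to-triangle ratio approaches 1 as strips get longer. A quick check (the strip lengths are just illustrative):

Code:
#include <stdio.h>

int main(void)
{
    /* A single triangle strip with n vertices yields n - 2 triangles,
     * so the vertex:triangle ratio heads toward 1:1 for long strips. */
    int lengths[] = { 3, 10, 100, 10000 };
    int i;
    for (i = 0; i < 4; i++) {
        int n = lengths[i];
        int tris = n - 2;
        printf("%5d verts -> %5d tris (verts per tri: %.3f)\n",
               n, tris, (double)n / tris);
    }
    return 0;
}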
 
Mintmaster said:
How big can this FIFO be? A 20K model would easily run hundreds of polys consecutively that are culled, and this holds even more so for terrain and whole objects that get clipped. The output of a vertex shader can easily be 50-100 bytes or more. AFAIK, the post-transform cache is 63 verts max on RSX.
Seems like a good situation for predicated visibility queries, although I don't know if RSX supports this feature.
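To put rough numbers on the FIFO concern, using only the figures quoted above (about 63 post-transform entries and 50-100 bytes per shaded vertex; those are taken from the post, not from any RSX documentation):

Code:
#include <stdio.h>

int main(void)
{
    /* Figures quoted above (assumed, not from documentation):
     * ~63 post-transform entries, 50-100 bytes per shaded vertex. */
    int entries  = 63;
    int bytes_lo = 50;
    int bytes_hi = 100;

    printf("post-transform storage: %d - %d bytes\n",
           entries * bytes_lo,    /* 3150 bytes */
           entries * bytes_hi);   /* 6300 bytes */

    /* Buffering a run of a few hundred consecutively culled vertices
     * would need far more than that, hence the question. */
    printf("300 culled verts at 100 B each: %d bytes\n", 300 * bytes_hi); /* 30000 */
    return 0;
}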
 
Titanio said:
16 texture units on 360

Technically, 16 bilinear + 16 point sampled.

In terms of bandwidth, the 360 uses 256GB/s for the primary framebuffer and splits 22.4GB/s between the CPU and GPU for texture/vertex fetch; the PS3 uses up to 22.4GB/s for the primary framebuffer plus whatever is left over for texture/vertex, and splits 25.6GB/s with the CPU for the same.

Fixed. ;)
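For reference, the 22.4GB/s and 25.6GB/s figures fall straight out of bus width times data rate. The widths and clocks below are the commonly quoted launch-era numbers, so treat them as assumptions rather than official specs:

Code:
#include <stdio.h>

/* GB/s = (bus width in bits / 8) * effective data rate in GT/s */
static double bus_gb_per_s(double bits, double data_rate_gts)
{
    return (bits / 8.0) * data_rate_gts;
}

int main(void)
{
    printf("360 GDDR3, 128-bit @ 1.4 GT/s: %.1f GB/s\n", bus_gb_per_s(128.0, 1.4)); /* 22.4 */
    printf("PS3 GDDR3, 128-bit @ 1.4 GT/s: %.1f GB/s\n", bus_gb_per_s(128.0, 1.4)); /* 22.4 */
    printf("PS3 XDR,    64-bit @ 3.2 GT/s: %.1f GB/s\n", bus_gb_per_s(64.0,  3.2)); /* 25.6 */
    return 0;
}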
 
Laa-Yosh said:
So, that's why I think that the PS3 is a very powerful and flexible console - but at the same time it's unnecessarily complex which in turn leads to oversized programmer teams, huge budgets and long development time.
I don't know if I implied otherwise, but the situation is not necessarily more straightforward on the X360. Someone with hands-on experience could add detail, but probably won't. :p

Nemo80 said:
Afaik the final devkits don't have that anymore. But do you really think that texture resolution is now higher because games run on devkits? Sorry, but that's utter bs.
There's a big, long line of Xbox games that had really good textures in shots but then suddenly tanked a couple months before launch. It wouldn't be the first time such a thing happened if that is indeed what happened.
 
Fafalada said:
they will be woefully underutilized most of the time, but that doesn't mean they aren't necessary.
Likewise, full duplex is unlikely to be fully utilized in both directions, but it's nice the option is there to push data fast in either direction.
Yup, that's exactly my point. Some people are asking why Sony would put a 35GB/s connection between RSX and Cell if you'll rarely average over 1/10th of that. It's still useful.

You mentioned "geometry shading" a couple of times. What exactly are you referring to here, the thing going into DX10 processors? Are you doing work on something with such a unit? I thought you did Playstation stuff.
 
Mintmaster said:
You mentioned "geometry shading" a couple of times. What exactly are you referring to here, the thing going into DX10 processors? Are you doing work on something with such a unit? I thought you did Playstation stuff.
Probably Faf had in mind something even more general and broader than the GS as we know it from SM4.0. Any doubt about Cell being a good candidate for the job? :) (and before someone says it: I'm aware you can texture from a geometry shader..)
 
Mintmaster said:
Yup, that's exactly my point. Some people are asking why Sony would put a 35GB/s connection between RSX and Cell if you'll rarely average over 1/10th of that. It's still useful.

You linked to your own answer, which I consider subjective or not very reliable if your intention is to back your own point!? ;)
 
Memory segmentation has been second nature in console design since - well, just about always - it boils down to cost efficiency, and a single pool of very fast memory just isn't it.
XBox1 is pretty much the only deviation I can remember - and well - it also happened to be a terribly cost-inefficient design.


The Xbox was expensive, but not because of the unified RAM; it was because of the built-in HDD and a very bad business/royalty model that could not be changed. This is of course why MS did the 360 model very differently, and now owns the IP.

The more pools of memory you have, the more traces and busses you'll need, etc. For example, I suspect the motherboard on the Xbox360 is saving MS a couple of dollars by only using one 128-bit bus, where the PS3 model has two.
 
sonyps35 said:
The Xbox was expensive, but not because of the unified RAM; it was because of the built-in HDD and a very bad business/royalty model that could not be changed. This is of course why MS did the 360 model very differently, and now owns the IP.

The more pools of memory you have, the more traces and busses you'll need, etc. For example, I suspect the motherboard on the Xbox360 is saving MS a couple of dollars by only using one 128-bit bus, where the PS3 model has two.

... I thought the X360 also had 2 128-bit busses. One from CPU to memory and one from GPU to memory... So they both have 2 128-bit busses. Only one has 2 busses going to the same memory pool, and the other has 2 busses going to 2 different pools. Added to that, they both have busses connecting CPU to GPU, although it's obvious the PS3 one is more expensive as it's really fast.
 
london-boy said:
... I thought the X360 also had 2 128-bit busses. One from CPU to memory and one from GPU to memory... So they both have 2 128-bit busses. Only one has 2 busses going to the same memory pool, and the other has 2 busses going to 2 different pools.

The memory controller is on the Xenos; the Xenon can only access memory through the Xenos. But there's the FSB anyway between Xenon and Xenos.
 
Zeross said:
The memory controller is on the Xenos; the Xenon can only access memory through the Xenos. But there's the FSB anyway between Xenon and Xenos.

Right, forgot that. Well, it's obvious the X360 is cheaper than the PS3 wrt bandwidth; it has, after all, a bit more than half the total system bandwidth of the PS3, so of course it's cheaper in that regard.
 
sonyps35 said:
The Xbox was expensive, but not because of the unified RAM; it was because of the built-in HDD and a very bad business/royalty model that could not be changed. This is of course why MS did the 360 model very differently, and now owns the IP.

The more pools of memory you have, the more traces and busses you'll need, etc. For example, I suspect the motherboard on the Xbox360 is saving MS a couple of dollars by only using one 128-bit bus, where the PS3 model has two.

MS bought the best money could buy for them and were ready to lose a fortune.
Edit: misunderstood you.
 
Zeross said:
The memory controller is on the Xenos; the Xenon can only access memory through the Xenos. But there's the FSB anyway between Xenon and Xenos.

Well, the PS3 has FlexIO to connect Cell to RSX. All in all, board complexity of the PS3 seems significantly higher than for the X360.

Cheers
 
... I thought the X360 also had 2 128-bit busses. One from CPU to memory and one from GPU to memory... So they both have 2 128-bit busses. Only one has 2 busses going to the same memory pool, and the other has 2 busses going to 2 different pools. Added to that, they both have busses connecting CPU to GPU, although it's obvious the PS3 one is more expensive as it's really fast.

Well, then the PS3 needs a bus from GPU-CPU I guess, so it's kinda like three busses vs two maybe?

Here's a good pic/article on the 360's mobo...

http://www.anandtech.com/showdoc.aspx?i=2611&p=1


It really shows where the priorities are, which seems to be on the CPU feeding the GPU, and the bit about the benefit of serial busses was interesting.
 
london-boy said:
Right, forgot that. Well, it's obvious the X360 is cheaper than the PS3 wrt bandwidth; it has, after all, a bit more than half the total system bandwidth of the PS3, so of course it's cheaper in that regard.

I would say "different". MS spent 80-90M transistors on eDRAM, so that would be a memory bandwidth expense, especially since it led to a two die GPU.

On Sony's part, outside the expense of the extra XDR royalties/premium, the memory chips themselves are not more expensive. If Sony had gone with 2 GDDR3 pools instead it would have been the same cost for the chips (although there would have been the cost of 2 buses instead of one and an extra memory controller). E.g., MS could have gone the route of segmenting the memory pools into 2x 256MB pools. Of course then you have to account for a memory management model that can work with such. Cell was developed with XDR in mind, and the memory controller in RSX was designed for GDDR3. Each has its own memory controller and dedicated memory type. In many ways it is like a PC: Cell has main memory, RSX video memory. It appears NV augmented the GPU memory controller to read/write through the FlexIO (instead of PCIe) to XDR. Without oversimplifying it, Cell and RSX were already designed for two different memory types and they are essentially using their own pools of memory--yet since the FlexIO has such an amazing bandwidth it has been set up to allow RSX access to XDR as well.

Obviously MS saved some money (especially in the future) by not going with a 256-bit bus. And having 1 memory controller helps, as well as 1 memory type. But at face value the PS3 does not seem to be increasing the cost 2x for 2x the bandwidth. Where it might bite them is in the future. Having 1 memory type should allow more consolidation. Having 2x the memory controllers, 2x 128-bit buses, and 2 types of memory may make price reduction a little harder.

In this regard you can tell the 360 was designed cohesively and with price reduction in mind. Memory controller on the GPU, shared UMA, eDRAM to offload the most bandwidth-intensive tasks, etc... (As a side note there is also the XPS, which allows the CPU to talk directly to the GPU without taking up the FSB and memory bandwidth). Maybe the one thing MS/ATI appear to have planned for, though possibly initially as a side note, is tiling. MS/ATI must have known they would need to support HD resolutions; after all, HDTVs have been "in the wings" for well over a decade. Having a 30MB eDRAM module is too expensive, so outright fitting a 720p 4xMSAA framebuffer into the eDRAM was not possible. Enter tiling. But it is pretty clear a 480p image fits in fine, and the troubles (at least initially) of tiling with XPS or MemExport seem to indicate maybe these features were designed for the 480p resolution. Of course maybe marketing just took over with the "720p standard" and/or "MSAA standard at 720p" bit, and of course the delays in getting working silicon obviously did not help much either.
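The back-of-the-envelope math behind "720p with 4xMSAA doesn't fit in 10MB" is straightforward if you assume 4 bytes of colour plus 4 bytes of depth/stencil per sample (a typical layout, not anything Xenos-specific):

Code:
#include <stdio.h>

int main(void)
{
    /* Assumed layout: 32-bit colour + 32-bit depth/stencil per sample. */
    const double bytes_per_sample = 4.0 + 4.0;
    const double mb = 1024.0 * 1024.0;

    double p720_noaa = 1280.0 * 720.0 * bytes_per_sample / mb;        /* ~7.0 MB  */
    double p720_4x   = p720_noaa * 4.0;                               /* ~28.1 MB */
    double p480_4x   = 640.0 * 480.0 * bytes_per_sample * 4.0 / mb;   /* ~9.4 MB  */

    printf("720p, no AA:   %.1f MB (fits in 10 MB eDRAM)\n", p720_noaa);
    printf("720p, 4x MSAA: %.1f MB (does not fit -> tiling)\n", p720_4x);
    printf("480p, 4x MSAA: %.1f MB (just squeezes in)\n", p480_4x);
    return 0;
}

Which also lines up with the observation above that a 480p image, even with 4xAA, squeezes into the eDRAM without tiling.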

It would be absolutely fascinating to know how much things changed and what went on behind the scenes during the development of both the PS3 and Xbox 360. When was 720p with AA decided to be standard? Why did NV refuse to create a new GPU design? When was NV chosen for the GPU? Who decided on pricing strategies? What ideas got tanked?

It would be very interesting to learn who presented what ideas and why certain ideas won out and others did not. My guess is there's a lot of back-door politics. Sadly, I don't think we'll ever get the whole story on this stuff. I know we have the Cell story about STI and the Xbox Uncloaked, but those seem to leave out all the really good insider stuff.
 
Mintmaster,
like nAo explained, I just used it as a placeholder term for "everything you do with geometry that the VS can't (or doesn't) do". Could call it SPEShading I guess, but since it's not exclusive to Cell...
Btw, from your link:
Mintmaster said:
The only part of the 3D software pipeline that Cell can generate dynamic data for and feed directly to RSX via FlexIO is vertex data
This is definitely false. Heck, it's not limited to just feeding RSX graphics data either.

Zeross said:
Hey don't forget the N64
IIRC the N64 texture "cache" was just a scratchpad - so that puts it right there with the PS2 and 360 in having to hand-manage specialized pools of memory.

My personal favourite is the PSP memory layout: it behaves like unified memory, but some parts of the address space are faster than others (in fact I used to hope the PS3 would work in that fashion too).
 
overclocked said:
You linked to your own answer, which I consider subjective or not very reliable if your intention is to back your own point!? ;)
Look again. I'm not linking to anything I said; I'm linking to the expectation of high utilization of FlexIO. Here's another example. I'm not sure what your problem is. I'm saying the exact same thing Faf is.
 