RSX: Vertex input limited? *FKATCT

Can you, from your dev side, see and auto-mount such a user Linux partition and, for instance, provide an option inside your game to stream a section of gaming action to that partition?
Short answer is no.. The long answer would need someone (else) to break their NDA.

Cheers,
Dean
 
Oooo, fuzzy memory. But I'm pretty sure SPEs have no branch prediction hardware, and branch 'prediction' is performed by the developer with hints.
I think you are right about this. Some hints are probably added by the compiler as well, and some branches are transformed into conditional assignments. And when the SPU takes a branch penalty it is much cheaper than on an ordinary CPU anyway. DeanoC has already stated that this is a non-issue for them; I think that pretty much sums it up.

Acert93 said:
I think he mentioned it did, specifically when he mentioned the ability to do MSAA with FP10 and how porting this to the PS3 had the issue that FP16 wasn't compatible with MSAA and there was no time to design a shader-based solution.
If this is the case for many multiplatform developers, I guess we will see a major improvement in PS3 multiplatform titles when they get tools/frameworks supporting NAO32-style HDR representations. It seems like it is more the tools than the RSX itself that are the issue for joker.
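For reference, the NAO32 idea is to pack HDR colour into a plain 8:8:8:8 target by storing log-encoded luminance plus chromaticity instead of RGB, which keeps MSAA compatibility. A rough CPU-side sketch of a LogLuv-style packing in that spirit - the matrix coefficients are borrowed from the publicly circulated LogLuv shader and everything here is illustrative, not the exact production encoding:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Rgba8 { uint8_t r, g, b, a; };

// Illustrative LogLuv-style packing: two chromaticity bytes plus
// log-luminance split 8.8 fixed point across the remaining two bytes.
Rgba8 encodeLogLuv(float R, float G, float B) {
    // RGB -> (X', Y, X'+Y+Z')-style transform (coefficients approximate).
    float Xp = 0.2209f * R + 0.1138f * G + 0.0102f * B;
    float Y  = 0.3390f * R + 0.6780f * G + 0.1130f * B;
    float S  = 0.4184f * R + 0.7319f * G + 0.2969f * B;
    S = std::max(S, 1e-6f);
    Y = std::max(Y, 1e-6f);

    // Chromaticity coordinates, both guaranteed to land in [0,1).
    float u = Xp / S;
    float v = Y  / S;

    // Log-encode luminance; this is where the HDR range comes from.
    float Le = 2.0f * std::log2(Y) + 127.0f;
    Le = std::min(std::max(Le, 0.0f), 255.0f);
    float LeHi = std::floor(Le);   // integer part -> one byte
    float LeLo = Le - LeHi;        // fractional part -> other byte

    auto toByte = [](float x) { return uint8_t(x * 255.0f + 0.5f); };
    return { toByte(u), toByte(v), uint8_t(LeHi), toByte(LeLo) };
}
```

Because the render target stays an ordinary RGBA8 surface, it remains compatible with the hardware's MSAA resolve, which is the whole attraction versus FP16.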

Regarding the RAM space for textures, it would be interesting to know if any PS3 developers see a possibility that SPEs will be used for decompressing textures in the future, with higher compression rates than what is possible with DXTC, using for example wavelet-style compression techniques. Could that be a way to keep more textures available in main RAM, trading RAM space for CPU cycles?

If the 360 has some CPU cycles to spare, I guess it could also benefit from this technique.
 
Regarding the RAM space for textures, it would be interesting to know if any PS3 developers see a possibility that SPEs will be used for decompressing textures in the future, with higher compression rates than what is possible with DXTC, using for example wavelet-style compression techniques. Could that be a way to keep more textures available in main RAM, trading RAM space for CPU cycles?
Yes, this is possible. The thing to be aware of is that the PS3/360 GPUs would both require the texture to be in a natively supported format before using it. As such you would either have to decompress quite some time (i.e. a number of frames) before use into some kind of LRU cache (which is quite doable, if your engine/assets can identify usage of these compressed textures ahead of time), or have your decompression/use of textures closely tied together via some kind of GPU/CPU synchronisation method - decompressing into a smaller buffer area directly before use. Although to be honest, in this latter case, you'd probably find that you'd waste large amounts of GPU time stalling on CPU/SPU decompression of those textures... so it's probably not a good idea to do something like this.. :)
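To make the first option concrete, here is a purely illustrative sketch of such an LRU cache - all names are invented, and a real engine would decompress asynchronously on an SPU into a GPU-native format rather than synchronously as written here:

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>

struct DecompressedTexture { /* GPU-ready texel data would live here */ };

// Placeholder for kicking an SPU decompression job; synchronous here.
static DecompressedTexture* decompressOnSPU(uint32_t /*id*/) {
    return new DecompressedTexture{};
}

class TextureLRUCache {
    struct Entry {
        DecompressedTexture* tex;
        size_t size;
        std::list<uint32_t>::iterator lruPos;
    };
    size_t budget_, used_ = 0;
    std::list<uint32_t> lru_;                // front = most recently used
    std::unordered_map<uint32_t, Entry> map_;

    void evictOldest() {
        uint32_t victim = lru_.back();
        Entry& e = map_[victim];
        used_ -= e.size;
        delete e.tex;
        lru_.pop_back();
        map_.erase(victim);
    }

public:
    explicit TextureLRUCache(size_t budgetBytes) : budget_(budgetBytes) {}

    // Call this when the engine knows a texture will be needed a few
    // frames from now, so decompression has time to finish.
    DecompressedTexture* request(uint32_t id, size_t sizeBytes) {
        auto it = map_.find(id);
        if (it != map_.end()) {              // hit: refresh recency
            lru_.splice(lru_.begin(), lru_, it->second.lruPos);
            return it->second.tex;
        }
        while (used_ + sizeBytes > budget_ && !lru_.empty())
            evictOldest();                   // make room, oldest first
        DecompressedTexture* tex = decompressOnSPU(id);
        lru_.push_front(id);
        map_[id] = Entry{ tex, sizeBytes, lru_.begin() };
        used_ += sizeBytes;
        return tex;
    }
};
```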

Cheers,
Dean
 
Thanks for the answer.
... so it's probably not a good idea to do something like this.. :)
Is it a correct interpretation that above you are referring to the "closely tied together via some kind of GPU/CPU synchronisation method - decompressing into a smaller buffer area directly before use" case, and that decompression into an LRU cache is the more viable option in the future?

Sorry, I found your post a little ambiguous with regard to this. :smile:
 
I think you are right about this. Some hints are probably added by the compiler as well, and some branches are transformed into conditional assignments. And when the SPU takes a branch penalty it is much cheaper than on an ordinary CPU anyway. DeanoC has already stated that this is a non-issue for them; I think that pretty much sums it up.

I think many people equate branch prediction with dynamic branch prediction. But the SPE does have hardware support for branch prediction - just not dynamic prediction. Sure, this will be a feature mostly used by compilers, but it can be used by a programmer as well. That, or maybe I just misunderstand, and most people will assume that branch prediction is automatically dynamic.

IBM said:
Branch Optimization. The SPE's hardware has no dynamic branch prediction but has a special branch hint instruction, which indicates likely taken branches.
(italics mine)
http://domino.research.ibm.com/comm/research_projects.nsf/pages/cellcompiler.spe.html
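For what it's worth, the usual way to feed such hints from C/C++ is GCC's __builtin_expect, which an SPU-targeting compiler can lower to the hbr family of hint-for-branch instructions the IBM quote mentions. A small sketch (function names invented):

```cpp
// Portable spelling of a static branch hint under GCC-style compilers.
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

int countErrors(const float* data, int n) {
    int errors = 0;
    for (int i = 0; i < n; ++i) {
        // Hint that the error path is rare, keeping the common case on
        // the fall-through (no-penalty) path.
        if (UNLIKELY(data[i] < 0.0f))
            ++errors;
    }
    return errors;
}

// Branches this small are often removed entirely: compilers can turn
// 'x = cond ? a : b;' into a select (the SPU's selb), with no branch.
float selectExample(float cond, float a, float b) {
    return (cond < 0.0f) ? a : b;   // candidate for if-conversion
}
```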

Anyway, this has probably already been posted, but I came across it just now so here it is again just in case. ;)

Maximizing the power of the Cell Broadband Engine processor: 25 tips to optimal application performance
http://www-128.ibm.com/developerworks/power/library/pa-celltips1/
 
Sorry, I found your post a little ambiguous with regard to this. :smile:
Heh.. Yes, I meant that preemptively decompressing into some kind of LRU cache is going to be the best way of achieving this. The unknown (to me, at least) is how fast SPU decompression of wavelet (or other) compressed images would be - and hence how many frames latency would be required to decompress a useful number of textures prior to them being ready for use by the GPU.
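A sketch of what "frames of latency" might look like in practice - the latency constant is a pure assumption, and a real engine would wait on an SPU completion fence rather than counting frames:

```cpp
#include <cstdint>
#include <deque>

// Assumed headroom; the real value depends on measured SPU decode speed.
constexpr int kDecodeLatencyFrames = 2;

struct DecodeRequest { uint32_t textureId; int frameIssued; };
std::deque<DecodeRequest> g_inFlight;

void issueDecode(uint32_t textureId, int currentFrame) {
    // Kick the SPU job now; the result is consumed several frames later.
    g_inFlight.push_back({ textureId, currentFrame });
}

bool mayBindTexture(const DecodeRequest& r, int currentFrame) {
    // Conservative: only trust the texture after the latency window.
    return currentFrame - r.frameIssued >= kDecodeLatencyFrames;
}
```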

Sorry if I wasn't clear in my original reply..

Cheers,
Dean
 
That's correct. You basically save two 4X (or 2X) render targets (colour and depth), and only need a resolved 1X framebuffer - unless you also need to resolve the depth buffer for post-processing, which is common.

As a layman:

IMHO, I would be quite surprised if the Xenos eDRAM chip did not contain a nicely featured 2D/3D graphics unit to do a lot of post-processing for free (it needs only a few million transistors). I would not be surprised at all if we hear in the future that the eDRAM chip contains even a little bit of Talisman voodoo (the image layer compositor) and that the X360 has an enhanced compositing DAC.


Manfred
 
The unknown (to me, at least) is how fast SPU decompression of wavelet (or other) compressed images would be - and hence how many frames latency would be required to decompress a useful number of textures prior to them being ready for use by the GPU.

IME this would largely depend on how well you can get your entropy decoding scheme to run on the SPU; all the other elements of any form of transform coding should work very well on the SPU (as long as your data is tiled appropriately for the size of your local storage). Which reminds me that I still wanted to work on decompressing directly into DXTn...
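As a concrete (and deliberately naive) illustration of that last idea, packing one reconstructed 4x4 tile straight into a DXT1 block might look like this - endpoint selection by min/max luma is crude, and real encoders fit endpoints far more carefully:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>

struct Dxt1Block { uint16_t c0, c1; uint32_t indices; };

static uint16_t toRgb565(const uint8_t* p) {
    return uint16_t(((p[0] >> 3) << 11) | ((p[1] >> 2) << 5) | (p[2] >> 3));
}

Dxt1Block packDxt1(const uint8_t rgb[16][3]) {
    auto luma = [&](int i) { return rgb[i][0] + rgb[i][1] + rgb[i][2]; };

    // Brightest and darkest texels become the two endpoints.
    int lo = 0, hi = 0;
    for (int i = 1; i < 16; ++i) {
        if (luma(i) < luma(lo)) lo = i;
        if (luma(i) > luma(hi)) hi = i;
    }
    Dxt1Block b{ toRgb565(rgb[hi]), toRgb565(rgb[lo]), 0 };
    int l0 = luma(hi), l1 = luma(lo);
    if (b.c0 <= b.c1) {                 // c0 > c1 selects 4-colour mode
        std::swap(b.c0, b.c1);
        std::swap(l0, l1);
    }
    // DXT1 4-colour palette; matching in luma space keeps this short.
    const int pal[4] = { l0, l1, (2 * l0 + l1) / 3, (l0 + 2 * l1) / 3 };
    for (int i = 0; i < 16; ++i) {
        int best = 0, bestErr = 1 << 30;
        for (int p = 0; p < 4; ++p) {
            int err = std::abs(luma(i) - pal[p]);
            if (err < bestErr) { bestErr = err; best = p; }
        }
        b.indices |= uint32_t(best) << (2 * i);  // 2 bits per texel
    }
    return b;
}
```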
 
<Keanu Reeves>Woaaa</Keanu Reeves>

Great thread!! Lots of info, lots of dev speaking.. nice!
Continue as you were please... just wanted to write my appreciation..

toodeloo..

:cool:
 
Does the SPU job scheduler reside on one of the OS reserved SPUs?
I think it is a very small kernel on the SPUs themselves that accesses a shared job-list in main memory to extract (and create new) jobs. At least that's how I would do it... ;)
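A conceptual sketch of such a resident micro-kernel, using std::atomic purely for illustration - real Cell code would use the SPU's lock-line reservation primitives (getllar/putllc) for the atomic list operations and DMA job code/data into local store, and none of this is the actual Sony scheduler API:

```cpp
#include <atomic>
#include <cstdint>

struct Job {
    void   (*run)(uint64_t param);  // code to execute (would be DMA'd in)
    uint64_t param;                 // e.g. effective address of job data
    Job*     next;                  // singly-linked job list
};

// Head of the job list, shared by all SPU kernels via main memory.
std::atomic<Job*> g_jobList{nullptr};

// Lock-free pop: retry until we atomically swing the head pointer.
Job* popJob() {
    Job* head = g_jobList.load(std::memory_order_acquire);
    while (head &&
           !g_jobList.compare_exchange_weak(head, head->next,
                                            std::memory_order_acq_rel))
        ; // another SPU won the race; 'head' was reloaded, try again
    return head;
}

// The resident "kernel": loop forever, executing whatever is queued.
void spuKernelMain() {
    for (;;) {
        if (Job* job = popJob())
            job->run(job->param);   // jobs may push new jobs themselves
        // else: idle / wait for a signal that new work has arrived
    }
}
```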
 
On the topic of platforms: when we straight-ported our first PS3 engine to X360, we got much better performance there. Then, becoming PS3 exclusive, we did a lot of rethinking and tuning (with up-to-date kits and libs), and it's now a lot better than the X360's...
This is normal; when you have the opportunity, you play to the strengths.

Most multiplatform titles can't get that dedicated tailoring pass, so both versions will compromise. Ports, depending on $$ pressure, will even more.

WHAT?!?!?! You make games on the PS3?!?!?! :oops: I thought you just played games, not made them.
 
[maven] said:
IME this would largely depend on how well you can get your entropy decoding scheme to run on the SPU; all the other elements of any form of transform coding should work very well on the SPU (as long as your data is tiled appropriately for the size of your local storage). Which reminds me that I still wanted to work on decompressing directly into DXTn...
What's Carmack's MegaTexture doing?...

Jawed
 
On the memory front regarding the PS3, am I the only person who feels that it will only be a problem with multiplat games? Any developer with a very good history of PS2 development should have good experience of using memory efficiently.

Also, this has to be one of the best and most informative threads I've read in a while. Thanks to everyone for their input, and keep it coming :)
 
I think you are right about this. Some hints are probably added by the compiler as well, and some branches are transformed into conditional assignments. And when the SPU takes a branch penalty it is much cheaper than on an ordinary CPU anyway. DeanoC has already stated that this is a non-issue for them; I think that pretty much sums it up.

I think what was actually said was that cache misses for in-order CPUs tend to be more important than branch mispredictions. In-orders are often more limited by data latency (something the local store is good at keeping low for the SPE target workloads) than by instruction load latency.

The wording didn't read to me that branches are cheaper on the SPEs, just that other costs tend to obscure them.
 
Maybe it's just me - but I could swear this thread demonstrates pretty clearly that working well with RSX is not exactly clear to many people out there (developers and general public alike). In fact, a lot of the discussion sounds eerily reminiscent of 6 years ago.

Yeah, déjà vu! Now can we safely assume the RSX isn't as close to its G7x brethren as most here were led to believe (and no, I am not trying to imply it's got anything G8x)?

I mean, even if there were graphics features stripped from it, that still makes my comment valid. If so, there must be some good reasoning behind these decisions.

Has it been officially confirmed that the RSX die-size difference was due to redundancy, or is it still debatable?
 
On the memory front regarding the PS3, am I the only person who feels that it will only be a problem with multiplat games? Any developer with a very good history of PS2 development should have good experience of using memory efficiently.
Lack of memory affects all titles. It's not like PS2 multiplatform titles had low-quality textures while exclusives had high-quality textures. The platform was known for poor texturing because of lack of memory (and lack of texture compression) and you can't get round that. Or to illustrate another way, imagine PS3 had 128 MB RAM versus XB360's 512. You wouldn't expect the same quality assets on PS3 as XB360 then just because they're 1st party exclusives, would you? Exclusives may benefit from better workarounds, but if RAM is limiting texturing on multiplatform titles, it'll limit it on exclusives too.

Also, I'm not sure about your choice of words, "only be a problem with multiplat games", when these are likely to make up some 90% of the games!
 
I think what was actually said was that cache misses for in-order CPUs tend to be more important than branch mispredictions. In-orders are often more limited by data latency (something the local store is good at keeping low for the SPE target workloads) than by instruction load latency.

The wording didn't read to me that branches are cheaper on the SPEs, just that other costs tend to obscure them.

Yeah, you are correct; the quote concerned "variable memory access patterns", meaning plenty of cache misses. I was actually thinking of branch misses to non-cached code when I wrote that, but failed to describe it. Actually, the 50+ cycle number suggested by DeanoC sounds a bit low for access to main RAM, and still too high to be the latency figure of the level 2 cache. Maybe some NDA margins in there? :smile:
 
Lack of memory affects all titles. It's not like PS2 multiplatform titles had low-quality textures while exclusives had high-quality textures. The platform was known for poor texturing because of lack of memory (and lack of texture compression) and you can't get round that. Or to illustrate another way, imagine PS3 had 128 MB RAM versus XB360's 512. You wouldn't expect the same quality assets on PS3 as XB360 then just because they're 1st party exclusives, would you? Exclusives may benefit from better workarounds, but if RAM is limiting texturing on multiplatform titles, it'll limit it on exclusives too.

Also, I'm not sure about your choice of words, "only be a problem with multiplat games", when these are likely to make up some 90% of the games!

Excellent points in that reply, Shifty. Just out of curiosity, do you know how the PS3 and 360 stack up in terms of the compression types available to them?
 
Lack of memory affects all titles. It's not like PS2 multiplatform titles had low-quality textures while exclusives had high-quality textures.
That is partly true, but the PS2 had 32 MB of memory and the original Xbox had 64 MB - a factor of 2x - and the Xbox had a faster CPU as well.

Now it is 512 MB (minus ~60 MB?) vs 512 MB (minus ~20 MB?), with differences in memory organisation, pluses and minuses in bus speeds, and 7 usable CPUs vs 3 cores. (There is also the possibility Sony could re-evaluate the OS footprint needed while games are running, and future titles might find that from release 1.xx onwards they have more memory. A firmware upgrade could be enforced, as it is on PSP, by bundling it with the game.)

So it would be impossible to say what this all boils down to, even in multi-platform titles. But if you're expecting the same perceived texture disparity this generation as with PS2/Xbox, then the spec differences - even in memory alone - wouldn't support such a view.
 