Are PS3 devs using the two mem pools for textures?

Shifty Geezer · Jul 4, 2007

nAo said:
If you are a good GPU architect you will design a GPU that will cope with your whole system latency, if you don't..well..they should fire you!

Ahhh, but what if you're a good GPU architect who's designed a GPU for a fast local memory bus, and then your GPU is shoe-horned into a split memory pool system it wasn't designed for? RSX is mostly regarded as a G70 'thing' put in PS3, without any real design work done, and I suppose most people think of it in such terms - its ability to hide latency must be similar to the 7800's, which is based on fast local VRAM and no reaching across to find XDR.

Jawed · Jul 4, 2007

RSX is dramatically "better" at hiding latency than G7x, for what it's worth. It's as tolerant of PS3 system RAM as G7x is tolerant of GDDR.

Jawed

AlNom · Jul 4, 2007

Ah, I am beginning to see it in a different light now. Thank you for putting up with me.

Titanio · Jul 4, 2007

Jawed said:
Jawed

Ah, thanks..

I remember this being explained to me some time ago, but I'd forgotten all about it. Thanks again..

nAo · Jul 5, 2007

Shifty Geezer said:
Ahhh, but what if you're a good GPU architect who's designed a GPU for a fast local memory bus, and then your GPU is shoe-horned into a split memory pool system it wasn't designed for? RSX is mostly regarded as a G70 'thing' put in PS3, without any real design work done, and I suppose most people think of it in such terms - its ability to hide latency must be similar to the 7800's, which is based on fast local VRAM and no reaching across to find XDR.

IF RSX simply were a 7800 you'd still have to heavily redesign parts of it. I mean..PCIE and FLEXIO are not very similar, so it would have needed real design work to be done anyway.

Nesh · Jul 5, 2007

Jawed said:
Jawed

is that specifically for RSX?

Jesus2006 · Jul 5, 2007

nAo said:
IF RSX simply were a 7800 you'd still have to heavily redesign parts of it. I mean..PCIE and FLEXIO are not very similar, so it would have needed real design work to be done anyway.

Thanks! That bolded 'if' pretty much cleared the issue :smile:

Nesh said:
is that specifically for RSX?

I guess not, otherwise it'd be a broken NDA...

Shifty Geezer · Jul 5, 2007

Jesus2006 said:
I guess not, otherwise it'd be a broken NDA...

Jawed isn't under an NDA. That slide could be from the leaked SDK.

DeanA · Jul 5, 2007

Shifty Geezer said:
Jawed isn't under an NDA. That slide could be from the leaked SDK.

Well, it's not a public slide, that's for sure.

Dean

Panajev2001a · Jul 5, 2007

DeanA said:
Well, it's not a public slide, that's for sure.

Dean

Well, you are only saying that because it says "proprietary and confidential" on it...

uhm...

0%...15%...100% thinking done...

I think I see your point


almighty · Jul 5, 2007

So I take it the latency isn't as bad as first thought then? Coolio

Titanio · Jul 5, 2007

Nesh said:
is that specifically for RSX?

From what I've heard, I believe it is..

Basically you can increase your tolerance to latency - be it from GDDR or XDR - by increasing the number of threads, at the expense of per-thread resources.

I guess given the extra constraints on thread resources, the programmer has control over this. If you know you'll only be texturing from GDDR you can use this number of registers per thread. If you're going to be using XDR, you can use this lower number, and let the GPU increase the number of threads in flight such that if a thread is stalled waiting on its texture data, it's more likely there'll be another ready thread to switch to while the other waits.

I wonder if there'd be value in having more register-light(er) threads even if just texturing from GDDR.

It's not really about the latency not being as bad as thought..that latency is still there, but there are tools to mitigate its impact as above and keep the GPU busy.
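
The thread-count trade-off above can be sketched roughly like this. It is only a back-of-the-envelope model: the function names and every number (a 1024-entry register file, 2 ALU cycles of work per thread) are made up for illustration and are not RSX or G7x figures.

```python
def threads_in_flight(register_file_size: int, registers_per_thread: int) -> int:
    """Threads that fit when one fixed register file is split evenly between them."""
    if registers_per_thread <= 0:
        raise ValueError("registers_per_thread must be positive")
    return register_file_size // registers_per_thread


def hidden_latency_cycles(threads: int, work_cycles_per_thread: int) -> int:
    """Cycles of fetch latency covered by the *other* threads' ALU work
    while one thread waits on its texture data."""
    return max(threads - 1, 0) * work_cycles_per_thread


# Hypothetical numbers, NOT real hardware figures.
REGISTER_FILE = 1024
fat_threads = threads_in_flight(REGISTER_FILE, 8)   # more registers per thread
thin_threads = threads_in_flight(REGISTER_FILE, 4)  # fewer registers per thread
```

With these invented numbers, halving the registers per thread takes you from 128 to 256 threads in flight, and the latency the other threads can cover goes from 254 to 510 cycles. The latency itself does not change, only how much of it there is other work to overlap with.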

Jawed · Jul 5, 2007

That slide is RSX specific. G7x has half these capabilities.

Jawed

Crossbar · Jul 5, 2007

Titanio said:
From what I've heard, I believe it is..

Basically you can increase your tolerance to latency - be it from GDDR or XDR - by increasing the number of threads, at the expense of per-thread resources.

I guess given the extra constraints on thread resources, the programmer has control over this. If you know you'll only be texturing from GDDR you can use this number of registers per thread. If you're going to be using XDR, you can use this lower number, and let the GPU increase the number of threads in flight such that if a thread is stalled waiting on its texture data, it's more likely there'll be another ready thread to switch to while the other waits.

So it is a kind of hyper-thread solution where the ALU-resources are shared but not the registers?

Jawed said:
That slide is RSX specific. G7x has half these capabilities.

How much impact on the transistor count could such an extension imply with regard to the shader implementation?

inefficient · Jul 5, 2007

Crossbar said:
So it is a kind of hyper-thread solution where the ALU-resources are shared but not the registers?

SMT/Hyper threading: 2 threads, 2 sets of registers, 1 set of ALUs.

This solution: 2 threads, 1 set of registers divided in half, 1 set of ALUs.

The trick quite simply seems to be to double buffer reads from memory to hide latency. It appears to hide latency very well, but at the cost of registers. But I assume the consequence is that you should stick to simpler shaders when texturing from XDR.
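
A toy simulation of that idea, several threads sharing one set of ALUs and switching whenever one is waiting on memory, might look like the following. This is purely an illustrative model: the scheduling policy (pick the first ready thread) and all parameters are assumptions, not a description of how RSX actually schedules work.

```python
def alu_utilisation(num_threads: int, alu_cycles: int, fetch_latency: int,
                    total_cycles: int = 10_000) -> float:
    """Fraction of cycles the shared ALU is busy.

    Each thread repeats: `alu_cycles` of ALU work, then a memory fetch
    that makes it unavailable for `fetch_latency` cycles.
    """
    ready_at = [0] * num_threads            # cycle at which each thread can run again
    remaining = [alu_cycles] * num_threads  # ALU cycles left before its next fetch
    busy = 0
    for cycle in range(total_cycles):
        # Hypothetical policy: run the first thread that is not waiting on memory.
        runnable = next((t for t in range(num_threads) if ready_at[t] <= cycle), None)
        if runnable is None:
            continue                        # every thread is stalled: ALU idles
        busy += 1
        remaining[runnable] -= 1
        if remaining[runnable] == 0:        # work done, issue the next fetch
            ready_at[runnable] = cycle + 1 + fetch_latency
            remaining[runnable] = alu_cycles
    return busy / total_cycles
```

With 4 ALU cycles per fetch and a 12-cycle fetch, one thread keeps the ALU 25% busy, two threads 50%, and four threads 100%. Each extra thread needs its own slice of the register file, which is where the register cost comes from.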

This must also mean the RSX has its own MMU and can DMA memory directly from XDR to itself. Or maybe I am reading too much into this. But imagine if the RSX MMU could DMA data directly from the SPU Local Stores!

Love_In_Rio · Jul 5, 2007

Jawed said:
RSX is dramatically "better" at hiding latency than G7x, for what it's worth. It's as tolerant of PS3 system RAM as G7x is tolerant of GDDR.

Jawed

And so RSX is twice as good as G7x at hiding latency from GDDR3? Anyway, doesn't the disparity in latency tolerance between the two pools force you either to avoid complex shaders altogether, or to lose several frames when using them with both pools for texturing? It must be a headache to manage which pool to use depending on the shader intensity.

P.S.: prize for the slide of the month! Do you have more???

Titanio · Jul 5, 2007

inefficient said:
The trick quite simply seems to be to double buffer reads from memory to hide latency. It appears to hide latency very well, but at the cost of registers. But I assume the consequence is that you should stick to simpler shaders when texturing from XDR.

I think we should more specifically say 'lower-register-using shaders'

Complexity is a more..complicated thing than the number of registers you're using. I've read papers where the author stepped through the process of optimising out registers in their shaders, and I can assure you, they weren't simplifying anything

Of course, if the compiler does not do this adequately, automatically, it is more work for the developer to make manual optimisations like this. And of course, some computation can have register usage more easily reduced without affecting the final result than others.

Jawed · Jul 5, 2007

Love_In_Rio said:
And so RSX is twice as good as G7x at hiding latency from GDDR3?

It's solely about the complexity of a shader: how many registers need to be allocated in the register file. RSX is kinder on PS3 devs in this respect than G7x is. As shaders get longer they tend to use more registers. But a good compiler (or programmer writing low-level code) can "re-use" registers.

As that slide says, using "half" registers whenever possible is also a good strategy.
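
To illustrate what "re-using" registers means, here is a minimal sketch that estimates the peak register demand of a straight-line shader from when each value is last read. The instruction format and the example shader are invented, each value is assumed to be written exactly once, and counting a "half" value as half a register is an assumption for illustration rather than a confirmed RSX detail.

```python
def peak_register_demand(instructions, half_precision=frozenset()):
    """instructions: list of (dest, [sources]) in program order.

    Returns the peak number of registers live at once, assuming a register
    is freed right after the last instruction that reads its value.
    Conservative: a destination never shares a register with a source
    that dies in the same instruction.
    """
    last_use = {}
    for index, (_, sources) in enumerate(instructions):
        for src in sources:
            last_use[src] = index

    live, peak = set(), 0.0
    for index, (dest, sources) in enumerate(instructions):
        live.add(dest)
        cost = sum(0.5 if v in half_precision else 1.0 for v in live)
        peak = max(peak, cost)
        # Free every source whose final read was this instruction.
        live -= {s for s in sources if last_use[s] == index}
        if dest not in last_use:            # result is never read again
            live.discard(dest)
    return peak


# Hypothetical shader: e = (tex0 * tex1) + tex2
shader = [("a", []), ("b", []), ("c", ["a", "b"]), ("d", []), ("e", ["c", "d"])]
```

The example names five values, but `peak_register_demand(shader)` is 3.0 because `a` and `b` are dead once `c` is computed. Marking `a`, `b`, `c` and `d` as half precision brings it down to 2.0 under the half-cost assumption.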

Anyway, doesn't the disparity in latency tolerance between the two pools force you either to avoid complex shaders altogether, or to lose several frames when using them with both pools for texturing? It must be a headache to manage which pool to use depending on the shader intensity.

That's why game devs are so highly paid and respected... At the same time, imagine the freedom they have knowing they're not programming a PC.

Jawed

Jawed · Jul 5, 2007

Crossbar said:
How much impact on the transistor count could such an extension imply with regard to the shader implementation?

As a minimum it's going to double the register file size. Nothing monstrous as far as RSX, overall, is concerned, less than 5% extra die, or less than 10% extra transistors, I suspect.

Jawed

archangelmorph · Jul 5, 2007

Jawed said:
That's why game devs are so highly paid and respected... At the same time, imagine the freedom they have knowing they're not programming a PC.

Jawed

Highly paid??

What devs do YOU know??

Let me know where they work and I'll drop my CV at the reception..
