Are PS3 devs using the two mem pools for textures?

deathkiller · Jul 6, 2007

Is the RSX capable of transferring VRAM data directly to the local store memory of a SPU?

Shifty Geezer · Jul 6, 2007

Yes, in theory, as I understand it. But that's an area I don't recall getting real clarification on. I don't know the ins and outs of how the memory is configured and used, e.g. can the RSX see the SPUs' local stores as memory addresses? Still, the bi-directional Cell<>RSX interface wouldn't be much use without being able to store values from RSX somewhere for Cell to use!

inefficient · Jul 6, 2007

betan said:
While I agree with the memory/speed tradeoff in general (i.e. not only for latency hiding et al.), that doesn't change the fact that CPU<-VRAM (read) bandwidth may be valuable, as that was the discussion.

It is not like "Doesn't make sense or you don't need it because you have to have a trade off".

When you are writing to memory from the CPU you want to write to the fastest memory possible. Remember the Cell runs at 3.2 GHz and the RSX at only 500 MHz. You'd be wasting 6.4x as many CPU cycles compared to the GPU when you are stalled on IO.

That's why it makes no sense to store your results in VRAM. Store them in XDR, and let the GPU fetch them from there. You would be wasting your CPU time otherwise.

And it's not like the GPU has to make a duplicate "copy" of the data over to VRAM. It can natively read and write to the system RAM with only a small penalty.
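
For reference, the 6.4x figure in the post above is simply the ratio of the two quoted clock rates. A minimal sketch of that arithmetic, assuming the 3.2 GHz and 500 MHz figures as stated (the function and constant names are illustrative only):

```python
def cycle_ratio(fast_hz: float, slow_hz: float) -> float:
    """Return how many cycles of the faster clock elapse per cycle of the slower one."""
    return fast_hz / slow_hz


CELL_HZ = 3.2e9  # Cell clock as quoted in the post
RSX_HZ = 500e6   # RSX clock as quoted in the post

ratio = cycle_ratio(CELL_HZ, RSX_HZ)
print(f"Cell cycles per RSX cycle: {ratio:.1f}")
```

This only compares clock rates; it says nothing about how long a given stall actually lasts on either chip.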

Jesus2006 · Jul 6, 2007

inefficient said:
And it's not like the GPU has to make a duplicate "copy" of the data over to VRAM. It can natively read and write to the system RAM with only a small penalty.

Is it that easy? I remember some multiplatform devs saying that they'd have to cut down texture resolutions on PS3 because the system only allows for 256MB of VRAM.

Was that something that has dramatically changed by newer SDKs/firmware updates or is it just an example of "lazy devs" (or is texturing from XDR a totally different thing?)?

inefficient · Jul 6, 2007

Jesus2006 said:
Is it that easy? I remember some multiplatform devs saying that they'd have to cut down texture resolutions on PS3 because the system only allows for 256MB of VRAM.

Was that something that has dramatically changed by newer SDKs/firmware updates or is it just an example of "lazy devs" (or is texturing from XDR a totally different thing?)?

You have somehow accidentally fallen into a thread that is discussing that very subject! Start on page 1 and don't stop reading until you get your answer.

Jesus2006 · Jul 6, 2007

inefficient said:
You have somehow accidentally fallen into a thread that is discussing that very subject! Start on page 1 and don't stop reading until you get your answer.

I did actually read the thread, but my question was more specifically about texturing, as opposed to "accessing XDR" for whatever other reason (e.g. getting preprocessed data from Cell), which is what was discussed here.

Are there any constraints specifically for texturing from XDR like those devs claimed (think it was Ubisoft/SplinterCell)?

Arwin · Jul 6, 2007

Jesus2006 said:
I did actually read the thread

If you did, you may have seen a certain slide that answers your question below rather precisely?

Are there any constraints specifically for texturing from XDR like those devs claimed (think it was Ubisoft/SplinterCell)?

Gradthrawn · Jul 6, 2007

Arwin said:
If you did, you may have seen a certain slide that answers your question below rather precisely?

Awww, come on, just re-link the slide for him.

Jawed said:
[slide image not preserved]

But I think he wants more than raw data that the slide shows. I think he may want context and examples.

betan · Jul 6, 2007

inefficient said:
When you are writing to memory from the CPU you want to write to the fastest memory possible. Remember the Cell runs at 3.2 GHz and the RSX at only 500 MHz. You'd be wasting 6.4x as many CPU cycles compared to the GPU when you are stalled on IO.

That's why it makes no sense to store your results in VRAM. Store them in XDR, and let the GPU fetch them from there. You would be wasting your CPU time otherwise.

The data does have to go to VRAM eventually. The clock speed difference between the two clock domains is irrelevant in general (since we are talking about the importance of bandwidth); a data transfer to the slower domain doesn't have to be slower.

Normally when you are writing to memory you don't have to stall until the write completes, even though the memory write is slow: you write to a writeback buffer and do whatever you want until the buffer is full and you still want to write. That is latency hiding, after all.

So even if you don't have plenty of SPU cycles sitting around as it seems currently, your SPU should be able to do something else instead of stalling.

And it's not like the GPU has to make a duplicate "copy" of the data over to VRAM. It can natively read and write to the system RAM with only a small penalty.

Then you need to sync the SPUs with the RSX over smaller pieces of data, not to mention the unnecessary main memory access from the SPU.
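
The writeback-buffer argument in the post above can be illustrated with a toy model. This is only a sketch under made-up assumptions (one write attempted per cycle, one buffer entry drained every `drain_cycles` cycles); it does not model the actual SPU, Cell, or RSX hardware, and every name and parameter is hypothetical:

```python
def simulate_writes(num_writes: int, buffer_slots: int, drain_cycles: int) -> int:
    """Return the number of cycles a writer stalls while issuing num_writes writes.

    The writer tries to issue one write per cycle. The buffer retires one
    entry every drain_cycles cycles. The writer stalls only when it wants
    to write and the buffer is already full.
    """
    buffered = 0      # entries currently waiting in the buffer
    stall_cycles = 0  # cycles in which the writer could not issue
    issued = 0
    cycle = 0
    while issued < num_writes:
        # Slow memory retires one buffered entry on every drain_cycles-th cycle.
        if cycle > 0 and cycle % drain_cycles == 0 and buffered > 0:
            buffered -= 1
        if buffered < buffer_slots:
            buffered += 1
            issued += 1
        else:
            stall_cycles += 1
        cycle += 1
    return stall_cycles


# A short burst fits in a large buffer; a single-slot buffer exposes the latency.
print("8-slot buffer stalls:", simulate_writes(4, 8, 10))
print("1-slot buffer stalls:", simulate_writes(4, 1, 10))
```

In this model a burst that fits in the buffer never stalls the writer, while a burst longer than the buffer ends up paced by the drain rate, which is the condition the post describes.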

nAo · Jul 6, 2007

betan said:
The data does have to go to VRAM eventually.

No, it doesn't

betan · Jul 6, 2007

nAo said:
No, it doesn't

Somewhat off topic but don't you have to keep your front framebuffer in VRAM?
That would be nice.

Terarrim · Jul 6, 2007

Would the fact that the RSX has a cache (I read this in a dev interview somewhere) mean that the latency of reads/writes to XDR, or of Cell writes to RSX, could be hidden?

i.e. while the RSX is processing its task, it is also doing DMA or receiving instructions from the SPEs into its cache, ready to be acted upon. (I guess this would depend on how big the cache is, though.)

If anyone is interested in confirmation of the cache in the RSX, I will see if I can find the article in which it was stated.

Squeak · Jul 6, 2007

Terarrim said:
If anyone is interested in confirmation of the cache in the RSX, I will see if I can find the article in which it was stated.

The cache is a given; what's interesting is how much and what kind. I doubt this is written in any public article.

By the way if anyone should still be interested, the NV2a's (xbox 1) texture cache is 128Kb Lv2 and 8Kb Lv1. And my source is secret but reliable.

nAo · Jul 6, 2007

Squeak said:
By the way if anyone should still be interested, the NV2a's (xbox 1) texture cache is 128Kb Lv2 and 8Kb Lv1. And my source is secret but reliable.

Your source as reliable as this man

Heinrich4 · Jul 7, 2007

Terarrim said:
Would the fact that the RSX has a cache (I read this in a dev interview somewhere) mean that the latency of reads/writes to XDR, or of Cell writes to RSX, could be hidden?

i.e. while the RSX is processing its task, it is also doing DMA or receiving instructions from the SPEs into its cache, ready to be acted upon. (I guess this would depend on how big the cache is, though.)

If anyone is interested in confirmation of the cache in the RSX, I will see if I can find the article in which it was stated.

I saw this some time ago (July 2006) from a guy who had been exhaustively seeking information on what the RSX would be:

RSX

Core Frequency - 500 MHz
Memory Frequency - 650 MHz
Bus Size - 128-bit
Pixel Shaders - 24
Vertex Shaders - 8
ROPS - 8
Total Texture Cache Per Quad of Pixel Pipes (L1 & L2) - 96KB
Post Transform & Lighting Cache - 63 Max Vertices
*A few extra shader instructions - Extra Texture Lookup Logic & Fast Vector Normalize
*FLEX IO interface to CPU (Much Faster)

7800GTX

Core Frequency - 430 MHz
Bus Size - 256-bit
Memory Frequency - 600 MHz
Pixel Shaders - 24
Vertex Shaders - 8
ROPS - 16
Total Texture Cache Per Quad of Pixel Pipes (L1 & L2) - 48KB
Post Transform & Lighting Cache - 45 Max Vertices
PCI BUS interface to CPU (Much Slower)

NOTES: About the RSX.

Total Texture Cache Per Quad of Pixel Pipes - (L1 and L2) 96KB total Texture Cache - Previously 48K
(L1 only available to Pixel Shaders)

Post Transform and Lighting Vertex Cache - 63 Max Vertices - Previously 45 Vertices
(Cache located after Vertex Shader and before the triangle setup and before the Rasterizer.)
Vertex shader --> Post Transform and Lighting Vertex Cache 63MAX ---> Triangle Setup
Texture Lookup Logic to help RSX transport data from XDR

(I have heard these specs from other places too, including here in this forum.)

Panajev2001a · Jul 7, 2007

nAo said:
Your source as reliable as this man

I see, so we should expect nVIDIA to abolish the penalties on texture look-ups from XDR, yes you have heard it... no penalty on texture look-ups.

Terarrim · Jul 7, 2007

Does anyone have any idea what "multi-way programmable pipelines" are (mentioned at E3 2005 in regard to RSX)?

Xenon · Jul 7, 2007

As far as the RSX is concerned, memory is contiguous.

Xenon · Jul 7, 2007

Heinrich4 said:
I saw this some time ago (July 2006) from a guy who had been exhaustively seeking information on what the RSX would be:

RSX

Core Frequency - 500 MHz
Memory Frequency - 650 MHz
Bus Size - 128-bit
Pixel Shaders - 24
Vertex Shaders - 8
ROPS - 8
Total Texture Cache Per Quad of Pixel Pipes (L1 & L2) - 96KB
Post Transform & Lighting Cache - 63 Max Vertices
*A few extra shader instructions - Extra Texture Lookup Logic & Fast Vector Normalize
*FLEX IO interface to CPU (Much Faster)

7800GTX

Core Frequency - 430 MHz
Bus Size - 256-bit
Memory Frequency - 600 MHz
Pixel Shaders - 24
Vertex Shaders - 8
ROPS - 16
Total Texture Cache Per Quad of Pixel Pipes (L1 & L2) - 48KB
Post Transform & Lighting Cache - 45 Max Vertices
PCI BUS interface to CPU (Much Slower)

NOTES: About the RSX.

Total Texture Cache Per Quad of Pixel Pipes - (L1 and L2) 96KB total Texture Cache - Previously 48K
(L1 only available to Pixel Shaders)

Post Transform and Lighting Vertex Cache - 63 Max Vertices - Previously 45 Vertices
(Cache located after Vertex Shader and before the triangle setup and before the Rasterizer.)
Vertex shader --> Post Transform and Lighting Vertex Cache 63MAX ---> Triangle Setup
Texture Lookup Logic to help RSX transport data from XDR

(I have heard these specs from other places too, including here in this forum.)

Your RSX specs are incorrect.

SPM · Jul 7, 2007

Does anyone know why Cell is limited to 16 MB/s when reading from GDDR? Is there some advantage in having a one-way data path? Is it because the data transfer rates from Cell are throttled to ensure that RSX memory access is not interfered with, and because reading from GDDR (16 MB/s) is rarely required, whereas writing to GDDR (4 GB/s) is much more useful?
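
To put the two figures in the question above side by side, here is a minimal sketch of how long a transfer would take at each rate. It assumes the 16 MB/s and 4 GB/s numbers as quoted, decimal units (1 MB = 10^6 bytes), and a hypothetical 1 MB payload; the function name is illustrative only:

```python
def transfer_seconds(num_bytes: float, bytes_per_second: float) -> float:
    """Return the time in seconds to move num_bytes at a sustained rate."""
    return num_bytes / bytes_per_second


READ_BPS = 16e6   # Cell reading from GDDR, as quoted (decimal MB assumed)
WRITE_BPS = 4e9   # Cell writing to GDDR, as quoted (decimal GB assumed)
PAYLOAD = 1e6     # hypothetical 1 MB payload

read_ms = transfer_seconds(PAYLOAD, READ_BPS) * 1000
write_ms = transfer_seconds(PAYLOAD, WRITE_BPS) * 1000
print(f"read:  {read_ms:.2f} ms")
print(f"write: {write_ms:.2f} ms")
print(f"write/read bandwidth ratio: {WRITE_BPS / READ_BPS:.0f}x")
```

The calculation assumes sustained peak rates and ignores setup latency, so it only shows the scale of the asymmetry, not real transfer times.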
