Are PS3 devs using the two mem pools for textures?

Yes, in theory, as I understand it. But that's an area I don't recall getting real clarification on. I don't know the ins and outs of how the memory is configured and used. E.g., can the RSX see the SPEs' local stores (LS) as memory addresses? Still, the bidirectional Cell<>RSX interface wouldn't be much use without being able to store values from RSX somewhere for Cell to use!
 
While I agree with the memory/speed tradeoff in general (i.e. not only for latency hiding et al.), that doesn't change the fact that CPU<-VRAM (read) bandwidth may be valuable, which is what the discussion was about.

It's not a case of "it doesn't make sense, or you don't need it, because you have to accept a tradeoff".

When you are writing to memory from the CPU, you want to write to the fastest memory possible. Remember the Cell runs at 3.2 GHz and the RSX at only 500 MHz. You'd be wasting 6.4x as many CPU cycles as GPU cycles when you are stalled on I/O.
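
To put a number on the 6.4x figure: a stall of a given wall-clock duration costs each chip a number of cycles proportional to its clock rate. A minimal sketch, assuming the 3.2 GHz and 500 MHz clocks quoted in this post and a purely hypothetical 100 ns stall (the stall length is not from the thread):

```python
# Clock rates as quoted in the post.
CELL_HZ = 3.2e9
RSX_HZ = 0.5e9


def cycles_lost(stall_seconds, clock_hz):
    """Cycles a chip sits idle during a stall of the given duration."""
    return stall_seconds * clock_hz


STALL_S = 100e-9  # hypothetical 100 ns stall, for illustration only

cell_cycles = cycles_lost(STALL_S, CELL_HZ)
rsx_cycles = cycles_lost(STALL_S, RSX_HZ)
ratio = cell_cycles / rsx_cycles

print(f"Cell: {cell_cycles:.0f} cycles, RSX: {rsx_cycles:.0f} cycles, ratio: {ratio:.1f}x")
```

The ratio depends only on the two clock rates, not on the stall length chosen.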

That's why it makes no sense to store your results in VRAM. Store them in XDR, and let the GPU fetch them from there. You would be wasting your CPU time otherwise.

And it's not like the GPU has to make a duplicate "copy" of the data over to VRAM. It can natively read and write to the system RAM with only a small penalty.
 

Is it that easy? I remember some multiplatform devs saying that they'd have to cut down texture resolutions on PS3 because the system only allows for 256MB of VRAM.

Was that something that has dramatically changed by newer SDKs/firmware updates or is it just an example of "lazy devs" (or is texturing from XDR a totally different thing?)?
 

You have somehow accidentally fallen into a thread that is discussing that very subject! Start on page 1 and don't stop reading until you get your answer.
 

I did actually read the thread, but my question was specifically about texturing, as opposed to "accessing XDR" for whatever other reason (e.g. getting preprocessed data from Cell), which is what was discussed here.

Are there any constraints specifically for texturing from XDR like those devs claimed (I think it was Ubisoft/Splinter Cell)?
 
If you did, you may have seen a certain slide that answers your question below rather precisely?

Awww, come on, just re-link the slide for him. :p

[Image: b3da0.jpg]


Jawed

But I think he wants more than the raw data the slide shows. I think he may want context and examples.
 
When you are writing to memory from the CPU, you want to write to the fastest memory possible. Remember the Cell runs at 3.2 GHz and the RSX at only 500 MHz. You'd be wasting 6.4x as many CPU cycles as GPU cycles when you are stalled on I/O.

That's why it makes no sense to store your results in VRAM. Store them in XDR, and let the GPU fetch them from there. You would be wasting your CPU time otherwise.

The data does have to go to VRAM eventually. The clock speed difference between the two clock domains is irrelevant in general (since we are talking about the importance of bandwidth); a data transfer to the slower domain doesn't have to be slower.

Normally, when you are writing to memory you don't have to stall until the write completes, even though the memory write is slow. You write to the write-back buffer and do whatever you want until the write-back buffer is full and you still want to write. That is latency hiding, after all.
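
The write-back buffer behaviour described above can be sketched with a toy model. This is an illustration under made-up assumptions, not a model of the actual Cell or SPU hardware: the function name, the one-store-per-cycle issue rate, and the drain interval are all hypothetical.

```python
from collections import deque


def simulate_writes(n_writes, buffer_slots, drain_every):
    """Count the cycles a CPU stalls while issuing n_writes stores.

    Toy model: the CPU tries to issue one store per cycle, and slow memory
    retires one buffered store every `drain_every` cycles. The CPU stalls
    only when the write buffer is full.
    """
    buffer = deque()
    issued = 0
    stalled = 0
    cycle = 0
    while issued < n_writes:
        cycle += 1
        # Slow memory retires the oldest pending store now and then.
        if buffer and cycle % drain_every == 0:
            buffer.popleft()
        if len(buffer) < buffer_slots:
            buffer.append(issued)  # store is buffered, CPU carries on
            issued += 1
        else:
            stalled += 1  # buffer full: only now does the CPU wait
    return stalled


# A short burst fits in the buffer entirely, so the CPU never waits.
print(simulate_writes(n_writes=4, buffer_slots=8, drain_every=4))
# A long burst eventually fills the buffer, but a deeper buffer stalls less.
print(simulate_writes(n_writes=16, buffer_slots=8, drain_every=4))
print(simulate_writes(n_writes=16, buffer_slots=1, drain_every=4))
```

In this model the slow memory only costs CPU time once the burst of writes outruns the buffer, which is the point being made about latency hiding.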

So even if you don't have plenty of SPU cycles sitting around as it seems currently, your SPU should be able to do something else instead of stalling.

And it's not like the GPU has to make a duplicate "copy" of the data over to VRAM. It can natively read and write to the system RAM with only a small penalty.

Then you need to sync the SPUs with RSX over smaller pieces of data, not to mention the unnecessary main memory access from the SPUs.
 
Would the fact that the RSX has a cache (I read this in a dev interview somewhere) mean that reads/writes to XDR, or Cell writes to RSX, could have their latency hidden?

I.e., while the RSX is processing its task, it is also doing DMA or receiving instructions from the SPEs into its cache, ready to be acted upon. (I guess this would depend on how big the cache is, though.)

If anyone is interested in confirmation of the cache in the RSX, I will see if I can find the article in which it was stated.
 
The cache is given, what's interesting is how much and what kind. I doubt this is written in any public article.

By the way, if anyone is still interested, the NV2A's (Xbox 1) texture cache is 128Kb L2 and 8Kb L1. And my source is secret but reliable. ;)
 


I saw this some time ago (July 2006), from a guy who was exhaustively seeking information on what the RSX would be:

RSX


Core Frequency - 500 MHz
Memory Frequency - 650 MHz
Bus Size - 128-bit
Pixel Shaders - 24
Vertex Shaders - 8
ROPs - 8
Total Texture Cache Per Quad of Pixel Pipes (L1 & L2) - 96 KB
Post Transform & Lighting Cache - 63 Max Vertices
*A few extra shader instructions - Extra Texture Lookup Logic & Fast Vector Normalize
*FlexIO interface to CPU (Much Faster)


7800GTX


Core Frequency - 430 MHz
Bus Size - 256-bit
Memory Frequency - 600 MHz
Pixel Shaders - 24
Vertex Shaders - 8
ROPs - 16
Total Texture Cache Per Quad of Pixel Pipes (L1 & L2) - 48 KB
Post Transform & Lighting Cache - 45 Max Vertices
PCI bus interface to CPU (Much Slower)


NOTES: About the RSX.


Total Texture Cache Per Quad of Pixel Pipes - (L1 and L2) 96KB total Texture Cache - Previously 48K
(L1 only available to Pixel Shaders)

Post Transform and Lighting Vertex Cache - 63 Max Vertices - Previously 45 Vertices
(Cache located after Vertex Shader and before the triangle setup and before the Rasterizer.)
Vertex shader --> Post Transform and Lighting Vertex Cache 63MAX ---> Triangle Setup
Texture Lookup Logic to help RSX transport data from XDR


(I have heard these specs from other places too, including here in this forum.)
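
For what the quoted memory clock and bus width would imply, here is a rough sketch of the usual peak-bandwidth arithmetic. It takes the unverified figures in this post at face value and assumes double-data-rate memory (two transfers per clock). The function name is made up for illustration.

```python
def peak_bandwidth_gb_s(mem_clock_mhz, bus_bits, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 assumes double-data-rate memory.
    """
    bytes_per_transfer = bus_bits / 8
    return mem_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


# Figures exactly as quoted in the post above (not verified).
rsx_quoted = peak_bandwidth_gb_s(650, 128)
gtx_quoted = peak_bandwidth_gb_s(600, 256)

print(f"RSX (as quoted):     {rsx_quoted:.1f} GB/s")
print(f"7800GTX (as quoted): {gtx_quoted:.1f} GB/s")
```

By this arithmetic the halved bus width, not the memory clock, is the dominant difference between the two quoted configurations.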
 

Your RSX specs are incorrect.
 
Does anyone know why Cell is limited to 16MB/s when reading from GDDR? Is there some advantage in having a one-way data path? Is it because the data transfer rate from Cell is throttled to ensure that RSX memory access is not interfered with, and because reading from GDDR (16MB/s) is rarely required whereas writing to GDDR (4GB/s) is much more useful?
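
To get a feel for how lopsided those two figures are, here is a small sketch that converts them into transfer times. The 16MB/s and 4GB/s rates are the ones quoted above. The 1 MB buffer size is a hypothetical chosen for illustration, and decimal units (1 MB = 1e6 bytes) are assumed.

```python
READ_BYTES_PER_S = 16e6   # Cell reading from GDDR, as quoted (16MB/s)
WRITE_BYTES_PER_S = 4e9   # Cell writing to GDDR, as quoted (4GB/s)


def transfer_ms(n_bytes, bytes_per_s):
    """Time in milliseconds to move n_bytes at a sustained rate."""
    return n_bytes / bytes_per_s * 1e3


BUFFER_BYTES = 1_000_000  # hypothetical 1 MB buffer

read_ms = transfer_ms(BUFFER_BYTES, READ_BYTES_PER_S)
write_ms = transfer_ms(BUFFER_BYTES, WRITE_BYTES_PER_S)

print(f"read:  {read_ms:.2f} ms")
print(f"write: {write_ms:.2f} ms")
print(f"read is {read_ms / write_ms:.0f}x slower")
```

At these rates, reading back even a modest buffer takes longer than several frames at 60 fps, while writing the same amount takes a fraction of a millisecond.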
 