DeanoC blog PS3 memory management

Barbarian said:
RSX can texture from both Vram and main ram, but texturing from main ram is twice slower. Output is always to Vram as far as I know.

And that's the interesting part. How will NT deal with the fact that texturing from one memory pool will be slower than the other? Interesting if you ask me.
 
mckmas8808 said:
So is he saying that the PS3 devkit is lacking it or that the devkit has it?
/cries

People, calm down and stop trying to see some hidden statement or universal truth disseminated by "people in the know".

E.g. when DeanoC says "300mb of data for a room" it does not necessarily imply any of the following
- PS3 needs 4GB of memory to make the game work
- the 256mb/256mb split is making things harder / easier. Who said all 300mb are graphics assets?
- that BD-ROM is fast / slow
- the console has / doesn't have a HDD
- Sony's online service will be better / worse than Live
- ...

I agree with DeanoC, though, that low-level knowledge is becoming less and less common, but it is still needed, especially during bring-up of new hardware / APIs / systems.
If it's any consolation, for our CFD solver at work the check-in process is similarly lax, and many people likewise lack respect for the gravity of breaking the central build. I think that's partly due to the aging CVS; doing a build for every patch (as darcs does when configured to do so) is not really feasible for a large project, and as ERP says, correctness in a game is not well-defined.

Has anyone else managed to write a TSR clock for DOS that is smaller than 81 bytes? ;)
 
Stress

ERP said:
Basically it's a process whereby you have a build machine that continually builds and tests every time someone checks code in. When it fails, it sends out an email to everyone, something along the lines of

BUILD BROKEN --- Culprit is AWanker

We do that, but it isn't really sufficient. We simply don't have adequate tests, and more often than not even a QA cycle misses major issues; that build gets released, and the content creators who are touching the broken stuff either use the old build or sit around on their thumbs until a new build gets pushed.

Games are just really complicated, and sometimes it's not even clear what the correct current behavior is.

It is amazing for me to see big complicated games with no big problems. Always amazing. Very good talent in programmer field. But I am sorry to say that development job is too much stress for me, so that is why I am not developer. I think maybe one day I want to work in small coffee shop and make paintings in free time. Very relaxing.
 
What's the level of flexibility for the framebuffer in a PS3 game?
Is it possible to put the color/fragment buffer in XDR and the Z-buffer in GDDR, or the inverse?
 
Barbarian said:
RSX can texture from both Vram and main ram, but texturing from main ram is twice slower. Output is always to Vram as far as I know.

Texturing from XDR has that performance hit, but isn't that the worst-case scenario rather than the typical one, i.e. when the texture's texels are not already inside RSX in some form?

Rendering to off-screen surfaces can be set to XDR (going by nVIDIA's CEO's presentation at E3, it seemed like RSX could read and write anywhere in system memory).

Still, even at half the speed, if you can do frame-buffer + texturing operations from VRAM and at the same time do texturing operations from XDR in parallel, then you might really push the combined FlexIO + VRAM bandwidth quite well :).
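A quick back-of-envelope sketch of that combined-bandwidth idea (all figures below are assumptions taken from commonly quoted peak specs and the "twice slower" claim in this thread, not measured numbers):

```python
# Hypothetical combined texturing bandwidth for RSX:
# local GDDR3 plus XDR over FlexIO at the claimed half rate.
# All figures are assumptions, not measured results.
gddr3_bw = 22.4          # GB/s, RSX local VRAM (assumed peak)
flexio_read_bw = 15.0    # GB/s, RSX reading XDR over FlexIO (assumed)
xdr_penalty = 0.5        # the "twice slower" texturing claim

effective_xdr = flexio_read_bw * xdr_penalty   # usable XDR texturing rate
combined = gddr3_bw + effective_xdr            # both pools in parallel
print(f"VRAM alone: {gddr3_bw} GB/s, VRAM + XDR in parallel: {combined} GB/s")
```

Even with invented numbers, the point stands: anything textured from XDR in parallel is bandwidth the VRAM pool doesn't have to supply.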
 
[maven] said:
/cries

People, calm down and stop trying to see some hidden statement or universal truth disseminated by "people in the know".

E.g. when DeanoC says "300mb of data for a room" it does not necessarily imply any of the following
- PS3 needs 4GB of memory to make the game work
- the 256mb/256mb split is making things harder / easier. Who said all 300mb are graphics assets?
- that BD-ROM is fast / slow
- the console has / doesn't have a HDD
- Sony's online service will be better / worse than Live
- ...

Blame Sony! With their lack of information, people like me are left searching for information. I'm sorry for making people cry, though, but I'm doing it every day over the stupid hidden non-news that Sony is making us go through. :cry:
 
Griffith said:
your point A is a very ineducated one
first, 360 IS UMA
Dude, you can point to Google all you like, but it still doesn't mean anything, because Google isn't an authoritative source on this subject. Just because some site claims the 360 is a unified memory system doesn't make it so, and it isn't.

second, unified memory means that you can address directly and store cpu, graphical, audio data, NOT that the system has only 1 type - 1 kind - 1 pool of memory
Actually, the second is what unified memory REALLY MEANS. That's what separates it from systems with separate memory pools, THAT'S THE WHOLE POINT. Check out the SGI O2 series of workstations for example.

If your sense of logic still isn't functioning, just think of what the word "unified" really means...

cache, registry, hard disk, gddr, edram, memory cards, flash memory are ALL 'memory'
Cache isn't considered memory because it isn't addressable, and hard drives definitely don't fit the standard definition of memory.

another ineducated claim is the "aggregate bandwidth" thing
If I was a mean person, I'd ask you to learn to spell before trying to show how 'ineducated' I am. ;)

storing and using system memory via cell-flexio is inefficient compared to a direct acces as the local 256 MB GDDR3 is
Perhaps; you, however, don't seem to be in a position to measure exactly how inefficient (or not) it is. You strike me as an essentially clueless layperson who babbles and uses big words in ways they're not meant to be used, because you don't really know better but use them anyway in order to look more important. Remember, on this board you will be talking to people who actually develop on these systems, so I recommend you step lightly... ;)

Even if we assume accessing GDDR3 across FlexIO is going to be inefficient, it's still going to give greater system bandwidth compared to a unified memory system with just one pool of RAM, which is exactly my original claim. One that you were unsuccessful in proving wrong, I might add.
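The aggregate-bandwidth argument really reduces to simple addition. A sketch, with peak figures that are assumptions for illustration rather than specs I can vouch for:

```python
# Two independent pools can be accessed concurrently, so their peak
# bandwidths add; a single unified pool tops out at its one figure.
# All numbers below are assumed for illustration only.
xdr_bw = 25.6    # GB/s, Cell <-> XDR main memory (assumed peak)
gddr3_bw = 22.4  # GB/s, RSX <-> local GDDR3 (assumed peak)
uma_bw = 22.4    # GB/s, a single shared pool (assumed peak)

split_peak = xdr_bw + gddr3_bw   # aggregate peak of the split design
print(split_peak, ">", uma_bw, "->", split_peak > uma_bw)
```

Of course these are theoretical peaks; how much of the aggregate you actually capture depends on how well traffic splits across the two pools.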

add to this lower efficiency the well Known big latencies problem of RAMBUS and explain me the meaning what kind of performance advantage can you get
Rambus is the name of the company. If you're referring to specific Rambus-developed technologies, then please say so, and state which one(s) you're talking about, preferably clearly. A broad-sweeping, nonsensical sentence like the one above just makes you look ignorant. RDRAM, as implemented in the N64 for example, was extremely high-latency, but that's ancient, obsolete crap by now. DRDRAM as implemented in RIMM form in PCs had some latency issues, due for example to the very long signal path of the memory interface (and technical limitations in the DRDRAM interface). DRDRAM as implemented in PS2, on the other hand, does not seem to be a significant performance bottleneck from what devs here have stated, and on a theoretical level ought to give very good latencies, actually, as signal paths are very short.

XDR memory and FlexIO as used in PS3 have no currently known latency issues, so where your lofty claim of 'well known problems' comes from I've no idea. Supply some links as evidence if you want to be taken seriously, please. You could be right of course, but if so it's going to be by sheer coincidence. This stuff is so new and so little used that claiming there are well-known latency problems with XDR and FlexIO is the purest nonsense. If there are any commercial products out at all (apart from Cell) using these technologies, it's almost certainly going to be in things like high-performance switches, routers and similar equipment, which very few people know to any deep level.

and again, the frame buffer MUST to be in one memory, you can't, can't use the flexio to speedup none of the FB operations, I talk of bandwidth killer ops as like HDR, AA, Z, and filling FB
You've no idea if it's possible or not. If you did, you'd have been required to sign an NDA and couldn't talk about it here anyway.

no, those ops will cut in two the bandwidth of GDDR3 local mem
Nonsense. Available bandwidth will of course be the same regardless. Perhaps you really meant to say that bandwidth consumption would increase by a factor of two. :rolleyes:

I remember that the bus is still a 128 bit one, this add a problem to the problems
As opposed to in the 360, where the main memory interface is HOW wide again...? ;)

I fail to see how the width of the GPU memory can be a 'problem'. Perhaps you can explain this more clearly, since I seem to be so 'ineducated'...
 
mckmas8808 said:
And that's the interesting part. How will NT deal with the fact that texturing from one memory pool will be slower than the other? Interesting if you ask me.

As far as I know the latency can be hidden by the hardware threading, but the pixel shaders have to use half the registers (compared to shaders when texturing from vram) otherwise there will be a performance degradation. It seems registers are shared between all hardware threads and more threads are needed to hide longer latencies, hence less registers available per thread.
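That register/thread trade-off can be sketched with a toy occupancy model. The numbers below are invented for illustration; RSX's actual register file size and threading details are not stated anywhere in this thread:

```python
# Toy model: a fixed register file is shared by all threads in flight.
# Hiding twice the latency needs roughly twice the threads, which
# means each thread can only afford half the registers.
REGISTER_FILE = 1024   # total registers per pipe (invented figure)

def threads_in_flight(regs_per_thread, register_file=REGISTER_FILE):
    """How many hardware threads fit, given each thread's register use."""
    return register_file // regs_per_thread

vram_threads = threads_in_flight(32)   # shader using 32 registers
xdr_threads = threads_in_flight(16)    # same shader squeezed to 16
print(vram_threads, "threads for VRAM latency,", xdr_threads, "for XDR")
```

The model is crude, but it shows why "texture from slower memory" turns into "use fewer registers per shader": the extra threads needed to cover the latency have to come out of the same shared register file.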
 
Barbarian said:
As far as I know the latency can be hidden by the hardware threading, but the pixel shaders have to use half the registers (compared to shaders when texturing from vram) otherwise there will be a performance degradation. It seems registers are shared between all hardware threads and more threads are needed to hide longer latencies, hence less registers available per thread.

This does not seem like a bad trade, though I'm sure there are more details. For example, not all registers can be considered equal.
 
Barbarian said:
As far as I know the latency can be hidden by the hardware threading, but the pixel shaders have to use half the registers (compared to shaders when texturing from vram) otherwise there will be a performance degradation. It seems registers are shared between all hardware threads and more threads are needed to hide longer latencies, hence less registers available per thread.

Hey devs is this true? If so that to me does sound like a horrible trade off. Is it bad for you guys that actually program games?
 
BlueTsunami said:
Didn't you know? Pressing "I'm Feeling Lucky" on Google gives Auto Intelligence +10!

Librarians everywhere cry a little on the inside whenever they hear the word "Google" mentioned.
 
Question: Does the UMA implemented in the Xbox 360 mean that both CPU and GPU share the ~22GB/s bandwidth the GDDR3 offers? Or do both get their own ~22GB/s access to the memory?
 
Gholbine said:
Question: Does the UMA implemented in the Xbox 360 mean that both CPU and GPU share the ~22GB/s bandwidth the GDDR3 offers? Or do both get their own ~22GB/s access to the memory?

It's shared.
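To make "shared" concrete, a toy sketch (the ~22.4 GB/s figure is the commonly quoted GDDR3 peak from the question above; the CPU traffic number is invented):

```python
# In a UMA design, CPU and GPU draw from the same pool, so CPU
# traffic directly reduces what's left over for the GPU.
pool_bw = 22.4       # GB/s, single GDDR3 pool (assumed peak)
cpu_traffic = 6.0    # GB/s, hypothetical CPU demand
gpu_left = pool_bw - cpu_traffic
print(f"GPU sees at most {gpu_left} GB/s, not a separate {pool_bw}")
```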
 