Questions about PS2

Perhaps they were targeting 640 x 448 for the final buffer, like most PS2 games, and came in a little under 680 on the vertical resolution? DC was the only console locked to 640 x 480 buffers (SSAA was done at the tile level).

4x SSAA was achieved on PS2 on a field-rendered final target of 640 x 224, with 16-bit colour iirc.
 
Perhaps they were targeting 640 x 448 for the final buffer, like most PS2 games, and came in a little under 680 on the vertical resolution? DC was the only console locked to 640 x 480 buffers (SSAA was done at the tile level).

4x SSAA was achieved on PS2 on a field-rendered final target of 640 x 224, with 16-bit colour iirc.
In which game was there 4x SSAA on PS2? And by final target, do you mean the front buffer?
 
The fail point is getting people to make software for the architecture. It was far easier to go nuts on hardware when the dev teams were 1-20 people and you could strong-arm the publishers with lucrative licensing arrangements. That ship sailed long ago.

It's too bad in some ways, because I loved the crazy hardware, but it's just a reminder that software is still king.

Using the PS2 as an example, wouldn't the hardware and software for the architecture be able to scale together? New GPUs don't eschew previous software models. Couldn't a console manufacturer do the same?
 
Using the PS2 as an example, wouldn't the hardware and software for the architecture be able to scale together?

I mean, sure, but who is going to foot the bill? You have two manufacturers of GPUs that dominate triangle rasterization, an industry that uses those 99% exclusively, plus a handful of competitive CPU architectures. Trying to re-invent the wheel today is going to have huge up-front costs and probably no ROI. I am not sure what could be done that would cause a major paradigm shift in realtime graphics applications and warrant any kind of major shift away from raster triangles. Maybe some of the more knowledgeable posters could chime in.

As far as consoles go, it's almost a mirror of what happened to the arcade coin-op industry. You had the heavyweights (Sega, NAMCO, Konami, etc.) all trying to one-up each other with crazy, exotic one-off hardware and equally impressive software in a never-ending quest to lure eyeballs to the machines, until the cost became so high (both in hardware R&D and software development) that everyone had to cut back significantly.

It's sad, but the consolidation of the semiconductor industry, together with the consolidation of the major game publishers, has everyone on a leash so tight that you can't stray too far from the middle. Commodity hardware and middleware won out, and that's just the way it is.
 
What is the difference between a full-screen pass and a pass that isn't full-screen?

I'm not clear what you are asking. But, common uses of the word "pass" include:

Draw an object. Draw the same object again in the same place in order to do additional pixel math. Both draws are sometimes referred to as a "pass". As in, 1st pass over the object. 2nd pass over the object.

Draw the frame. Then draw a full-screen quad that copies the frame from one buffer to another. In the process of doing the copy, you can do additional pixel math (ex: color adjustments). Each framebuffer copy is referred to as a full screen pass.
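
If it helps, here's a tiny sketch of both meanings side by side (every name and type here is made up; the point is only the call pattern, not any real API):

Code:
// Hypothetical stand-ins.
struct Mesh {};
enum Blend { BLEND_OPAQUE, BLEND_ADDITIVE };
void drawMesh(const Mesh& m, int texture, Blend b) {}
void drawFullScreenQuad(int srcBuffer, int dstBuffer, unsigned tint) {}

void renderFrame(const Mesh& mesh, int baseTexture, int glowTexture,
                 int frameA, int frameB)
{
    // Meaning 1: two passes over the same object. Draw it twice in the same
    // place with different texture/blend state so the pixel math stacks up.
    drawMesh(mesh, baseTexture, BLEND_OPAQUE);   // 1st pass over the object
    drawMesh(mesh, glowTexture, BLEND_ADDITIVE); // 2nd pass over the object

    // Meaning 2: a full-screen pass. Copy the frame from one buffer to the
    // other through a screen-sized quad, adjusting colours on the way.
    drawFullScreenQuad(frameA, frameB, /*tint=*/ 0x808080u);
}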
 
Interesting stuff in this thread. I wonder how powerful the PS3 would've been if it had something like a GS2 instead of the Nvidia part. Obviously it would've made things even worse for devs, but imagine what the heavy hitters could've done with it. Then again, simply having eDRAM, a better GPU (Nvidia 8xxx series or ATI) and a unified pool of memory (XDR?) would've helped a lot.
 
I'm not clear what you are asking. But, common uses of the word "pass" include:

Draw an object. Draw the same object again in the same place in order to do additional pixel math. Both draws are sometimes referred to as a "pass". As in, 1st pass over the object. 2nd pass over the object.

Draw the frame. Then draw a full-screen quad that copies the frame from one buffer to another. In the process of doing the copy, you can do additional pixel math (ex: color adjustments). Each framebuffer copy is referred to as a full screen pass.
Thank you! That explains a lot. Please tell me, when the full-screen quad is copied, does it replace the existing back buffer or not? If not, how does it fit in EDRAM when the back buffer, Z buffer and textures are already there? Also, is space for the front buffer always reserved, or, since it only gets filled later, is that part of the 4 MB free for other stuff in the meantime? I mean, after rasterisation there are the back buffer, Z buffer and textures in EDRAM, but is that last 1 MB used for something, or is it reserved for the front buffer?

Also, can anyone tell me what exactly the SPU2 is, and whether there is an SPU1 in the PS2?
 
Thank you! That explains a lot. Please tell me, when the full-screen quad is copied, does it replace the existing back buffer or not? If not, how does it fit in EDRAM when the back buffer, Z buffer and textures are already there? Also, is space for the front buffer always reserved, or, since it only gets filled later, is that part of the 4 MB free for other stuff in the meantime? I mean, after rasterisation there are the back buffer, Z buffer and textures in EDRAM, but is that last 1 MB used for something, or is it reserved for the front buffer?

The EDRAM is just a 2D block of memory that you can pack with framebuffers, textures and palettes in any arrangement you want. There isn't any special reserved location for anything. You just have to keep track of where you put everything and be sure to point to the right place when you try to use it later.
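
Just to illustrate the bookkeeping, it's something like this (the offsets are made-up byte addresses; real code tracks them in the GS's own word/page units):

Code:
// Hypothetical carve-up of the GS's 4 MB of EDRAM for one frame. Nothing
// enforces this split; it is purely the programmer's own bookkeeping.
struct VramLayout
{
    unsigned frontBuffer; // e.g. 0x000000 - the buffer being displayed
    unsigned backBuffer;  // e.g. 0x118000 - the buffer being drawn into
    unsigned zBuffer;     // e.g. 0x230000 - depth
    unsigned textures;    // e.g. 0x348000 - whatever is left: textures, palettes
};
// When drawing, you point the GS's frame/z-buffer/texture registers at
// whichever of these offsets you want. Point them at the wrong place and
// the GS will happily scribble over your textures.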

I couldn't find a VRAM dump from a PS2 game. But, here's MGS for the PS1. So, there's no depth buffer, but you can still see the front buffer, back buffer, sprites and palettes.
[Attached image: 1445718434644.png - VRAM dump from MGS on the PS1]

Here's a different game with a different packing: https://www.vg-resource.com/thread-23527.html

You can't read from the framebuffer as a texture and write back to it at the same time. Well, nothing prevents you... But, it's undefined behavior and in practice you'll get glitches from timing issues in the read and write caches. On the PS2, what I liked to do when post-processing was to stomp over the depth buffer as my temp, working space. It's the same size and shape as a framebuffer and the depth info is no longer needed at that point in the frame. So, I would ping-pong intermediate steps between the framebuffer and the temp buffer.
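
And the ping-pong is then just address swapping, roughly like this (drawFullScreenSprite is a made-up stand-in for the actual draw, using the VramLayout sketch above):

Code:
// Hypothetical stand-in: a screen-sized textured sprite that reads from one
// EDRAM offset and writes to another.
void drawFullScreenSprite(unsigned readFrom, unsigned writeTo) {}

void postProcess(const VramLayout& layout)
{
    unsigned fb  = layout.backBuffer; // the finished frame
    unsigned tmp = layout.zBuffer;    // depth no longer needed: reuse as scratch

    drawFullScreenSprite(fb,  tmp);   // effect step 1: frame -> temp
    drawFullScreenSprite(tmp, fb);    // effect step 2: temp -> frame
}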

Also, can anyone tell me what exactly the SPU2 is, and whether there is an SPU1 in the PS2?

The "SPU2" in the PS2 was just the "Sound Processing Unit". It was named with a 2 because it was almost literally 2 PlayStation1 Sound Processing Units welded together. Both version 2 and 2 of them. Hardware designer humor...
 
Correct me if I'm wrong. After polygons are calculated on VU1, they are sent to the GS (16 KB of data at a time, because VU1 has 16 KB of data memory). Then the GS does rasterisation and that data is written to the back buffer and the Z buffer. After that the GS reads data from the back buffer and Z buffer and does texturing for that part of the frame. Then the result is written to the back buffer again. This repeats until all polygons are calculated, rasterised and textured. Am I right up to this point?
 
Correct me if I'm wrong. After polygons are calculated on VU1, they are sent to the GS (16 KB of data at a time, because VU1 has 16 KB of data memory). Then the GS does rasterisation and that data is written to the back buffer and the Z buffer. After that the GS reads data from the back buffer and Z buffer and does texturing for that part of the frame. Then the result is written to the back buffer again. This repeats until all polygons are calculated, rasterised and textured. Am I right up to this point?

Rasterization and texturing happen at the same time, 1 poly at a time. The GS is fast because it is a really simple, dumb device. It draws a triangle with 1 texture, 3 vertex colors and a barely-configurable blending option. Then it draws another one.

One smart thing that makes this fast is the DMA processors (the VIF and GIF). They don't get enough attention. The CPU lays out a linear buffer of memory containing the data to send to the VUs, wrapped in a header that tells the VIF how and where to put that data in VU memory. What makes this fast is that the CPU can be making a buffer at the same time that the VIF is sending the previous buffer, at the same time that the VU is processing the buffer before that. Meanwhile, there is the GIF for moving data from VU -> GS. So, while the VU is processing that buffer, the GIF is sending the result buffer before that one to the GS, and the GS is rasterizing each triangle while receiving the next one.

So all 5 processors (CPU, VIF, VU, GIF, GS) are getting work done simultaneously in a pipelined bucket-brigade. The downside is that the VU needs to split up its 16K into regions: incoming from the VIF, work area, next outgoing through the GIF, and currently outgoing through the GIF.
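
A rough mental model of that bucket brigade (ordinary function calls standing in for DMA transfers and microcode runs, pipeline fill/drain at the ends left out, and the region split is only an example):

Code:
// Stand-ins for the real hardware stages.
void cpuBuildPacket(int n)     { /* EE core writes vertex data plus a VIF header */ }
void vifUploadPacket(int n)    { /* VIF DMAs packet n into a free VU1 input region */ }
void vu1TransformPacket(int n) { /* VU1 microcode transforms packet n into an output region */ }
void gifSendToGs(int n)        { /* GIF streams the finished packet n to the GS */ }

// VU1's 16 KB of data memory carved into regions so every stage has somewhere
// to work at once, e.g.: [ constants ][ input A ][ input B ][ output A ][ output B ]
void drawEverything(int numChunks)
{
    for (int chunk = 0; chunk < numChunks; ++chunk)
    {
        cpuBuildPacket(chunk + 2);     // CPU is two packets ahead
        vifUploadPacket(chunk + 1);    // VIF uploads the next packet
        vu1TransformPacket(chunk);     // VU1 works on the current packet
        gifSendToGs(chunk - 1);        // GIF drains the previous result
        // On the real machine all four overlap; writing them in one loop body
        // just shows who touches which packet during the same "beat".
    }
}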
 
So, if multipass needs to be done for that polygon, the same polygon goes from VU1 to the GS again, gets rasterised and textured again, and is written to the back buffer, right?
 
So, if multipass needs to be done for that polygon, the same polygon goes from VU1 to the GS again, gets rasterised and textured again, and is written to the back buffer, right?

Yep. You would probably draw a 1000 poly model by breaking it down into chunks of a few dozen polys that can all fit in VU1 mem. Send one chunk to VU1, have it transform the verts and send them to the GS. You could switch texture and blend mode between each triangle if you really wanted to. The GS could handle that way better than PC video card drivers could. But, going that extreme would still be slow on a GS. Instead what everyone would do is a more traditional "tex0, opaque, tri0, tri1, tri2, tri3... tex1, additive, tri0, tri1, tri2, tri3..." But, again, there would only be a few dozen triangles in VU mem, not a whole scene.
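
In sketch form, that ordering is just a loop over state-sorted batches (every name here is made up; the point is that state changes happen once per batch and each batch is small enough for VU1 memory):

Code:
#include <vector>

struct Chunk { /* a few dozen vertices - small enough to fit in VU1 memory */ };
struct Batch { int texture; int blend; std::vector<Chunk> chunks; };
struct Model { std::vector<Batch> batches; };

// Hypothetical helpers standing in for GS register writes and DMA kicks.
void setTexture(int tex)       { /* point the GS at a texture in EDRAM */ }
void setBlendMode(int mode)    { /* opaque, additive, ... */ }
void sendToVu1(const Chunk& c) { /* upload verts; VU1 transforms and forwards to the GS */ }

// "tex0, opaque, tri0, tri1... tex1, additive, tri0, tri1..." as a loop:
void drawModel(const Model& model)
{
    for (const Batch& batch : model.batches)    // already grouped by texture + blend
    {
        setTexture(batch.texture);              // state change once per batch,
        setBlendMode(batch.blend);              // not once per triangle
        for (const Chunk& chunk : batch.chunks) // VU1-sized pieces of the batch
            sendToVu1(chunk);
    }
}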

So, no magic. Just basic optimization on very simple hardware.
 
Yep. You would probably draw a 1000 poly model by breaking it down into chunks of a few dozen polys that can all fit in VU1 mem.
But VU1 sends polygons only one at a time, right?

But, going that extreme would still be slow on a GS. Instead what everyone would do is a more traditional "tex0, opaque, tri0, tri1, tri2, tri3... tex1, additive, tri0, tri1, tri2, tri3..." But, again, there would only be a few dozen triangles in VU mem, not a whole scene.
Is this the multipass you're talking about here?

On the PS2, what I liked to do when post-processing was to stomp over the depth buffer as my temp, working space. It's the same size and shape as a framebuffer and the depth info is no longer needed at that point in the frame. So, I would ping-pong intermediate steps between the framebuffer and the temp buffer.
So, you're saying that when the Z buffer isn't needed anymore, you can use its space to store a temp buffer instead?
Also, what is the temp buffer needed for?
 
But VU1 sends polygons only one at a time, right?

Yep. The VU creates a little buffer full of GIF commands and triggers the GIF. The GIF then executes the commands that say "Stuff this value in that GS register" over and over. Some registers control the blend mode. Some the source texture address. Some the framebuffer address. Some the triangle mode (list, strip, sprite).

The fun one is the vertex position register. You stuff all vertex positions into a single register. It looks like you are overwriting a single value over and over. But, the third time you write to that one register, a triangle gets drawn! If the triangle mode register is set to List, you get a triangle every third time you write to the register. If it is set to Strip, you get a triangle on the 3rd vert and then another for every vert you stuff in the register. I think there was a reserved high bit in the position register that would reset the strip.
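
Conceptually, a chunk of that command buffer boils down to something like this (writeGsRegister is a made-up stand-in for the GIF's register writes, and the register names are from the GS manual as best I remember them):

Code:
#include <cstdint>

// On the real machine these "writes" are quadwords in a GIF packet.
enum GsRegister { PRIM, TEX0, RGBAQ, XYZ2 };
void writeGsRegister(GsRegister reg, std::uint64_t value) { /* append to the GIF packet */ }

void drawOneTriangle(std::uint64_t primMode, std::uint64_t texSettings,
                     std::uint64_t c0, std::uint64_t v0,
                     std::uint64_t c1, std::uint64_t v1,
                     std::uint64_t c2, std::uint64_t v2)
{
    writeGsRegister(PRIM,  primMode);    // triangle list / strip / sprite, shading mode
    writeGsRegister(TEX0,  texSettings); // which texture to sample
    writeGsRegister(RGBAQ, c0); writeGsRegister(XYZ2, v0);
    writeGsRegister(RGBAQ, c1); writeGsRegister(XYZ2, v1);
    writeGsRegister(RGBAQ, c2); writeGsRegister(XYZ2, v2); // 3rd position write = triangle drawn
}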

That shows the really fun bit of the GS design. It is a processor with no instruction set. It only has registers. The GIF pokes at those registers and the GS responds by doing stuff. But, there's no such thing as GS assembly language. The sound processors work the same way.

Is this the multipass you're talking about here?

Yep. That's all there is to multipass.

So, you're saying that when the Z buffer isn't needed anymore, you can use its space to store a temp buffer instead?
Also, what is the temp buffer needed for?

I'm going to do a full-screen post-process. Let's say I'm going to make the whole screen have a dark red filter in post by multiplying the screen by (192, 128, 128). I need to read the frame buffer, do the multiply using vertex colors and write the result somewhere. I need a place to put the result! I can't just put it back where the framebuffer already is because reading and writing to the same texture simultaneously is unsupported and will cause glitches. So instead, I use the memory that currently holds the depth buffer. It's convenient because it is the right size, it's holding data that won't be needed next frame and it means I don't have to stomp any useful textures or palettes in GS ram to make room.
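
For what it's worth, the per-pixel math in that pass is nothing fancier than this toy software model (assuming 255 = 1.0 for simplicity; the GS's own fixed-point convention differs). On the GS, this multiply happens in the texture/vertex-colour stage while the full-screen sprite is being drawn into the old depth-buffer memory:

Code:
#include <cstdint>

// "Multiply the screen by (192, 128, 128)", one pixel at a time.
std::uint32_t darkRedFilter(std::uint32_t src)
{
    std::uint32_t r = ((src       & 0xFF) * 192) / 255;
    std::uint32_t g = ((src >>  8 & 0xFF) * 128) / 255;
    std::uint32_t b = ((src >> 16 & 0xFF) * 128) / 255;
    return r | (g << 8) | (b << 16);
}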

Even if simultaneous read-write did not cause glitches, there are lots of post effects like blurs that require multiple reads from the source to get the right result. The GS can only read once, write once, read once, write once. It can't do lots of reads before it writes a pixel. So, you need some scratch space to work in for those kinds of effects.
 
That shows the really fun bit of the GS design. It is a processor with no instruction set. It only has registers. The GIF pokes at those registers and the GS responds by doing stuff. But, there's no such thing as GS assembly language. The sound processors work the same way.
Interesting. That means the GS isn't programmable? Even the N64 had a programmable GPU.

So instead, I use the memory that currently holds the depth buffer. It's convenient because it is the right size, it's holding data that won't be needed next frame and it means I don't have to stomp any useful textures or palettes in GS ram to make room.
So, it's possible to erase some data from EDRAM and use that space for something different?
 
Even if simultaneous read-write did not cause glitches, there are lots of post effects like blurs that require multiple reads from the source to get the right result. The GS can only read once, write once, read once, write once. It can't do lots of reads before it writes a pixel. So, you need some scratch space to work in for those kinds of effects.
Is this a description of how post-processing works on the PS2? I think I've heard something like this about motion blur on PS2.

Also, as I understood you, it's possible to read the back buffer, modify it, and then write it back into EDRAM with a different result. But, as I understood, all the work with polygons, multipass and texturing is already done by that point. So how do those changes happen? Do the GS pixel pipelines do something there?
 
That means the GS isn't programmable? Even the N64 had a programmable GPU.
The N64's RDP doesn't run programs either - its 'programmability' is by register, like the GS.
Meanwhile, the N64's RSP performs programmable transform/lighting functions, like the VU.
(I guess there's a distinction if you care where chip boundaries lie - but it's no different from the system/game perspective.)

I suppose the N64 has programmable triangle/rasterizer setup. But, that was not a useful feature and it killed performance, so later Nintendo consoles used hard-wired setup like the PS1/PS2.

In practice, the N64's graphics pipeline was less programmable than the PS2's.
RSP microcode programming was unsupported, difficult, and strongly discouraged by Nintendo.
VU programming is painful too, but less so, and at least Sony tried to help.
 
Is this a description of how post-processing works on the PS2? I think I've heard something like this about motion blur on PS2.

Also, as I understood you, it's possible to read the back buffer, modify it, and then write it back into EDRAM with a different result. But, as I understood, all the work with polygons, multipass and texturing is already done by that point. So how do those changes happen? Do the GS pixel pipelines do something there?
Render a textured polygon to a temp location within EDRAM, with its texture UVs pointing at the previous back buffer. (It's just texture reads and ROP writes.)

If you render the polygon so that each target pixel center lands in between the texture sample locations, you have just averaged 4 samples. (Change the offset to get different filter weights, or use a blending mode and you have combined the result with whatever is already in the target location.)
If that's not enough, use a couple of locations in EDRAM to ping-pong the data.
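
To spell out the arithmetic (assuming plain bilinear filtering): a sample taken exactly halfway between four texel centres comes back as 0.25*T00 + 0.25*T10 + 0.25*T01 + 0.25*T11, i.e. the average of that 2x2 block, so a single textured draw with its UVs offset by half a texel gives you a free box filter, and nudging the sample point off-centre changes the weights.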

If I have understood correctly, things like LUTs (palettes for palettized textures) are within EDRAM as well, and you can write on top of them.
 
Interesting. That means the GS isn't programmable? Even the N64 had a programmable GPU.

You get a few modes describing how to use vertex colors: ignore, tex * vert, and something like tex * vert.rgb + vert.a? I forget how that one worked.

You get a single blend mode that has a few configurable parameters. With it you could do opaque, linear blend, add, subtract, monochrome multiply and maybe some odder options. It could not handle a full-color multiply. That's why PS2 games had to use grayscale lightmaps.

You get some alpha test modes.

That's about it.
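
For reference, as far as I remember the blend is one fixed formula with selectable inputs: out = ((A - B) * C >> 7) + D, where A, B and D can each be the source colour, the destination colour or zero, and C can be the source alpha, the destination alpha or a fixed constant (the Cs / Cd / As / FIX names are how I remember the docs writing them). Pick (Cs, Cd, As, Cd) and you get ordinary alpha blending, (Cs, 0, FIX=128, Cd) is additive, (0, Cs, FIX=128, Cd) is subtractive, and (Cd, 0, As, 0) is the monochrome multiply. There's no slot for a per-channel source-times-destination term, which is why the full-colour multiply isn't there.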

You can read all about this stuff here: http://hwdocs.webs.com/ps2

So, it's possible to erase some data from EDRAM and use that space for something different?

There is no magic. The EDRAM is just a canvas to paint on and copy-paste from. You don't erase anything. You just paint over it. It's like asking "In MS Paint, can you erase part of your painting so that you can use that place for something different?" There is no organization to the pixels in MS Paint. You just put stuff wherever you feel like it and you can make whatever mess you want.

Also, as I understood you, it's possible to read the back buffer, modify it, and then write it back into EDRAM with a different result. But, as I understood, all the work with polygons, multipass and texturing is already done by that point. So how do those changes happen? Do the GS pixel pipelines do something there?

There is no magic pixel pipe mode. There are only triangles and sprites. The GS doesn't have a magic post-processing feature, so you have to make do with the tools it does have. It does have sprites. Sprites have the same texture and blend options as triangles. They're just screen-aligned quads. So, you can make a sprite the size of a whole screen that reads a texture the size of a whole screen. Bam! You made a post-processing mode!

The only magic is that someone figured out the layout of the internal caches in EDRAM. And, by drawing sprites that are a bunch of tall columns instead of one big sprite, you could line up with the caches really well. That way a row of tall sprites would be faster than one huge sprite.
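
In sketch form (drawSprite is a made-up helper, and the 32-pixel strip width is only an example; the right width depends on how the pages/caches are laid out):

Code:
// Hypothetical helper: one textured, screen-aligned sprite (quad).
void drawSprite(int dstX, int dstY, int srcU, int srcV, int w, int h) {}

// Instead of one big 640-wide sprite, issue the full-screen pass as a row of
// narrow vertical strips so each strip stays friendly to the EDRAM caches.
void fullScreenPassInStrips()
{
    const int screenW = 640, screenH = 448, stripW = 32;
    for (int x = 0; x < screenW; x += stripW)
        drawSprite(/*dstX=*/ x, /*dstY=*/ 0,
                   /*srcU=*/ x, /*srcV=*/ 0,
                   /*w=*/ stripW, /*h=*/ screenH);
}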
 