SCE patent: PSP very fast multi-pass ? VERY INTERESTING HSR

Gubbi said:
No the multipass mechanism referred to seems to be related to occlusion culling only.

Cheers
Gubbi

It is also related to all the situations where multiple rendering passes are needed: you still save T&L time and main RAM bandwidth.

This solution is no better at multi-texturing than solutions with single-pass multi-texturing, but it still beats them when doing multiple rendering passes, as geometry is not re-calculated and re-sent for each rendering pass.
 
I just reread the patent. I apparently misunderstood a good chunk of the way it's supposed to work.

It appears that this device functions similarly to the PS2: it can resend geometry without transforming vertices again, but geometry still has to be sent through triangle setup (with frustum culling/clipping and all). A mistake IMHO.

Also there doesn't seem to be support for multi-texturing (or loop-back), another mistake IMHO.

They probably leveraged a good deal of this from the PS2 design.

Seems very much to be a capacity and not a capability device.

MBX is far more promising (again IMHO)

edit: What's this thing about visibility flags for subpixels, does it use coverage masks?

Cheers
Gubbi
 
Gubbi said:
I just reread the patent. I apparently misunderstood a good chunk of the way it's supposed to work.

It appears that this device functions similarly to the PS2: it can resend geometry without transforming vertices again, but geometry still has to be sent through triangle setup (with frustum culling/clipping and all). A mistake IMHO.

Also there doesn't seem to be support for multi-texturing (or loop-back), another mistake IMHO.

They probably leveraged a good deal of this from the PS2 design.

Seems very much to be a capacity and not a capability device.

MBX is far more promising (again IMHO)

Cheers
Gubbi

It is still an improvement over the GS; the fact that it does need to re-do triangle setup, clipping, etc. was already mentioned in this thread ;)

Still, their current problem with the PlayStation 2 is that, with multi-texturing and other effects going on, the EE just cannot really T&L more polygons than the GS can handle; indeed, I hear from quite a good number of developers that the GS and its setup engine are not really that bad a bottleneck: the GS is fast at rendering polygons (32x32 is the best size, IIRC).

They also observed that one thing making it harder to stream textures from main RAM is that multi-texturing and multi-pass rendering increase the vertex traffic between the T&L engine and the GPU, leaving less bandwidth for textures.

The P-buffer does exactly what it says: it buffers the primitives to be multi-textured and/or rendered in multiple passes in a separate memory location inside the GPU, so that the T&L processor does not waste cycles re-transforming and re-sending the same vertices to the GPU over and over (which would also waste main-RAM-to-GPU bandwidth).
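The buffering idea described above could be sketched roughly like this (all names here are invented for illustration, not taken from the patent): transform each vertex once, keep the resulting primitive in an on-die buffer, and replay that buffer for every extra rendering pass instead of re-running T&L and re-sending vertices over the main-RAM bus.

```python
# Hypothetical sketch of the P-buffer idea; class and function names are
# assumptions, not the patent's terminology.

def transform(vertex, matrix):
    """Stand-in for the T&L stage: one matrix multiply per vertex."""
    return tuple(sum(m * v for m, v in zip(row, vertex)) for row in matrix)

class PBuffer:
    def __init__(self):
        self.primitives = []          # transformed triangles, kept on-die

    def submit(self, triangle, matrix):
        # T&L happens exactly once per triangle...
        self.primitives.append(tuple(transform(v, matrix) for v in triangle))

    def render_pass(self, rasterize):
        # ...and every extra pass just replays the buffered primitives.
        for tri in self.primitives:
            rasterize(tri)

identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
pbuf = PBuffer()
pbuf.submit(((0, 0, 1), (1, 0, 1), (0, 1, 1)), identity)

drawn = []
for _ in range(3):                    # e.g. base, detail, and light-map passes
    pbuf.render_pass(lambda tri: drawn.append(tri))
print(len(drawn))                     # 3 passes drawn from a single transform
```

The point of the design is visible in the ratio: one `submit` (one T&L pass over the vertices) feeds an arbitrary number of `render_pass` calls.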

This patent also offers developers a method to do HSR through a semi early-Z mechanism before the actual rendering of the 3D scene (with textures, etc.), an experiment on SCE's part, as that feature was not in the GS.
 
Panajev2001a said:
This patent also offers developers a method to do HSR through a semi early-Z mechanism before the actual rendering of the 3D scene (with textures, etc.), an experiment on SCE's part, as that feature was not in the GS.

Yes, but I'm wondering why they didn't provide for looping back pixels. With multipass you not only have to clip triangles, you also have to Z-test pixels for each pass.

The primitive cache is neat and all, but it also sets a hard upper limit on how much geometry you can have in a scene before performance degrades (a primitive-buffer overflow means either spilling to main memory or discarding with subsequent re-transformation of vertex data, probably the latter). The on-die primitive buffer also either adds die space (and power) or takes room from framebuffer/texture storage.
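The capacity ceiling above can be made concrete with a toy sketch (names invented, policy assumed): a fixed-size buffer that rejects submissions once full, which would force the driver back into re-transforming that geometry on every pass.

```python
# Hypothetical sketch of a fixed-capacity primitive buffer with a
# "discard on overflow" policy; the real hardware's policy is speculated
# about in the post above, not documented.

class BoundedPBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.primitives = []

    def submit(self, transformed_tri):
        if len(self.primitives) >= self.capacity:
            return False              # overflow: caller must re-send per pass
        self.primitives.append(transformed_tri)
        return True

buf = BoundedPBuffer(capacity=2)
results = [buf.submit(f"tri{i}") for i in range(4)]
print(results)                        # [True, True, False, False]
```

Everything past the capacity falls off the fast path, which is exactly the "capacity, not capability" concern.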

Seems very much like an evolution of the PS2 EE/GS. Will probably make porting easy ;)

edit: I can see these points have already been raised in this thread

Cheers
Gubbi
 
I do not think the P-buffer takes away memory from the VRAM; silicon real estate is not too much of a problem when you ship at 90 nm, and you will probably be seeing a 65 nm die shrink before you can say the word "CMOS5", give or take a few months ;)

I am more interested in the performance drawbacks of the P-buffer.

What about developers streaming 3D data so as not to over-fill the P-buffer?


This is not a TBR architecture and you do not need all the geometry at once.

There is the case in which the visible geometry (after you delete all the occluded geometry from the P-buffer in the first two passes [Z-buffer pass and Test pass]) might still over-fill the P-buffer, but that would take quite finely (maybe too finely) tessellated meshes and a lot of translucent overdraw.

In that case, wouldn't fill-rate and T&L performance be a problem before the P-buffer is over-filled ?
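The two-pass HSR scheme mentioned above (a Z-buffer pass followed by a Test pass that deletes fully occluded primitives from the P-buffer before the expensive textured pass) could be sketched like this, with invented names and primitives reduced to lists of pixel samples:

```python
# Hypothetical sketch of the Z-buffer pass + Test pass culling; each
# primitive is a list of (pixel, depth) samples, lower depth = closer.

def cull_occluded(primitives):
    zbuffer = {}
    for prim in primitives:           # pass 1: fill the Z-buffer
        for pixel, z in prim:
            if z < zbuffer.get(pixel, float("inf")):
                zbuffer[pixel] = z
    visible = []
    for prim in primitives:           # pass 2: test; delete if no pixel wins
        if any(z <= zbuffer[pixel] for pixel, z in prim):
            visible.append(prim)
    return visible

front   = [((0, 0), 1.0), ((1, 0), 1.0)]
back    = [((0, 0), 5.0), ((1, 0), 5.0)]   # fully behind `front`
offside = [((2, 2), 5.0)]                  # not overlapped at all

survivors = cull_occluded([front, back, offside])
print(len(survivors))                 # 2: only `back` is deleted
```

Only geometry that contributes at least one visible sample survives into the textured pass, which is what keeps the post-cull P-buffer small in typical (opaque-dominated) scenes.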
 
edit: I can see these points have already been raised in this thread

I was only being a bad, sarcastic Panajev ;)

Please keep raising the points you want to stress until, in your opinion, they are fully addressed, and thank you for posting in this thread :)
 