Can this SONY patent be the PixelEngine in the PS3's GPU?

That's too conservative. If I wanted to take it easy, I'd say 5-10 bpps and 900 million-1.5 billion polys per second. But the more I think about it, I think it's going to be 10 billion plus.
 
We need to represent 3D meshes in better ways.
IMHO multiresolution representations are the way to go..
We just have to learn how to do that fast on next-generation hw.

ciao,
Marco
 
Panajev2001a said:
I've said this before: I seriously looked at both of these technologies for a racing game a few years ago, and it was cheaper to store a very high-res model (>25,000 polys) than to store the subdivision surface model or the NURBS model. You have to get really close to a 25,000-poly car model for the polygons to become obvious (the model had bulbs in the headlights!).

Please, do expand on this and break some myths along the way.

It seems so strange to me that a NURBS model or a subdivision surface model can take more memory storage space than a straight 30,000-polygon model.

It is also true that if the NURBS surface uses a ton of control points, you have to store them.

The idea with NURBS and subdivision surfaces is to make surfaces smoother and more detailed with less memory storage cost, but I can see how modelling VERY VERY small details on lots of different parts on screen (you tend to use many more small patches rather than one single big patch) might take quite a big toll.

Is this what you are saying ?

Thanks for the help ERP :).

The first misconception is control point density: the number of control points required to model anything "real world" is very large. We wrote a Catmull-Clark surface editor in Maya and had an artist build a car; the control point density was between 1/3 and 1/4 of the number of verts in our polygon model. However, with the subdivision surface we also need to store connectivity info for the mesh, and for any sort of reasonable tessellation algorithm this gets very expensive. I don't remember the exact figures off the top of my head, but it was very close between the two, with a slight edge towards polygons. The polygon mesh was also easier to work with and had less obvious problems.

With NURBS we had two issues. One was that it was difficult to model continuous smooth surfaces; artists apparently usually cheat in CG and just overlap pieces. The second was that we needed very large numbers of control points to model anything with accuracy, and they were utterly useless without trimming. Trimming was just too expensive to consider.
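To make the "connectivity info" cost concrete, here is one generic way a Catmull-Clark tessellator stores adjacency alongside the control points (a half-edge layout; an illustration only, not necessarily what the editor above used):

```c
/* Illustration only: a generic half-edge layout for the "connectivity
 * info" a Catmull-Clark tessellator needs, next to a plain triangle
 * mesh. Not necessarily what the editor above stored. */
#include <stdint.h>
#include <stdio.h>

/* plain triangle mesh: positions plus 3 indices per triangle */
typedef struct {
    float    (*positions)[3];
    uint16_t  *indices;          /* 3 * num_tris entries */
    uint32_t   num_verts, num_tris;
} TriMesh;

/* half-edge record: this adjacency is what lets a tessellator walk the
 * control cage, and it is pure overhead relative to the triangle mesh */
typedef struct {
    uint32_t origin;   /* vertex this half-edge leaves from */
    uint32_t twin;     /* opposite half-edge                */
    uint32_t next;     /* next half-edge around the face    */
    uint32_t face;     /* owning face                       */
} HalfEdge;

typedef struct {
    float    (*control_pts)[3]; /* ~1/3-1/4 of the poly verts */
    HalfEdge  *half_edges;
    uint32_t  *vert_edge;       /* one outgoing half-edge per vertex */
    uint32_t   num_ctrl, num_half_edges;
} SubdivCage;

int main(void)
{
    printf("per-vertex position: %zu bytes\n", sizeof(float[3]));
    printf("per half-edge      : %zu bytes\n", sizeof(HalfEdge));
    return 0;
}
```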
 
Cheers guys....just been catching up on the thread...

Paul said:
I bet Pana is going absolutely nuts reading this, a sensory overload for him.

If Pana gets sensory overload from the patent, I daren't think what'd happen if he gets a PS3 dev kit! :oops:

Fafalada said:
Jaws wrote:
but could this potentially be the patent relating to the much vaunted PixelEngine in the PS3's GPU aka Visualizer?
Taking a guess based on what I've read so far, this strikes me as an attempt at a fast and compact programmable replacement for conventionally hardwired circuitry (eg. rasterization, filtering, primitive setup etc.).

If I'm right, the implications would be interesting - how about a programmable primitive processor? And jvd and co. should jump for joy about the idea of having a configurable pixel feature set.

That said, I don't think this is a patent related to the entire Visualizer, just the blackbox parts previously thought of as fixed hw, yeah. (Then again that may have something to do with me refusing to accept any idea of having multiple ISAs for geometry and fragment computation parts... In my dream world we're still using APUs for all shading ops.)

Would it really matter if APUs doing all shading ops (vertex and pixel) were replaced by, say, vertex shading via APUs and pixel shading via these SALPs in the PixelEngine? Indeed, the GPU may be without APUs and replaced by these SALPs entirely... :?

Mmm... I also see accepting one ISA for both APUs and SALPs as difficult from a Cell philosophy standpoint. :? Maybe the Cell ISA has extensions for graphics?

I'm not clear how software Cells will work on the SALPs. Or is there meant to be a consistent ISA for Cell graphics? I never really got the whole distributed graphics thing with Cell. E.g., would an app written for a Cell PDA client work on a Cell PS3 (with a different GPU)? Would the Cell OS use a JIT-type compiler to hide this from other types of GPUs on different Cell clients? :?



McFly said:
These days even patents sound like marketing papers:

Quote:
complicated processing flow with an extemporaneous and explosive amount of operations

Can we even talk about pipes in that patent? From my impression there are 256 SALCs (serial arithmetic logic circuits?) connected together in 8 blocks (each block with its own bus), but each of them can be accessed individually. The output happens over 32 serial output lines. Is that right?
'Explosive amount...' :D, that's what I thought! Maybe it'll fry the GPU at 4GHz!

Reading Fig. 6 in the patent, there are 256 SALPs (serial operation pipelines) and each SALP consists of 32 SALCs (serial arithmetic-logic circuits)... so it can be thought of as 256 pipelines with 32 stages each? A total of 8192 SALCs! :oops: Whether that's per PixelEngine per VS (*4) or for the entire GPU, the patent isn't clear. These SALCs do seem to be tiny though, operating usually on 1-3 bits... :idea:
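For anyone wondering what a circuit that only handles a few bits at a time actually does, here's a toy bit-serial add in C: a 1-bit full adder clocked over N ticks to add N-bit words. This is just the generic bit-serial idea, not something lifted from the patent:

```c
/* A toy bit-serial add: a 1-bit full adder clocked over N ticks to add
 * two N-bit words LSB-first. Generic bit-serial arithmetic, not taken
 * from the patent itself. */
#include <stdint.h>
#include <stdio.h>

static uint32_t bit_serial_add(uint32_t a, uint32_t b, int width)
{
    uint32_t sum = 0, carry = 0;
    for (int t = 0; t < width; ++t) {       /* one bit per "cycle"   */
        uint32_t ai = (a >> t) & 1u;
        uint32_t bi = (b >> t) & 1u;
        sum   |= (ai ^ bi ^ carry) << t;    /* 1-bit full adder      */
        carry  = (ai & bi) | (carry & (ai ^ bi));
    }
    return sum;
}

int main(void)
{
    /* a 24-bit add ties up such a unit for 24 ticks */
    printf("%u\n", bit_serial_add(1234567u, 7654321u, 24));
    return 0;
}
```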
 
ERP said:
We wrote a Catmull clark surface editor in Maya and had an artist build a car, the control point density was between 1/3 to 1/4 of the number of verts in our polygon the model
I agree.
In fact I'd work in another way. I'd let the artists work with the basic primitives they like, then I'd convert those primitives into a rough polygonal (triangle) model, and from there one would convert this polygonal mesh into a preferred (loddable? compressed? etc..) representation.


however with the subdivision surface we also need to store connectivity info for the mesh.
This can be avoided if one can limit the min-max valence of a vertex,
and can 'waste' some memory to store a lot of precalculated tables..
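A minimal sketch of what such precalculated tables could look like, if one only bakes the standard Catmull-Clark vertex-rule weights per valence clamped to 3-9 (real schemes precompute much more; this is only meant to show the idea):

```c
/* Sketch: per-valence tables of the Catmull-Clark vertex-rule weights
 * (F + 2R + (n-3)P) / n, for valences clamped to 3..9. Real schemes
 * precompute far more than this; it is only meant to show the idea. */
#include <stdio.h>

#define MIN_VALENCE 3
#define MAX_VALENCE 9

typedef struct { float w_face, w_edge, w_self; } VertexStencil;

static VertexStencil stencils[MAX_VALENCE + 1];

static void build_stencil_tables(void)
{
    for (int n = MIN_VALENCE; n <= MAX_VALENCE; ++n) {
        stencils[n].w_face = 1.0f / n;        /* avg of adjacent face points   */
        stencils[n].w_edge = 2.0f / n;        /* avg of adjacent edge midpoints */
        stencils[n].w_self = (n - 3.0f) / n;  /* the old vertex position       */
    }
}

int main(void)
{
    build_stencil_tables();
    for (int n = MIN_VALENCE; n <= MAX_VALENCE; ++n)
        printf("valence %d: %.3f %.3f %.3f\n", n,
               stencils[n].w_face, stencils[n].w_edge, stencils[n].w_self);
    return 0;
}
```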

With Nurbs we had two issues, one was that it was difficult to model continuous smooth surfaces, artists apparently usually cheat in CG and just overlap pieces. The second was we needed both very large numbers of control points to model anything with accuracy, and they were utterly useless without trimming. Trimming was just was too expensive to consider.
I agree again.. NURBS should die ;)
Subdivision surfaces are better in every aspect I can think of at this moment.
Moreover one doesn't need to model fine details.. just a smooth surface/domain that can be displaced or bump mapped.

ciao,
Marco
 
Jaws said:
Would it really matter if APUs doing all shading ops (vertex and pixel) were replaced by, say, vertex shading via APUs and pixel shading via these SALPs in the PixelEngine? Indeed, the GPU may be without APUs and replaced by these SALPs entirely... :?
SALPs can effectively replace those parts of a GPU devoted to rasterizing and zbuffer/alpha/stencil tests. Shading is work for the mighty APUs :)
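For instance, the z/alpha/stencil part is mostly narrow integer compares per fragment, the kind of small operations tiny serial units could chew on. A rough C picture (all names are illustrative, nothing here is from the patent):

```c
/* Rough picture of that per-fragment back-end work as plain C; all
 * names are illustrative, nothing here is from the patent. */
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint32_t depth; uint8_t alpha; } Fragment;
typedef struct { uint32_t depth; uint8_t stencil; } FramebufferSample;

/* alpha, stencil and depth tests: just narrow integer compares, the
 * sort of operation a handful of serial bit-ALUs could handle */
static bool fragment_passes(const Fragment *f, const FramebufferSample *fb,
                            uint8_t alpha_ref, uint8_t stencil_ref)
{
    if (f->alpha < alpha_ref)       return false;  /* alpha test           */
    if (fb->stencil != stencil_ref) return false;  /* stencil test (EQUAL) */
    if (f->depth >= fb->depth)      return false;  /* depth test (LESS)    */
    return true;
}

int main(void)
{
    Fragment          f  = { 100, 200 };  /* depth, alpha          */
    FramebufferSample fb = { 500, 1 };    /* stored depth, stencil */
    return fragment_passes(&f, &fb, 128, 1) ? 0 : 1;
}
```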

Maybe the Cell ISA has extensions for graphics?
I hope it hasn't!
If CELL has some kind of support for graphics (like texture-fetching instructions, but I wouldn't bet on it), I hope it will not be via ISA extensions.

These SALCs do seem to be tiny though, operating usually on 1-3 bits... :idea:
That's good for antialiasing or dithering..yeah :)
 
nAo said:
Jaws said:
Would it really matter if APUs doing all shading ops (vertex and pixel) were replaced by, say, vertex shading via APUs and pixel shading via these SALPs in the PixelEngine? Indeed, the GPU may be without APUs and replaced by these SALPs entirely... :?
SALPs can effectively replace those parts of a GPU devoted to rasterizing and zbuffer/alpha/stencil tests. Shading is work for the mighty APUs :)

Uh-oh... so we might have found the building block for the Pixel Engines... thanks to Sony for the fun treasure hunt game :).
 
nAo said:
however with the subdivision surface we also need to store connectivity info for the mesh.
This can be avoided if one can limit the min-max valence of a vertex,
and can 'waste' some memory to store a lot of precalculated tables..

Actually vertex valence turned out to be much more of an issue than we had anticipated. Before we had the model built, we'd estimated <10% extraordinary verts. When the model was finished it was more like 80% extraordinary.

I spent a fair amount of time looking at the model afterwards and there were really no unreasonable constructs in there. In fact the large number of high valence verts was a result of minimizing the patches in the model.

There are a number of compression schemes for geometry that might work, and a lot of things can be streamed with care. Where it starts to become an issue is large expanses of terrain (which for various reasons can't be a height map and can't be procedurally generated) with long viewing distances. I just have to laugh when people propose these ludicrous polygon counts they're expecting to see in next-gen titles.

Besides, we should be discussing shader instructions/pixel; it's a much more interesting benchmark.
 
nAo said:
I agree.
In fact I'd work in another way. I'd let the artists work with the basic primitives they like, then I'd convert those primitives into a rough polygonal (triangle) model, and from there one would convert this polygonal mesh into a preferred (loddable? compressed? etc..) representation.
Absolutely. As poly counts increase, artists need to be taken out of the loop with regard to making render-friendly art. They have enough problems making it look pretty without having to work around bizarre rules that make perfect sense from an ASM level....

Hoppe has some good stuff with geometry images and displaced subdivision surfaces for this kind of thing.
 
DeanoC said:
Hoppe has some good stuff with geometry images and displaced subdivision surfaces for this kind of thing.
Geometry images need more research (there are too many drawbacks at this time, imho).. DSS are very interesting, but there are literally billions of ways to implement them on a modern GPU, so there is a lot of research to do along this path as well.
Computer graphics is fun :)
 
Yay, nAo also has similar ideas about SALPs... perhaps I'm only a little crazy... :p

Jaws said:
Would it really matter if APUs doing all shading ops (vertex and pixel) were replaced by, say, vertex shading via APUs and pixel shading via these SALPs in the PixelEngine? Indeed, the GPU may be without APUs and replaced by these SALPs entirely...
As noted, the idea seems to be that SALPs effectively 'are' the core of the pixel engine black box (if you recall the original Cell patent, pixel engine was denoted separate from APU shaders).
Anyway, I second nAo in regards to the ISA issue; if Cell breaks on an ISA problem as simple as this, it's a bit of a failed ideology from the get-go.

nAo said:
I'd let the artists work with the basic primitives they like, then I'd convert those primitives into a rough polygonal (triangle) model, and from there one would convert this polygonal mesh into a preferred (loddable? compressed? etc..) representation.
Well that's how it's normally done with compressed meshes no? :p The question is how much control we can afford to take away from artists, I mean converting to some exotic subdivision scheme for compression may have unpredictable results...
 
Fafalada said:
Well that's how it's normally done with compressed meshes no? :p
I don't know Faf..I'm a newbie as games developer ;)

The question is how much control we can afford to take away from artists, I mean converting to some exotic subdivision scheme for compression may have unpredictable results...
Umh.. those kinds of problems can be avoided. I'd worry about lossy mesh compression..
What about a 3D artist gone crazy about the fine details he/she modelled on a billion-triangle mesh that your compression scheme has just smoothed away? :D

ciao,
Marco
 
On the manufacturing side...

With the SALC/SALP I see a very scalable structure that, once again, uses the strength of repeating identical functional blocks over and over, the same way we achieve strength in CELL: using more and more APUs per PE and more PEs per chip (Kahle's patent thought about having 1 more APU per PE than specified, to increase redundancy and allow for high yields [putting in 9 APUs gives a higher chance that 8 of them work in the final chip]).
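A quick sanity check of that redundancy argument, assuming independent APU defects and a made-up per-APU yield:

```c
/* Sanity-checking the spare-APU argument with an assumed, independent
 * per-APU yield; the 0.95 figure is made up for illustration. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double p = 0.95;  /* assumed chance a single APU is functional */

    /* build exactly 8, need all 8 good */
    double yield_8_of_8 = pow(p, 8);

    /* build 9, need at least 8 good: all 9 good, or exactly one bad */
    double yield_8_of_9 = pow(p, 9) + 9.0 * pow(p, 8) * (1.0 - p);

    printf("8 APUs, no spare: %.3f\n", yield_8_of_8);  /* ~0.66 */
    printf("9 APUs, 1 spare : %.3f\n", yield_8_of_9);  /* ~0.93 */
    return 0;
}
```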

I think this could be a nice way to keep manufacturing costs under control even with a big chip's surface.

Debugging costs should be lower: once you have fully designed, synthesized, manufactured and tested an APU or a SALC block (or a full SALP), then replicating APUs and SALCs over the chip represents less of an issue than filling the rest of the chip's surface with other custom units (which you would have to separately develop and test) :).

What do you think ?
 
ERP said:
I spent a fair amount of time looking at the model afterwards and there were really no unreasonable constructs in there. In fact the large number of high valence verts was a result of minimizing the patches in the model.
Were your meshes reparametrized?
Anyway, I don't fear non-regular vertices.. but I'd want to clamp valence between reasonable values, like 3-9. The higher the max valence, the larger the tables..

Besides, we should be discussing shader instructions/pixel; it's a much more interesting benchmark.
Sure..what about your figures for the next gen? :)
 
nAo said:
What about a 3D artist gone crazy about the fine details he/she modelled on a billion-triangle mesh that your compression scheme has just smoothed away?
Well yeah, that's what I meant, converting to another representation is pretty much lossy by definition.
Anyway I guess we can always teach artists to avoid making things that get 'lost' or modified in a bad way :p (I mean, the currently used realtime schemes for meshes are mostly just quantization and artists still have to be careful about precision issues occasionally).

Panajev said:
What do you think ?
I think I still want my programmable primitive processor :p
 
Anyway, I second nAo in regards to the ISA issue; if Cell breaks on an ISA problem as simple as this, it's a bit of a failed ideology from the get-go.

I do not think it would: Software Apulets do not target the Pixel Engine part directly.

I think that the Pixel Engines would be controlled by a different software block, one that could be replaced by just about anything inside.

Think about a room full of x86-based PCs: it does not matter one bit what GPU they use as long as each supports DirectX (well, let's not scrutinize down to the atom level, come on, give me a little break :p).

I see all CELL-based devices which have some sort of display attached to them, or that are designed to output directly to a display, having some sort of graphics API that could run on the APUs if there were no Pixel Engine (software fall-back), or would run on these SALC/SALP-based Pixel Engines, or on other Pixel Engine hardware which answers to the specifications of the graphics API.
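One way to picture that graphics API idea is a backend table the Cell OS binds to whatever pixel hardware (or APU software path) a given device has; every name below is hypothetical, just to illustrate the abstraction:

```c
/* Hypothetical sketch of such an API: a backend table the Cell OS could
 * bind to a SALP Pixel Engine, an APU fall-back, or any other hardware
 * that meets the spec. Every name here is made up. */
#include <stdio.h>

typedef struct {
    const char *name;
    void (*rasterize)(const void *triangles, int count);
} PixelBackend;

static void salp_rasterize(const void *tris, int count)
{ (void)tris; printf("SALP Pixel Engine path: %d triangles\n", count); }

static void apu_rasterize(const void *tris, int count)
{ (void)tris; printf("APU software fall-back: %d triangles\n", count); }

static const PixelBackend salp_backend = { "salp", salp_rasterize };
static const PixelBackend apu_backend  = { "apu",  apu_rasterize  };

int main(void)
{
    /* the application only ever talks to the API, so the same software
     * cell runs regardless of what pixel hardware sits underneath */
    const PixelBackend *gpu = &salp_backend;   /* or &apu_backend */
    gpu->rasterize(NULL, 1024);
    (void)apu_backend;
    return 0;
}
```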
 
Fafalada said:
As noted, the idea seems to be that SALPs effectively 'are' the core of the pixel engine black box (if you recall the original Cell patent, pixel engine was denoted separate from APU shaders).

My current thinking is wondering if/that this Patent could describe a microarchitecture which is capable of supporting the form of rendering described in the SCE Patent nAo posted.

My thinking after that patent centered on how you could feasibly implement something tantamount to it in silicon. Much of the flexibility it showed (and its strongest points) is hard to imagine being implemented in an area-efficient manner, due to the recurrent and redundant logic constructs needed. Running it in what Jvd would incorrectly label "full software" (or something like that, which would really be akin to an APU or shader) would be horribly inefficient compared with a conventional raster pipeline (area/process constant), due to rasterization intrinsically having a [more] linear bound. Something that was 'tighter' in construction and granularity seemed to be needed: take a current raster pipeline, break it into logical chunks and maximize the computation and, more importantly, the connectivity between them. That would appear to be what this is.
 
Fafalada said:
I think I still want my programmable primitive processor :p

Let's have a whip-round and see if we can get one; I'm sure we can make one with some sticky-back plastic and a washing-up bottle (Blue Peter reference to confuse non-Brits) :LOL:

All we really need is a GPU that can read and write system memory at will.... Now I wonder where we could get a GPU that can arbitrarily access memory ;)
 
Is it just me or is this just describing a pipeline system?

How I read it...

Each set of 32 serial ALUs combines to provide one operation per cycle on a 4-channel high-precision ALU (FP24 would need 8 3-bit ALUs per channel). Less precise data gets more operations per cycle.

Each one of the 256 units then operates on a single fragment as it passes through the programmed rasterisation steps.

The actual number of fragments issued per cycle depends on the rasterisation complexity. Given that a scissor test would take 8 cycles and depth buffering a minimum of 4 cycles, you get an idea of how amazingly complex rasterisation is. If we are nice and reckon on 50 cycles for a fragment to go from the end of the fragment shader to the framebuffer (this is far too low; think about stencil, colour and depth operations), we would get 5 actual fragments per cycle.

A modern PC video card has something like 1000 fragments in progress and can output up to 16 per cycle, so this is actually fairly un-parallel by graphics standards...
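Re-deriving those back-of-envelope numbers (the 50-cycle latency is, as said, a deliberately low guess, not a measured figure):

```c
/* Re-deriving the back-of-envelope numbers above; the 50-cycle latency
 * is the deliberately low guess from the post, not a measured figure. */
#include <stdio.h>

int main(void)
{
    /* FP24, 4 channels, 3-bit serial ALUs */
    int alus_per_channel = 24 / 3;               /* 8           */
    int alus_per_unit    = alus_per_channel * 4; /* 32 per SALP */

    /* 256 units, ~50 cycles from shader output to framebuffer */
    int units  = 256;
    int cycles = 50;
    double fragments_per_cycle = (double)units / cycles;  /* ~5 */

    printf("serial ALUs per unit: %d\n", alus_per_unit);
    printf("fragments per cycle : %.1f\n", fragments_per_cycle);
    return 0;
}
```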
 