What is PPP?

Tahir2

What is a per-primitive processor?
What is a primitive wrt graphics acceleration?
Can someone explain in layman's terms?
Thanks in advance.
 
I don't know what a per-primitive processor would be, but PPP could also mean things like Point to Point Protocol, or Programmable Primitive Processor, which IMO is what you're interested in?
 
"Programmable primitive processor" sounds about right.

I believe DemoCoder has some information on this aspect of graphics acceleration... was it not thought to have been in the NV3x design before the GFFX was released?

Edit: typo
 
You can break up the frequency of computation into several groups:

1) per pixel (pixel shaders)
2) per vertex (vertex shaders)
3) per object (primitive shader)
4) per scene (CPU, constants, etc)

PPP targets #3.

It can be used for programmable tessellation, physics, and a bunch of other stuff.
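
As a rough illustration of that breakdown (a CPU-side sketch of my own, not how any real GPU actually schedules work), you can think of the four frequencies as nested loops, with the PPP sitting in the per-object loop:

Code:
/* Illustrative serial model only; real hardware pipelines and parallelizes this. */
typedef struct { float pos[3]; } Vertex;
typedef struct { Vertex *verts; int nverts; int npixels; } Object;
typedef struct { Object *objs; int nobjs; } Scene;

static void per_scene(Scene *s)         { (void)s; /* CPU work, constants        */ }
static void per_primitive(Object *o)    { (void)o; /* PPP: tessellation, physics */ }
static void per_vertex(Vertex *v)       { (void)v; /* vertex shader              */ }
static void per_pixel(Object *o, int p) { (void)o; (void)p; /* pixel shader      */ }

void render(Scene *s)
{
    per_scene(s);                                   /* 4) per scene  */
    for (int o = 0; o < s->nobjs; o++) {
        per_primitive(&s->objs[o]);                 /* 3) per object */
        for (int v = 0; v < s->objs[o].nverts; v++)
            per_vertex(&s->objs[o].verts[v]);       /* 2) per vertex */
        for (int p = 0; p < s->objs[o].npixels; p++)
            per_pixel(&s->objs[o], p);              /* 1) per pixel  */
    }
}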

No one knows if it was an NVidia feature. It was described in an academic paper by a guy who now works at NVidia. However, NVidia employs many people who publish interesting papers (e.g. Ned Grene) and not all of their designs make it into NVidia products.
 
DemoCoder said:
You can break up the frequency of computation into several groups:

1) per pixel (pixel shaders)
2) per vertex (vertex shaders)
3) per object (primitive shader)
4) per scene (CPU, constants, etc)

PPP targets #3.
I think that PPP would be more of a per-surface approach, which would be somewhere between per-object and per-vertex. The actual definition of the surface, of course, would be arbitrary, so I suppose one surface could encompass an entire object, but I would find it more common to have multiple surfaces per object.
 
Chalnoth said:
I think that PPP would be more of a per-surface approach, which would be somewhere between per-object and per-vertex. The actual definition of the surface, of course, would be arbitrary, so I suppose one surface could encompass an entire object, but I would find it more common to have multiple surfaces per object.

Never thought of it this way, it's interesting. Yet, wouldn't you just consider each surface intrinsically tied to an object by definition? I don't know, and you allude to as much, but I'm just curious.

Ohh, I think there are two "Es" in his name.
 
Perhaps "per triangle strip/list" or "per HOS" or "per buffer"

I would expect a PPP to be a far more general purpose unit: e.g. random access input/output. I think of it as moving a mini-CPU to the GPU that is handed a vertex buffer as input and produces a vertex buffer as output.
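
Something like this, as a CPU-side C sketch of that interface (the names and the triangle-doubling body are just placeholders): the "primitive program" gets random read access to an input vertex buffer and may append any number of vertices to an output buffer.

Code:
#include <stdlib.h>

typedef struct { float x, y, z; } Vertex;
typedef struct { Vertex *data; int count; int capacity; } VertexBuffer;

/* Append one vertex to the output buffer, growing it as needed. */
static void vb_push(VertexBuffer *vb, Vertex v)
{
    if (vb->count == vb->capacity) {
        vb->capacity = vb->capacity ? vb->capacity * 2 : 64;
        vb->data = realloc(vb->data, vb->capacity * sizeof(Vertex));
    }
    vb->data[vb->count++] = v;
}

/* The "primitive program": random-access reads from the input buffer, any number
   of writes to the output buffer. As a placeholder it just emits each input
   triangle twice, the copy with reversed winding. */
void primitive_program(const VertexBuffer *in, VertexBuffer *out)
{
    for (int i = 0; i + 2 < in->count; i += 3) {
        for (int k = 0; k < 3; k++)  vb_push(out, in->data[i + k]);
        for (int k = 2; k >= 0; k--) vb_push(out, in->data[i + k]);
    }
}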
 
A PPP was in the original NV30 designs AFAIK, although I believe it was eventually scrapped.

Both NV40 and R400 have PPPs, but the R420 does not have one.


Uttar
 
Any references to back it up? At 120M transistors, I don't see much room to fit the PPP, so if it was removed, it must have been removed in the very very early stages, such as "oh, it was on the wishlist, but won't fit"

Unless of course, the PPP is already there, but non-functional on HW.
 
DemoCoder said:
Perhaps "per triangle strip/list" or "per HOS" or "per buffer"

I would expect a PPP to be a far more general purpose unit: e.g. random access input/output. I think of it as moving a mini-CPU to the GPU that is handed a vertex buffer as input and produces a vertex buffer as output.
I really doubt a PPP would be that generalized.

I would rather expect that a PPP would effectively have the same functions available to it as are seen in VS 3.0, but with the added flexibility of operating on a per-surface basis, with special hardware for tessellation.
 
If the PPP cannot create or delete geometry then it is effectively just VS4.0, a vertex shader which perhaps can examine more than one vertex at a time.

This is not the PPP that was described in the real time ray tracing paper. To really be effective, if you're going to allow access to multiple vertices, you need the ability to build data structures, especially in the context of what the PPP was envisioned for -- accelerating raytracing.
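
For reference, the kind of per-frame data structure build being talked about here, sketched on the CPU (a uniform voxel grid, binning by centroid for brevity; the paper's actual scheme may well differ):

Code:
#include <string.h>

#define GRID_N       16
#define MAX_PER_CELL 64

typedef struct { float x, y, z; } Vec3;
typedef struct { Vec3 v[3]; } Triangle;

typedef struct {
    int tri[GRID_N][GRID_N][GRID_N][MAX_PER_CELL];  /* triangle indices per voxel */
    int count[GRID_N][GRID_N][GRID_N];
} Grid;

/* Map a coordinate to a voxel index inside [lo, hi). */
static int cell_of(float c, float lo, float hi)
{
    int i = (int)((c - lo) / (hi - lo) * GRID_N);
    return i < 0 ? 0 : (i >= GRID_N ? GRID_N - 1 : i);
}

/* Rebuild the whole grid from scratch, as you would every frame for dynamic
   geometry. Binning by centroid keeps this short; a real build would bin each
   triangle into every voxel its bounding box touches. */
void grid_rebuild(Grid *g, const Triangle *tris, int ntris, Vec3 lo, Vec3 hi)
{
    memset(g->count, 0, sizeof(g->count));
    for (int t = 0; t < ntris; t++) {
        Vec3 c = { (tris[t].v[0].x + tris[t].v[1].x + tris[t].v[2].x) / 3.0f,
                   (tris[t].v[0].y + tris[t].v[1].y + tris[t].v[2].y) / 3.0f,
                   (tris[t].v[0].z + tris[t].v[1].z + tris[t].v[2].z) / 3.0f };
        int ix = cell_of(c.x, lo.x, hi.x);
        int iy = cell_of(c.y, lo.y, hi.y);
        int iz = cell_of(c.z, lo.z, hi.z);
        if (g->count[ix][iy][iz] < MAX_PER_CELL)
            g->tri[ix][iy][iz][g->count[ix][iy][iz]++] = t;
    }
}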
 
actually, for raytracers, you don't need that..

all you need is the intersection algorithm (which can depend on some static data)..

and the intersection shader has to be capable enough of implementing such intersection algorithms.
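
For the record, the sort of routine such an intersection shader would implement, here as a plain C ray/sphere test (just an illustrative sketch):

Code:
#include <math.h>

typedef struct { float x, y, z; } Vec3;

static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3  sub(Vec3 a, Vec3 b) { Vec3 r = { a.x-b.x, a.y-b.y, a.z-b.z }; return r; }

/* Ray/sphere intersection: ray origin o, unit direction d, sphere centre c,
   radius r. Returns 1 and the hit distance in *t, or 0 on a miss. */
int intersect_sphere(Vec3 o, Vec3 d, Vec3 c, float r, float *t)
{
    Vec3  oc   = sub(o, c);
    float b    = dot(oc, d);
    float disc = b*b - (dot(oc, oc) - r*r);
    if (disc < 0.0f) return 0;            /* ray misses entirely      */
    float t0 = -b - sqrtf(disc);          /* nearer of the two roots  */
    if (t0 < 0.0f) t0 = -b + sqrtf(disc); /* origin inside the sphere */
    if (t0 < 0.0f) return 0;              /* sphere behind the ray    */
    *t = t0;
    return 1;
}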
 
A PPP would indeed have to be able to create geometry, otherwise I wouldn't call it a PPP. On an API level I imagine it would look quite similar to a vertex shader, but with a few extra elements. Perhaps like this for 1 level nPatches:

Code:
struct iVertex {
   float3 position;
   ...
   // bla bla  ...
};

void main(iVertex i0, iVertex i1, iVertex i2){
   // Six output vertices: presumably the three input corners plus the three
   // new edge vertices computed by the elided nPatch code below.
   iVertex vertices[6];
   ... 

   // do nPatches from the input triangle
   ...

   // Emit the four sub-triangles. Assumed layout: vertices[0] is one input corner,
   // vertices[1] and vertices[2] are the new edge points adjacent to it,
   // vertices[3] and vertices[5] are the other two corners, and vertices[4] is
   // the remaining edge point.
   CreateTriangle(vertices[0], vertices[1], vertices[2]);
   CreateTriangle(vertices[1], vertices[3], vertices[4]);
   CreateTriangle(vertices[1], vertices[2], vertices[4]);
   CreateTriangle(vertices[2], vertices[4], vertices[5]);
}
 
davepermen said:
actually, for raytracers, you don't need that..

all you need is the intersection algorithm (which can depend on some static data)..

and the intersection shader has to be capable enough of implementing such intersection algorithms.


You're forgetting about building data structures to accelerate ray/voxel hits. Unless your entire scene is completely static, this needs to be updated every frame. Also, traversing voxels requires multiple passes today, not exactly efficient.

There's a difference between needing something because it would be nigh impossible to implement otherwise, and needing something because it runs unacceptably otherwise.
 
Primitive Processor
• Converts graphics primitives into pixels
• 15 interpolators
• 8 dividers – perspective correct rendering

Interpolator         Correction    Stored to FIFO
Red, Green, Blue     perspective   yes, 8 bits
Transparency         perspective   yes, 8 bits
A Texture U, V       perspective   yes, 12 bits
B Texture U, V       perspective   yes, 12 bits
Z Depth              linear        yes, 24 bits
Perspective P        linear        no
X, Y Coordinates     linear        yes, 22+4 bits total
3 Edge Functions     linear        no

This is an early example of a primitive processor which converts primitives into pixels. It is from the Pyramid 3D, which was actually finished.

Now, what do they mean by turning a primitive into a pixel? Sounds like a vertex processor but alludes to a greater degree of manipulation. Is this correct?

http://www.hotchips.org/archive/hc9/hc9pres_pdf/hc97_10c_eerola_2up.pdf

Is this even remotely similar to what is meant by a primitive processor nowadays?
 
Uttar said:
A PPP was in the original NV30 designs AFAIK, although I believe it was eventually scrapped.

Both NV40 and R400 have PPPs, but the R420 does not have one.


Uttar

Irrespective of which feature list it's included on, it's still somewhat of a transistor overkill nowadays, since they're basically restricted to OGL for the time being for full functionality. I honestly doubt that M$ is going to do anything about it prior to DX-Next.
 
Humus said:
A PPP would indeed have to be able to create geometry, otherwise I wouldn't call it a PPP. On an API level I imagine it would look quite similar to a vertex shader, but with a few extra elements. Perhaps like this for 1 level nPatches:
But the PPP doesn't have to have anything terribly different from vertex shaders (or pixel shaders: vs and ps are very similar now, and are getting more similar) to create geometry.

Here's the paradigm I would choose for processing:

1. Have an analytic algorithm for generating arbitrary vertex positions (could be a height map, or the equation for a cylinder, or whatever), as well as all other vertex data, such as texture coordinates. This would be the primitive shader, and would be executed once for each generated vertex.

2. Have a tessellation algorithm that states how the surface is to be divided up into triangles.

Here a "surface" would be an arbitrary 3D object that could use as input a certain maximum number of values (similar to how vertex shaders have specified maximum input values). The input values would have definitions entirely selected by the programmer, but from them, the shader must produce the proper values for each shader.

Alternately, the PPP could be nothing more than a triangle generator, with the VS chosen to do all position processing and interpolation.

Update:
Actually, now that I think about it, it only makes sense to have the PPP act as a simple geometry generator. It would operate best if it calculated geometry positions based upon a very simple, known function. Or, perhaps, based upon a limited set of functions that the developer chooses. Final vertex positioning could be done in the vertex shader quite easily.
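
A CPU-side C sketch of the two pieces being separated here: an analytic function evaluated once per generated vertex, and a fixed tessellation pattern that wires the results into triangles (the cylinder surface and grid density are arbitrary examples):

Code:
#include <math.h>

#define SEGS_U 16   /* tessellation density; would be a per-surface parameter */
#define SEGS_V 8
#define PI     3.14159265f

typedef struct { float x, y, z; } Vec3;

/* Step 1: the analytic function, evaluated once per generated vertex.
   Here it happens to be a unit cylinder; it could equally be a height map,
   a patch, or whatever the programmer defines. */
static Vec3 surface(float u, float v)
{
    Vec3 p = { cosf(u * 2.0f * PI), v, sinf(u * 2.0f * PI) };
    return p;
}

/* Step 2: the tessellation pattern, i.e. how the (u,v) domain is cut into
   triangles. Fills vertices[] and index triples in indices[]; returns the
   number of indices written. */
int tessellate(Vec3 vertices[(SEGS_U + 1) * (SEGS_V + 1)],
               int  indices[SEGS_U * SEGS_V * 6])
{
    for (int j = 0; j <= SEGS_V; j++)
        for (int i = 0; i <= SEGS_U; i++)
            vertices[j * (SEGS_U + 1) + i] =
                surface((float)i / SEGS_U, (float)j / SEGS_V);

    int n = 0;
    for (int j = 0; j < SEGS_V; j++)
        for (int i = 0; i < SEGS_U; i++) {
            int a = j * (SEGS_U + 1) + i, b = a + 1;
            int c = a + (SEGS_U + 1),     d = c + 1;
            indices[n++] = a; indices[n++] = c; indices[n++] = b;
            indices[n++] = b; indices[n++] = c; indices[n++] = d;
        }
    return n;
}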
 
Tahir said:
Is this even remotely similar to what is meant by a primitive processor nowadays?
No, this is what's usually called triangle setup and rasterizer.

One thing I wonder is what would be the best place in the pipeline for a PPP. Depending on what you actually want to do, it might be better to either put it before or after the vertex shader (but always before the clipping stage, of course). E.g. if you want some distance-based tessellation, you either need a PPP after the VS, or do the transformation in the PPP and pass view-space transformed vertices on to the VS, which could interfere with some effects in the VS.
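
For example, a distance-based selection could be as simple as this (a made-up C sketch with arbitrary thresholds), which only works if the PPP can see view-space depth, i.e. it runs after the transform or does the transform itself:

Code:
/* Pick a tessellation factor from view-space depth. Thresholds here are arbitrary. */
int tessellation_level(float view_space_z)
{
    if (view_space_z <  10.0f) return 8;   /* close up: tessellate densely */
    if (view_space_z <  40.0f) return 4;
    if (view_space_z < 100.0f) return 2;
    return 1;                              /* far away: keep the base mesh */
}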
 
DemoCoder said:
davepermen said:
actually, for raytracers, you don't need that..

all you need is the intersection algorithm (which can depend on some static data)..

and the intersection shader has to be capable enough of implementing such intersection algorithms.


You're forgetting about building data structures to accelerate ray/voxel hits. Unless your entire scene is completely static, this needs to be updated every frame. Also, traversing voxels requires multiple passes today, not exactly efficient.

There's a difference between needing something because it would be nigh impossible to implement otherwise, and needing something because it runs unacceptably otherwise.

sure. i just had no need for it, and don't see how my raytracer stuff will need it.

well, then again, yes, a PPP could get abused for per-frame-updates.. :D
 
With a movement towards specifying things to the GPU in batches of associated triangles (triangle strips), as mentioned, isn't creation actually the only thing missing? More specifically, creation of new data with an efficient awareness of other vertex data? If you create a vertex before the vertex shader, isn't that pretty much where the host (software/CPU) is now?

What I'm wondering is where the dramatic benefit from a GPU handling this would come from, and it all seems to center around a fast methodology for creation with performance efficiency maintained (to me, so far, and I understand the random access and discussion about what a "primitive" is to reflect something similar).

I had the thought that pixel shaders with the ability to be fed into vertex shaders could handle creation in some way, but it would take some significant latency hiding to do that in a useful way for this, AFAICS, effectively re-solving the problem solved by triangle strips in the first place. This doesn't seem at all a minor task.

Looking at the VS 3.0 specification, I wonder at the relevance of the indexing changes, the changes to vertex declaration, and vertex streaming frequency, as being able to work with vertex textures to provide such a solution when implemented in hardware. That is, if this view of the problem is accurate at all, and isn't without a fatal flaw for execution.

Here's a reference I'm a bit too rushed to consider properly, but it might at least help explain any error on my part.
 