My opinion on programming for a PPP.
A 'static' PPP (where the number of output triangles is constant) would be fairly simple to write shaders for: you'd just fill in an output array. Such a design would likely expose a 'caps' value specifying the maximum number of triangles that can be output.
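To make the 'static' case concrete, here is a rough C++ sketch of what such a shader might look like. Everything in it is made up for illustration: the `Vec3`/`Triangle` types, the `kMaxOutputTriangles` caps value (16 is an arbitrary number), and the `subdivide` shader, which always turns one input triangle into four.

```cpp
#include <array>
#include <cstddef>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 v[3]; };

// Hypothetical 'caps' value: the most triangles the hardware lets a
// static PPP shader output per input triangle (16 is an assumption).
constexpr std::size_t kMaxOutputTriangles = 16;

// This particular shader always outputs exactly four triangles.
constexpr std::size_t kOutputTriangles = 4;
static_assert(kOutputTriangles <= kMaxOutputTriangles,
              "shader exceeds the caps limit");

// Point halfway between a and b.
inline Vec3 midpoint(const Vec3& a, const Vec3& b) {
    return {(a.x + b.x) * 0.5f, (a.y + b.y) * 0.5f, (a.z + b.z) * 0.5f};
}

// Static PPP shader: splits one triangle into four by joining the edge
// midpoints. The output is a fixed-size array that simply gets filled in.
inline std::array<Triangle, kOutputTriangles> subdivide(const Triangle& in) {
    const Vec3 a = in.v[0], b = in.v[1], c = in.v[2];
    const Vec3 ab = midpoint(a, b);
    const Vec3 bc = midpoint(b, c);
    const Vec3 ca = midpoint(c, a);

    std::array<Triangle, kOutputTriangles> out{};
    out[0] = Triangle{{a, ab, ca}};   // corner at a
    out[1] = Triangle{{ab, b, bc}};   // corner at b
    out[2] = Triangle{{ca, bc, c}};   // corner at c
    out[3] = Triangle{{ab, bc, ca}};  // centre triangle
    return out;
}
```

Because the output count is known at compile time, the check against the caps value can happen before the shader ever runs.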
A 'dynamic' PPP (where the number of output triangles can vary for each input triangle) would likely require special functions/instructions that output a single triangle or vertex at a time but can be called multiple times, in a loop for example. The point of emitting one at a time is that the call can stall if the buffer between the PPP and the VS has become full. I'm thinking of something along the lines of OpenGL immediate mode: glBegin(...); glEnd();
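As a rough illustration of the 'dynamic' case, here is a C++ sketch that models the buffer between the PPP and the VS as a bounded FIFO in software. The names (`VertexFifo`, `emit`, `take`, `emitTriangles`) are all hypothetical, and real hardware would stall the shader rather than block a thread; the idea is only to show an emit call that can be used in a loop and waits while the buffer is full.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>

struct Vertex { float x, y, z; };

// Hypothetical stand-in for the buffer between the PPP and the VS.
class VertexFifo {
public:
    explicit VertexFifo(std::size_t capacity) : capacity_(capacity) {}

    // PPP side: outputs one vertex. Blocks (the 'stall') while the
    // buffer is full, until the VS side has taken something out.
    void emit(const Vertex& v) {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push(v);
        notEmpty_.notify_one();
    }

    // VS side: removes the oldest vertex. Blocks while the buffer is empty.
    Vertex take() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !queue_.empty(); });
        Vertex v = queue_.front();
        queue_.pop();
        notFull_.notify_one();
        return v;
    }

private:
    std::size_t capacity_;
    std::queue<Vertex> queue_;
    std::mutex mutex_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
};

// Hypothetical dynamic PPP shader: the number of triangles is only known
// at run time, so vertices are emitted one at a time inside a loop.
// Returns the number of vertices emitted.
inline std::size_t emitTriangles(VertexFifo& out, std::size_t triangleCount) {
    std::size_t emitted = 0;
    for (std::size_t i = 0; i < triangleCount; ++i) {
        for (int corner = 0; corner < 3; ++corner) {
            // Placeholder positions; a real shader would compute these.
            out.emit({static_cast<float>(i), static_cast<float>(corner), 0.0f});
            ++emitted;
        }
    }
    return emitted;
}
```

With a small capacity, running `emitTriangles` on one thread and calling `take` on another makes `emit` wait whenever the consumer falls behind, which is the behaviour a one-at-a-time output instruction would allow.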