what slide?!
Jawed said: This slide got missed in the [H] article:
I measure about 18% of the frametime where the vertex load exceeds the pixel load. That long section at the end is about 10%. If that got 5x faster in a unified GPU, then that's an easy ~8% performance gain.
Jawed
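The estimate in the post above can be checked with a few lines of arithmetic. This is only a sketch: the 10% share and the 5x speed-up are the figures from the post, the function name `frame_speedup` is made up for illustration, and it assumes the rest of the frame takes exactly as long as before.

```python
def frame_speedup(fraction, local_speedup):
    """Amdahl-style estimate for speeding up one part of a frame.

    fraction      -- share of the original frametime spent in that part
    local_speedup -- how many times faster that part becomes
    Returns (new_frametime, time_saved, frame_rate_gain), all relative
    to an original frametime of 1.0.
    """
    new_time = (1.0 - fraction) + fraction / local_speedup
    time_saved = 1.0 - new_time
    frame_rate_gain = 1.0 / new_time - 1.0
    return new_time, time_saved, frame_rate_gain


# Figures from the post: a 10% section that runs 5x faster.
new_time, time_saved, frame_rate_gain = frame_speedup(0.10, 5.0)
print(f"frametime saved: {time_saved:.1%}, frame-rate gain: {frame_rate_gain:.1%}")
```

Under those assumptions the frame gets 8% shorter, which works out to a slightly higher gain in frame rate (1/0.92, a little under 9%), so "~8%" is a fair round number either way.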
geo said: How's that Vista GUI benchmark coming along, Demirug?
Or maybe "we don't comment on unannounced products", as ATi used to say. But looking at it the other way round, I do think their R6xx might fit well with your architecture (I mean it may be similar, to some degree, to the unified design you proposed).
RoOoBo said: More likely, and after reading the whole presentation, it looks like someone from marketing (or similar) had my slides at hand, and using them was way faster than asking one of their architecture teams to provide their own graphics.
ehhhhhmmmmmm ?!
Guennardi takes over the keyboard and mouse from Will and first tempers my expectations by reminding me that what I will be shown is actually running on current-generation hardware. "It's an X1900. What we've done is take the pixel shader unit and run everything in a Direct3D 10-like fashion on it – vertex shader, geometry shader, pixel shader."
"Essentially, you're emulating the Unified Shader Architecture on just the pixel shader?" I ask.
"Exactly."
Jawed said: In other words, you use VS->PS->VB (R2VB) to generate the final triangles (that is, to perform geometry shading). You then re-submit the newly generated triangles, VB->VS->PS, to render the final result.
|Setup|       |Geom Shad|       |Save Results|       |dummy|       |Vertx Shad|       |Save Results|       |dummy|       |Pixel Shad|       |   Output   |
| VS  | ----> |   PS    | ----> |     VB     | ----> | VS  | ----> |    PS    | ----> |     VB     | ----> | VS  | ----> |    PS    | ----> |Frame Buffer|
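To make the chain in the diagram concrete, here is a toy model in Python. It is not how the X1900 demo is implemented; it only mimics the data flow. The helper `r2vb_pass`, the shader functions, the sample points, and the expansion factor `K` are all invented for illustration. The one constraint it tries to respect is that a pixel shader writes only to its own output slot, so geometry "amplification" is done by making the output domain larger and having each slot look up its source primitive.

```python
def r2vb_pass(num_slots, pixel_shader):
    """One render-to-vertex-buffer pass.

    The dummy VS just covers num_slots output locations; the PS then
    runs once per slot and can only write the slot it was invoked for.
    """
    return [pixel_shader(slot) for slot in range(num_slots)]


# Input primitives: three 2D points (made-up data).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
K = 2  # each point is expanded into K vertices (hypothetical factor)


def geom_shader(slot):
    # Gather: slot -> source point, plus which of its K vertices this is.
    x, y = points[slot // K]
    return (x + 0.5 * (slot % K), y)


# Pass 1: VS (setup) -> PS (geometry shading) -> VB
vb_geom = r2vb_pass(len(points) * K, geom_shader)


def vertex_shader(slot):
    # A stand-in transform: uniform scale by 2.
    x, y = vb_geom[slot]
    return (2.0 * x, 2.0 * y)


# Pass 2: dummy VS -> PS (vertex shading) -> VB
vb_xform = r2vb_pass(len(vb_geom), vertex_shader)

# Pass 3: dummy VS -> PS (pixel shading) -> frame buffer
frame_buffer = [(x, y, "white") for (x, y) in vb_xform]
print(frame_buffer)
```

The order of the passes follows the diagram (geometry shading first, then vertex shading), which is not necessarily the order a real Direct3D 10 pipeline would use.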
Jawed said: The first "dummy VS" actually organises the creation of the VB, as I understand it. Since a PS can only render to memory locations determined by the triangle (or quad) it's shading, the initial VS effectively provides the PS with the domain of inputs. For example, if you want to start with 100 vertices before performing R2VB, then you prolly need something like a 10x10 quad as the starting geometry (or a 100x1 quad). Humus can prolly explain this in English much better...
Once you've generated the VB, you need to do backface culling, viewport clipping etc., all of which are dedicated hardware units in the vertex processing section of the GPU. You could do these in software in the PS, I suppose, but it seems wasteful to ignore the functionality that's already there.
Jawed
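The 10x10 quad example from the post above can be written down as a small helper. This is a sketch under stated assumptions: `quad_dims` and `slot_to_pixel` are hypothetical names, the near-square layout is just one reasonable choice (a 100x1 strip works too, as the post says), and any real limit on render target size is ignored.

```python
import math


def quad_dims(num_vertices):
    """Smallest near-square (width, height) with width * height >= num_vertices."""
    if num_vertices <= 0:
        raise ValueError("need at least one vertex")
    width = math.isqrt(num_vertices)
    if width * width < num_vertices:
        width += 1
    height = -(-num_vertices // width)  # ceiling division
    return width, height


def slot_to_pixel(slot, width):
    """Map a linear vertex index to the (x, y) pixel that the PS writes."""
    return slot % width, slot // width


print(quad_dims(100))          # the case from the post
print(slot_to_pixel(99, 10))   # last vertex lands in the last pixel
```

When the vertex count is not a perfect rectangle (101 vertices, say), the quad covers a few extra pixels, and those slots would simply be ignored when the buffer is re-submitted.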