Who will get there first?

Panajev2001a said:
Fafalada said:
Pana, clipping doesn't require having vertex shaders on the chip. Although you never know, Sony might insist on removing it just to uphold the tradition :?

Well, I know it does not require it, but it would be simpler for nVIDIA to keep the Vertex Pipeline and the Clipping and Culling stages intact, as well as the interface these blocks have with the Triangle Set-up and the rest of the GPU.

Engineering-wise it seems simpler, :(, to have only Triangle Set-up on the GPU and have the developer take care of clipping in software on the SPUs/APUs, but that would be sad and quite annoying IMHO.

Still, the nVIDIA blocks doing culling and clipping have already been designed and I am sure they can handle very high vertex rates: it is not as if Sony/SCE would have to come up with those pieces themselves (seeing how they dealt with it on the PSP... "only the front plane is necessary, let them handle the rest"... what is this? Do they add only 1 clipping plane per GPU generation?!?).

FWIW, as far as I know NVidia is the only IHV to clip entirely post transform (post divide by W). This includes the near plane clip; I don't know how they do this mathematically without backing out the divide (and even then, how they deal with the near plane degeneracy), but that's what they do and it obviously works.

On NV2A the only thing you pass out of the vertex shader is a screen space position.

So clearly on a PS3 type architecture you could have the CPU "vertex shade" and the GPU could still deal with the rest.

The GPU could also just take homogeneous coordinates and clip those, for that matter.
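To make the "just clip the homogeneous coordinates" option concrete, here's a minimal sketch of conventional near-plane clipping done in clip space, before the divide by W (OpenGL-style plane convention; the names are made up for illustration, this isn't any particular vendor's hardware path):

Code:
// Sketch: clip one triangle edge against the near plane while still in
// homogeneous clip space (OpenGL-style convention: inside when z + w >= 0).
// Because we never divide by w here, a vertex behind the eye (w <= 0)
// causes no degeneracy.
#include <cstdio>

struct ClipVtx { float x, y, z, w; };

// Signed distance to the near plane in clip space; positive = inside.
static float nearDist(const ClipVtx& v) { return v.z + v.w; }

// New vertex exactly on the near plane, between an inside and an outside vertex.
static ClipVtx clipEdgeToNear(const ClipVtx& a, const ClipVtx& b)
{
    float da = nearDist(a), db = nearDist(b);
    float t  = da / (da - db);              // parametric intersection
    return { a.x + t * (b.x - a.x),
             a.y + t * (b.y - a.y),
             a.z + t * (b.z - a.z),
             a.w + t * (b.w - a.w) };
}

int main()
{
    ClipVtx inside  = { 0.0f, 0.0f,  1.0f, 2.0f };  // z + w = 3  -> kept
    ClipVtx outside = { 1.0f, 0.0f, -3.0f, 1.0f };  // z + w = -2 -> clipped
    ClipVtx c = clipEdgeToNear(inside, outside);
    printf("clipped vertex: %f %f %f %f\n", c.x, c.y, c.z, c.w);
    return 0;
}

The new vertex's W stays positive, so the later divide by W and viewport mapping are safe; that's exactly the property that gets harder to guarantee once you only have the post-divide position.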
 
How do... How do they clip in screen space without having near-plane clipping issues?

I understand doing NDCS/Homogeneous space clipping, but handling near plane clipping in screen space seems a bit strange.

What do they do in the Vertex Shader to allow this scheme to work? Anything special?
 
ERP said:
The GPU could also just take homogeneous coordinates and clip those, for that matter.
Or not really need to clip them at all :D

Anyway, I can't wrap my head around the screenspace clipping at all, are you sure that's what you pass to the clipper? :oops:
 
Fafalada said:
ERP said:
The GPU could also just take homogeneous coordinates and clip those, for that matter.
Or not really need to clip them at all :D

Anyway, I can't wrap my head around the screenspace clipping at all, are you sure that's what you pass to the clipper? :oops:

There are 3 additional instructions in all NV2A shaders that divide by W and scale to the viewport. The only thing that goes in the opos register is the result of this; you don't pass anything pre-divide in.

They do still have the W coordinate, so they could back out the transform, except that the near plane degeneracy seems to preclude this.

I've never been able to see how they make this work either.
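For reference, here's roughly what those last few instructions amount to, written out in C++ (the struct names and viewport parameters are purely illustrative, not actual NV2A registers):

Code:
// Sketch of the end of an NV2A-style vertex program: divide the clip-space
// position by w, then scale/bias into window coordinates. Only this result
// (plus w, as noted above) goes out of the shader.
#include <cstdio>

struct Vec4 { float x, y, z, w; };

struct Viewport { float halfW, halfH, centerX, centerY, zScale, zBias; };

static Vec4 toScreen(const Vec4& clip, const Viewport& vp)
{
    float invW = 1.0f / clip.w;                   // the divide by w
    Vec4 s;
    s.x = clip.x * invW * vp.halfW + vp.centerX;  // viewport scale + offset
    s.y = clip.y * invW * vp.halfH + vp.centerY;
    s.z = clip.z * invW * vp.zScale + vp.zBias;
    s.w = clip.w;                                 // w itself survives the transform
    return s;
}

int main()
{
    Viewport vp  = { 320.0f, -240.0f, 320.0f, 240.0f, 0.5f, 0.5f };  // 640x480
    Vec4 clipPos = { 1.0f, 0.5f, 1.5f, 2.0f };
    Vec4 screen  = toScreen(clipPos, vp);
    printf("screen: %f %f %f (w=%f)\n", screen.x, screen.y, screen.z, screen.w);
    return 0;
}

And that's where the puzzle comes from: for a vertex with W at or near zero the divide blows up before the clipper ever sees the position, which is the near-plane degeneracy being discussed.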
 
ERP said:
There are 3 additional instructions in all NV2A shaders that divide by W and scale to the viewport. The only thing that goes in the opos register is the result of this; you don't pass anything pre-divide in.

They do still have the W coordinate, so they could back out the transform, except that the near plane degeneracy seems to preclude this.

I've never been able to see how they make this work either.

If their W-division allows for preservation of the quotient (vs. getting a measly overflow) at a 0 divisor, they could simply mark all those out-of-the-VS vertices that need clipping, implicitly inverse-project them back to clip space, clip them there, and then bring them back to screen space, as they've got the connectivity information at this stage.

Anyway, to confirm or reject any such theory we'd need to be able to count how many vertices the clipping stage produces out of a given triangle.
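FWIW, the round trip that theory implies would look something like this (purely a sketch of the idea, not a claim about the actual silicon; the near plane is taken as z = 0 in clip space and the viewport scale/offset is left out for brevity):

Code:
// The theory above, sketched: take the post-divide position plus the kept w,
// multiply back up to clip space, clip the edge against the near plane there,
// then divide the new vertex back down to screen space.
#include <cstdio>

struct Vec4 { float x, y, z, w; };

static Vec4 toClip(const Vec4& s)   { return { s.x * s.w, s.y * s.w, s.z * s.w, s.w }; }
static Vec4 toScreen(const Vec4& c) { return { c.x / c.w, c.y / c.w, c.z / c.w, c.w }; }

// Near plane taken as z = 0 in clip space for this sketch.
static Vec4 clipEdgeNear(const Vec4& a, const Vec4& b)
{
    float da = a.z, db = b.z;
    float t  = da / (da - db);
    return { a.x + t * (b.x - a.x), a.y + t * (b.y - a.y),
             a.z + t * (b.z - a.z), a.w + t * (b.w - a.w) };
}

int main()
{
    // One vertex in front of the near plane, one between the eye and the near
    // plane. The genuinely degenerate case (w <= 0) is exactly what the theory
    // assumes the hardware flags specially; it isn't representable here.
    Vec4 sa = { 100.0f, 80.0f, 0.4f, 2.0f };
    Vec4 sb = { 300.0f, 50.0f, -0.2f, 0.5f };
    Vec4 ns = toScreen(clipEdgeNear(toClip(sa), toClip(sb)));
    printf("new vertex (screen): %f %f %f, w=%f\n", ns.x, ns.y, ns.z, ns.w);
    return 0;
}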
 
This debate reminds me somewhat of the old CISC/RISC debates of a decade ago.

Each side was convinced theirs was going to win over the other and become the predominant architecture. Of course, what happened was that the two, while remaining separate and distinct within specific market segments, took the best of each other's concepts and applied them where required.

Will GPUs be replaced by completely generic processors?
Will GPUs become completely generic processors?

The common denominator here is the goal of being more generic, which is desirable. However, I see things a bit differently. I believe the ideal would be to have general purpose processors act as co-processors to traditional graphics pipelines (the reverse of the CPU - FPU/SIMD model).

There are very specific areas of graphics pipelines where general purpose functionality is required or even desirable. In other areas, massive fixed or configurable functionality is desirable.

Maybe if the performance ratio between fixed function and programmable units becomes small enough this will change; however, at that stage it may be worth keeping the fixed functionality in anyway.
 
DaveBaumann said:
Bottom line is that unless GPUs are general enough to sustain an OS, they're just classed as another co-processor, just like a programmable DSP co-processor that can be classed as a media engine/processor or a sound processor etc., and they're all converging towards CPU-like functions, just as GPUs are.

Obviously this takes performance into account. CPUs were traditionally used to do the vertex processing, but they quickly became outclassed by onboard graphics processing; that might shift back somewhat towards the newer CPUs soon (this year), but they are still not tuned for fragment processing to anywhere near the performance of a graphics processor.

AND

SiBoy said:
....
If anything, it reinforces the split between the CPU-like part (PU) and GPU-like part (SPU's). The GPU-like part (stream processor) currently isn't up to the task of anything more than perhaps the VS functionality, so enter Nvidia to fill in the rest.

This just restates the importance of complementary processors. For consumer graphics-intensive PCs/consoles/HW in general, we've reached the point of high integration into TWO major ICs: a CPU and a GPU, or a 'main' processor and 'co-processor' setup.

For Xenon, a 'CPU' and a 'stream processor, R500'.

For PS3 a 'stream processor, CELL' and another 'stream processor, NV5x'.

This has evolved from the early days of main processor = CPU and co-processor = FPU, when graphics was just simple raster output. Through silicon integration, SoCs, and putting the transistors where they're needed for 3D, we've reached where we are today.

This will continue to evolve as a PAIR of ICs in that sense, but the balance will shift. The terms 'general' and 'specialised' may come to mean different things, but the two will always be complementary.

In the future we may see Main proc = Stream processor and Co-proc = Physics/RayTrace processor. Or Main proc = low-latency processor and Co-proc = high-latency processor, etc. Or we may come full circle and have main proc = general CPU and co-proc = fully hard-wired processor! :p
 