Benchmarks - how hard is it to tell the % parallel execution of a GPU?

g__day

Regular
I for one would love a benchmark that could tell you how often, and on what workload / shader mix, ATi and NVidia GPUs are able to run their execution units at high parallelism.

Imagine a benchmark that reflects either current or possible future game workloads and tells you not only overall min and max fps, but also what load the GPU pipelines are operating at over time.

So, for instance, in Doom 3 on NV30 it might tell you that in timedemo 99 the GPU averaged 80% of its parallel processing units busy, indicating it's well suited to the game and the driver optimisations are great - whereas if ATi only achieved 35% maximum parallel loading of its GPU's internal execution units, that would tell you they have either an inherent architectural problem with the game or a driver optimisation issue that may be correctable.

Would such a thing be 1) possible (maybe only by ATi / NVidia, and then maybe only with a chip redesign to monitor the loading of GPU units), and 2) likely to be created outside ATi or NVidia by an independent trusted advisor?

Really, I want to know much more precisely how good or poor shader -> API -> driver optimisation is today, and how much potential for future improvement exists.
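
To make the kind of figure I'm after concrete, here's a minimal sketch (C++) of how a benchmark might boil per-frame samples of a hypothetical "execution units busy" counter down to a single average utilisation percentage. The FrameSample structure and the numbers are invented purely for illustration - real values would have to come from vendor hardware counters that aren't publicly exposed.

```cpp
// Sketch: averaging hypothetical per-frame "execution units busy" samples
// into the single utilisation figure discussed above. The counter values
// are invented for illustration only.
#include <cstdio>
#include <vector>

struct FrameSample {
    unsigned busyCycles;   // cycles in which execution units did useful work
    unsigned totalCycles;  // cycles the units were clocked during the frame
};

double averageUtilisation(const std::vector<FrameSample>& samples)
{
    unsigned long long busy = 0, total = 0;
    for (const FrameSample& s : samples) {
        busy  += s.busyCycles;
        total += s.totalCycles;
    }
    return total ? 100.0 * static_cast<double>(busy) / static_cast<double>(total) : 0.0;
}

int main()
{
    // Purely illustrative samples standing in for a timedemo run.
    std::vector<FrameSample> run = {
        {800000, 1000000},
        {750000, 1000000},
        {820000, 1000000},
    };
    std::printf("Average execution-unit utilisation: %.1f%%\n", averageUtilisation(run));
    return 0;
}
```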
 
*sigh* Hard to resist the temptation to speak of That Which Reminds Old Men of Their Age, because I think it would be interesting to contrast my discussion of it with your idea. I'm afraid that I'd be caned to death, though, so maybe tomorrow when I'm sure I could outrun the old fogies.

:p

Proxels RULE!!!
 
One thing I learned a long, long time ago: don't assume that because the hardware's busy all the time, everything's coded right. It may be that you programmed the hardware wrong!

To talk about how busy the hardware is, you largely have to decompose it into atomic components and analyse the bottlenecks, so it's a complicated job and the information required is all horrendously proprietary...
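
For illustration only, here's a rough sketch of that decomposition idea: given per-unit busy fractions for one frame, pick out the unit that's limiting it. The unit names and numbers are made up, since the real breakdown is exactly the proprietary bit.

```cpp
// Sketch of "decompose into atomic components and find the bottleneck":
// report the most heavily loaded unit for a frame. All figures are invented.
#include <cstdio>
#include <string>
#include <vector>

struct UnitLoad {
    std::string name;
    double busyFraction;  // 0.0 .. 1.0 over the frame
};

const UnitLoad* findBottleneck(const std::vector<UnitLoad>& units)
{
    const UnitLoad* worst = nullptr;
    for (const UnitLoad& u : units)
        if (!worst || u.busyFraction > worst->busyFraction)
            worst = &u;
    return worst;
}

int main()
{
    // Hypothetical per-unit figures for a single frame.
    std::vector<UnitLoad> frame = {
        {"vertex shader", 0.45},
        {"pixel shader",  0.92},
        {"texture units", 0.60},
        {"memory bus",    0.71},
    };
    if (const UnitLoad* b = findBottleneck(frame))
        std::printf("Likely bottleneck: %s (%.0f%% busy)\n",
                    b->name.c_str(), b->busyFraction * 100.0);
    return 0;
}
```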

"Really, I want to know much more precisely how good or poor shader -> API -> driver optimisation is today, and how much potential for future improvement exists."
As a (partial) joke I say 'so would we', and in deadly seriousness I say 'so would our competition'. :)
 
Dio,

I totally agree, it's proprietary (unless someone realises you should do this and figures out a really clever way of doing it on the silicon rather than in an emulator - although an emulator would work brilliantly).

A busy shader is meaningless unless you know the skill of the people writing the game, the API and the driver - but you could presume the driver and API writers are proficient, else they wouldn't have been hired.

Maybe driver writers should release emulators to allow game developers to work out how easily their code optimises to keep their chips busy?
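
As a purely hypothetical illustration of what such an emulator could report, here's a toy machine model (1 cycle per ALU op, a fixed invented stall per texture fetch, no latency hiding) that would let a developer compare how well two shader variants keep the ALUs fed. None of the costs correspond to any real hardware.

```cpp
// Toy stand-in for the kind of emulator suggested above: estimate ALU
// utilisation for a shader under an invented cost model, so two variants
// can be compared. All costs and counts are placeholders.
#include <cstdio>

struct ShaderStats {
    int aluOps;
    int textureFetches;
};

double estimatedAluUtilisation(const ShaderStats& s, int fetchStallCycles)
{
    int aluCycles   = s.aluOps;                             // 1 cycle each in this model
    int stallCycles = s.textureFetches * fetchStallCycles;  // assume nothing hides the latency
    int total       = aluCycles + stallCycles;
    return total ? 100.0 * aluCycles / total : 0.0;
}

int main()
{
    const int kFetchStall = 4;        // invented latency
    ShaderStats variantA = {20, 8};   // maths-heavy version
    ShaderStats variantB = {12, 12};  // lookup-table version
    std::printf("Variant A: %.0f%% ALU busy\n", estimatedAluUtilisation(variantA, kFetchStall));
    std::printf("Variant B: %.0f%% ALU busy\n", estimatedAluUtilisation(variantB, kFetchStall));
    return 0;
}
```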
 
I think it's moving away from that model a bit. Ideally, there should be a perfect compiler which takes some incredibly simple high-level language that anyone can understand and produces the ideal code for the hardware.

Optimising for individual bits of hardware can be really hard. The drivers should worry about most of it!
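
As a toy illustration of the kind of hardware-specific work a driver's compiler could hide, here's a sketch of a peephole pass over an invented IR that fuses a MUL followed by a dependent ADD into a single MAD. The IR, the opcodes and the safety rule (the ADD must overwrite the MUL's destination) are all simplifications made up for this example, not any real driver's internals.

```cpp
// Sketch of a peephole pass on an invented IR:
//   MUL r,a,b ; ADD r,r,c  ->  MAD r,a,b,c
// Requiring the ADD to overwrite the MUL's destination means the raw product
// can't be read later, so this toy version needs no liveness analysis.
#include <cstdio>
#include <string>
#include <vector>

enum class Op { MUL, ADD, MAD };

struct Instr {
    Op op;
    std::string dst, src0, src1, src2;  // src2 only used by MAD
};

std::vector<Instr> fuseMulAdd(const std::vector<Instr>& in)
{
    std::vector<Instr> out;
    for (size_t i = 0; i < in.size(); ++i) {
        if (i + 1 < in.size() && in[i].op == Op::MUL && in[i + 1].op == Op::ADD &&
            in[i + 1].dst == in[i].dst && in[i + 1].src0 == in[i].dst) {
            out.push_back({Op::MAD, in[i].dst, in[i].src0, in[i].src1, in[i + 1].src1});
            ++i;  // skip the ADD we just absorbed
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}

int main()
{
    std::vector<Instr> shader = {
        {Op::MUL, "r0", "a", "b", ""},
        {Op::ADD, "r0", "r0", "c", ""},
    };
    std::vector<Instr> optimised = fuseMulAdd(shader);
    std::printf("%zu instructions -> %zu instructions\n", shader.size(), optimised.size());
    return 0;
}
```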
 