Texture fetch and shader math execution time in special cases

Testing RCP poses much the same problem. How do you write a dependent RCP shader, using just a single register R, that the compiler can't simplify down to (#RCPs mod 2) RCPs? Since RCP(RCP(x)) = x, any pure dependent chain collapses.

I can't :cry:
Since RCP is a scalar instruction, I think one could use more input slots to make the shader long enough, then infer the 'real' issue rate.
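
For illustration, here is a minimal sketch of the dependent-chain trick in modern CUDA terms (not the era's fragment-shader path, and not GPUBench's actual code; names, sizes, and constants are made up). Each reciprocal depends on the previous result, and the added constant keeps the compiler from collapsing RCP(RCP(x)) back to x:

#include <cstdio>
#include <cuda_runtime.h>

__global__ void rcp_chain(float* data, int iters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float x = data[i];
    for (int n = 0; n < iters; ++n)
        // Dependent scalar RCP; the "+ 1.0f" blocks the RCP(RCP(x)) == x
        // simplification and keeps the value away from division by zero.
        x = __frcp_rn(x + 1.0f);
    data[i] = x;   // write back so the chain isn't dead code
}

int main()
{
    const int N = 1 << 20, ITERS = 1024;
    float* d;
    cudaMalloc(&d, N * sizeof(float));
    cudaMemset(d, 0, N * sizeof(float));

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);
    cudaEventRecord(t0);
    rcp_chain<<<N / 256, 256>>>(d, ITERS);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("%.3f ms for %lld dependent RCPs\n", ms, (long long)N * ITERS);
    cudaFree(d);
    return 0;
}

Sweeping the iteration count against the number of threads in flight separates dependent latency from issue rate, which is in the spirit of the 'more input slots' suggestion above.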
 
Simple answer: it's not possible to reliably measure certain architectural aspects without very low-level access to the hardware.
In some cases one can work out guidelines, but no one can really claim to know for sure what the hardware is doing.
 
The main point of GPUBench is to give you a mental model of the performance aspects of your code when writing GPGPU applications. Since those applications use the same paths that GPUBench does, I'd argue it's a fair benchmark suite for that application domain, and for others that use similar paths. It's been really helpful in designing GPGPU applications for both ATI and Nvidia and in identifying hardware oddities, which led to lengthy discussions with architects and driver folks at both companies. ;-)

But yes, some things are *very* difficult to measure. Even readback rates are extremely chipset dependent, and driver revisions will change results. Lower-level access is a great thing for the type of work I do.
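
As a concrete example of what a readback measurement looks like, here is a hedged CUDA-style sketch (the original work would have gone through something like glReadPixels; the buffer size and the use of pinned memory are arbitrary choices that themselves change the answer):

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const size_t bytes = 64ull << 20;   // 64 MiB transfer, size is illustrative
    float* d;
    float* h;
    cudaMalloc(&d, bytes);
    cudaMallocHost(&h, bytes);          // pinned memory; pageable is usually slower

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);
    cudaEventRecord(t0);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);   // the readback
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("readback: %.2f GB/s\n", (bytes / 1e9) / (ms / 1e3));

    cudaFreeHost(h);
    cudaFree(d);
    return 0;
}

Pinned versus pageable memory, transfer size, chipset, and driver revision can each move this number substantially, which is exactly the variability described above.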
 
The main point of GPUBench is to give you a mental model of the performance aspects of your code when writing GPGPU applications. Since those applications use the same paths that GPUBench does, I'd argue it's a fair benchmark suite for that application domain, and for others that use similar paths. It's been really helpful in designing GPGPU applications for both ATI and Nvidia and in identifying hardware oddities, which led to lengthy discussions with architects and driver folks at both companies. ;-)
I'm not disputing that; I'm disputing some data extrapolated using that suite. I wish I could elaborate on this... :)
 
Well, when you can chat about this, we can take it offline. PM me. We'd actually like to hear where you think the results were misleading and how you came to your conclusions. For much of our work, the model has been dead on as a starting point for optimizations and algorithm design, and it has also helped explain fundamental differences between ATI and Nvidia hardware. That being said, GPUBench is really set up for GPGPU stuff.
 
When it expires or is lifted, I meant. Or you can send it via Nvidia or ATI dev rel (depending on who you're working with, obviously ;-)) around to us at Stanford. We know people in both groups well.
 
When it expires or is lifted, I meant. Or you can send it via Nvidia or ATI dev rel (depending on who you're working with, obviously ;-)) around to us at Stanford. We know people in both groups well.

The problem is that his NDA never expires.
PS3 devs have considerably more detail on NV7x than any PC dev.

These benchmarks are useful for what they purport to do, but without knowing the internal restrictions of the architectures, you're mapping the results onto faulty assumptions.

I've yet to see anyone make correct assumptions about how even texture reads work, though the common misconception of one cycle for bilinear and two for trilinear is accurate enough for most planning.

My experience is that the actual hardware constraints are more complex and more interrelated than most of the assumptions I see people making.
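
To make the texture-read point concrete, here is a hedged sketch in modern CUDA (again not GPUBench's code; the texture size, thread counts, and iteration count are illustrative) that times a dependent chain of fetches under point versus bilinear filtering. Trilinear would additionally need a mipmapped array and is omitted for brevity:

#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void fetch_chain(cudaTextureObject_t tex, float* out, int iters)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    float x = 0.0f;
    for (int n = 0; n < iters; ++n)
        // Each coordinate depends on the previous fetched value, so the
        // reads serialize; this times fetch latency, not peak throughput.
        x = tex2D<float>(tex, 0.5f + x * 0.25f, 0.5f + x * 0.25f);
    out[tid] = x;
}

static float time_mode(cudaTextureFilterMode mode)
{
    const int W = 256, H = 256;
    std::vector<float> zeros(W * H, 0.0f);   // texel values only feed coordinates

    cudaChannelFormatDesc fmt = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    cudaMallocArray(&arr, &fmt, W, H);
    cudaMemcpy2DToArray(arr, 0, 0, zeros.data(), W * sizeof(float),
                        W * sizeof(float), H, cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.filterMode = mode;                    // the only thing we vary
    td.addressMode[0] = cudaAddressModeClamp;
    td.addressMode[1] = cudaAddressModeClamp;
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 1;

    cudaTextureObject_t tex;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);

    float* out;
    cudaMalloc(&out, 256 * 256 * sizeof(float));

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);
    cudaEventRecord(t0);
    fetch_chain<<<256, 256>>>(tex, out, 4096);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(out);
    return ms;
}

int main()
{
    printf("point:    %.3f ms\n", time_mode(cudaFilterModePoint));
    printf("bilinear: %.3f ms\n", time_mode(cudaFilterModeLinear));
    return 0;
}

Even this simple comparison conflates cache behaviour, latency hiding, and filter cost, which is exactly the kind of interrelated constraint being described.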
 
Agreed. Although, for many applications, the benchmarks have been very useful to us. But also remember that GPGPU uses no texture filtering, MSAA, etc.; we actually use an extremely small subset of the hardware functionality. I think we understand very well how the memory system works for GPGPU, much of it based on the benchmarks and on conversations with the vendors. Obviously you can misinterpret a graph, the test may not be doing what you think it is, etc.

In the end, there is nothing as good for benchmarking as the app you want to actually run. :devilish:

(And as a side note, threads that really spark good conversation are one of the reasons I really like Beyond3D. And there are people here who do this stuff for real (research or profit).)
 
... And there are people here who do this stuff for real (research or profit).

Sure, never doubted this (in case that was aimed at me ;-)). I am doing research, but at a slightly higher level. The low-level stuff is fairly interesting to me as well, though, and might be relevant for me in the future... The reason I started the thread was that I observed some weird behaviour in the technique I developed and wanted to find an explanation. So thanks to everybody contributing - even if I only understand half of it. :LOL:
 
mhouston, perhaps you should get in touch with Eric Lengyel

Eric Lengyel on gamedev.net forums said:
A while ago, I did an obscene amount of reverse engineering of the NV4x driver.

[snip]

I was able to learn enough through reverse engineering that I wrote a low-level RSX driver for the PS3 which is now in use by Sony first-party game studios.

The original thread: http://www.gamedev.net/community/forums/topic.asp?topic_id=408512

Perhaps someone can invite him to join this discussion; I wouldn't mind more details. :)
 