Papers from Graphics Hardware 2006

Nice slideshow demonstrating the value of having a high-bandwidth connection between the CPU and GPU because of the need to both construct complex data structures and use them for rendering. Of particular interest is the discussion of a new programming model that embraces both types of hardware. Great post!
 
I thought it was a load of hot air. If he'd bothered to think about what D3D10 does he'd realise he's completely jumped the gun.

Jawed
 
I thought it was a load of hot air. If he'd bothered to think about what D3D10 does he'd realise he's completely jumped the gun.

Jawed

What do you mean, precisely? I watched the slideshow, and he references D3D10 several times ... right?
 
Wow, yeah, ok, now I understand what you mean! :rolleyes: :LOL:

I'd love to hear how D3D10 either avoids the problems that are claimed, or that the problems don't actually exist. Even a link to the relevant discussion would be appreciated.
 
I'd love to hear how D3D10 either avoids the problems that are claimed, or that the problems don't actually exist. Even a link to the relevant discussion would be appreciated.

I'd be interested to hear more detail about what you're saying as well. I thought a lot about D3D10 while working on the talk, and I don't see many places where it has an effect on the main arguments, other than geometry shaders finally making it possible for the GPU to essentially issue commands to itself, which it couldn't do before. Otherwise I don't see a lot of impact from D3D10.

Anyway, I'd be interested in hearing more details about what you're getting at.

thanks,
-matt
 
Streamout and render to vertex buffer (R2VB). Search the 3D Technology & Hardware forum. Also search for D3D10 in the same place.

http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/Direct3D10_web.pdf

Jawed

Yes, I'm quite familiar with what D3D10 is and what those features are.

Stream out from the GS is great for things like building data structures of indeterminate length, though you basically need to output the data in order. In contrast, being able to do random writes, whether to local store on an SPU while building a data structure or from a CPU, makes a lot more things possible.

As a simple example, can you outline how you'd propose to build an adaptive data structure like a quadtree from a GS? Or a hash table with linked lists?
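
(To make concrete the sort of thing I mean by the latter, here's a minimal sketch, purely illustrative and not from the talk, of a chained hash table built the way a CPU, or an SPU working in its local store, would build it: every insert is a write to a data-dependent address, with no fixed output order that a stream-out pass could produce.)

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// One entry in a per-bucket linked list.
struct Node {
    uint32_t key;
    float    value;
    Node*    next;
};

struct HashTable {
    static const size_t kBuckets = 1024;
    Node* buckets[kBuckets];

    HashTable() { memset(buckets, 0, sizeof(buckets)); }

    void insert(uint32_t key, float value) {
        size_t b = key % kBuckets;                        // arbitrary bucket => scattered write
        Node* n  = static_cast<Node*>(malloc(sizeof(Node)));
        n->key   = key;
        n->value = value;
        n->next  = buckets[b];                            // hook in front of the existing chain
        buckets[b] = n;                                    // pointer update at a data-dependent address
    }
};
```

That last pointer update at a data-dependent address is exactly the kind of scattered write that in-order stream output doesn't give you.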

thanks,
-matt
 
If you're going to make the case that the current programming model for GPUs does not suit the future co-operation of CPUs+GPUs, then you first have to show why D3D10 isn't the man for the job. Why the hell should I write the presentation for him?

And our newly resident D3D10 expert (who prolly wrote the presentation, sigh, but hides behind initials) also seems incapable of explaining why it's no use, either. Completely ignoring R2VB, which effectively supports random writes, and ignoring that D3D10 supports writes to multiple streams, not just one.

Jawed
 
Matt, my apologies, I didn't see your first posting in this thread where you explained that you gave the talk :oops: You clearly weren't hiding your identity. Sorry.

I still contend that streamout and R2VB (or just plain writing to a render target from pixel shaders in the usual GPGPU style) will cover the gamut of the data structures required to implement graphics algorithms. You are, after all, trying to build data structures that are meaningful to the GPU itself.

Jawed
 
What I find interesting about it is that in part this describes exactly what is already being done, e.g. in CoD3 (PS3 version), where the CPU will be used (according to the devs) for much better lighting and shadow mapping by applying multiple lights via CELL. That seems to be a first implementation of the closer CPU-GPU relationship offered by next-gen consoles (esp. the PS3).
 
Isn't the paper rather low-balling the current PC GPU-to-graphics-memory speeds? Even at the beginning of 2006 it was better than 30GB/s. Not to mention what it will be in about three weeks.

Edit: Oooooh. *That* Matt Pharr.

Heya, Matt; kinda startling to see the editor of GPU Gems 2 staking out that kind of ground. At least it was to me. :smile:

But welcome to Beyond3D, and feel free to smack Jawed around some more. ;) Gentlemanly so, of course!
 
This is too rich

Did Jawed just call a quite intricate paper by Matt Pharr a load of hot air?

:LOL: I love this forum, I swear.

Oh, and welcome on board, Matt.
 
Isn't the paper rather low-balling the current PC GPU-to-graphics-memory speeds? Even at the beginning of 2006 it was better than 30GB/s. Not to mention what it will be in about three weeks.

Slide #27 or so quotes the GPU-to-graphics-memory number as currently around 30GB/s--the low 1GB/s number is the bandwidth people are seeing from GPU to main memory (and back). (And while PCI-E promises the potential of 4GB/s there, no one has seen anything near that in practice so far...) So it's that big bandwidth shortcoming that prevents the CPU from doing much other than just blindly sending stuff to the GPU on the PC today...
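
To put rough numbers on it (my own back-of-the-envelope, using the figures above and assuming a 60Hz frame rate):

```cpp
#include <cstdio>

int main() {
    // Bandwidth figures quoted above; the 60fps frame rate is my assumption.
    const double bus_gbps   = 1.0;   // observed GPU <-> main memory bandwidth
    const double local_gbps = 30.0;  // GPU <-> on-board graphics memory
    const double fps        = 60.0;

    std::printf("over the bus:    %6.1f MB/frame\n", bus_gbps   * 1024.0 / fps);  // ~17 MB
    std::printf("to local memory: %6.1f MB/frame\n", local_gbps * 1024.0 / fps);  // ~512 MB
    return 0;
}
```

So any scheme where the CPU wants to touch a meaningful fraction of the GPU's working set every frame hits that ~17 MB ceiling long before it runs out of FLOPS.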

But welcome to Beyond3D, and feel free to smack Jawed around some more. ;) Gentlemanly so, of course!

Long time listener, first time caller. :D Always happy to have an online discussion, especially when it starts with me being called full of hot air. :D

-matt
 
Hey, welcome to the board, Matt Pharr. Don't take the negative comments too seriously; the crowd gets wild when you insult their favorite coprocessor! ;)

Anyway, I think you're spot on, but the presentation is severely wrong in terms of timeframes, although perhaps I'm just not getting the right impression of when you think this is going to become important. Simply put, in the x86 space, I'm skeptical we will see this sudden zomg increase in FLOPS.

And secondly, I fail to see many algorithms using complex data structures that might become highly relevant to actual GPU rendering, and not just GPGPU (where, of course, more CPU flops and better interconnects would have huge advantages) - but then again, you're in a much better position to know about those than we are! :)

You mention quadtrees for shadowing, for example. I'm a tad skeptical you're going to get any decent advantage from those compared to, say, cascaded shadowing techniques. You might be thinking of something else than I am, of course. I am very skeptical we are going to have a good reason to do that kind of stuff for rendering in the next 5 years, but of course, I'd LOVE to be proved wrong by an innovative idea/technique. Is there anything specific you're thinking of that you could mention and explain in a tad more detail?

In the world of GPGPU, I agree there are huge possible rewards from a decent CPU you can interact with at good speed. Taking the example of raytracing, for outdoor environments, a basic scheme would use quadtrees for the world and octrees for each model. Now, imagine if you had an animated model and wanted to build an octree for all the triangles in it every frame after the GPU animated it via R2VB... Oops, good luck! ;)
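
Just to spell out what that per-frame loop would have to look like (a purely illustrative sketch; the function names are hypothetical stand-ins, not any real API):

```cpp
#include <vector>

struct Vec3 { float x, y, z; };
struct Octree { /* adaptive subdivision of the model's triangles */ };

// Hypothetical stand-ins for the real work -- not any actual API:
void animateOnGpuViaR2VB() { /* GPU skinning pass writes into a vertex buffer */ }
void readBackVertices(std::vector<Vec3>& out) { /* copy that buffer to system RAM */ }
void rebuildOctree(Octree& tree, const std::vector<Vec3>& verts) { /* CPU pointer-chasing build */ }

void frame(Octree& tree, std::vector<Vec3>& scratch) {
    animateOnGpuViaR2VB();         // cheap: stays in on-board graphics memory
    readBackVertices(scratch);     // expensive: crosses the ~1GB/s bus every frame
    rebuildOctree(tree, scratch);  // the random-access build the GPU can't express
}
```

Every variant of that loop either pays the per-frame readback or tries to express the adaptive build with in-order stream output, which is where the "good luck" comes in.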

In D3D10, you could use the GS to hack it a bit, to a certain extent. But in practice, it doesn't make any difference. It still doesn't allow you to do absolutely everything, and much of the time, if you can do something, the end result might not be too pretty. The resource limitations are very real, and if the rumours are to be believed, the performance will be catastrophic for remotely complex usage scenarios. And even if it weren't, what you're fundamentally doing is using the GS to program the GPU as a serial processor, guaranteeing not-so-great performance even if the hardware were efficient at it, imo. So, is there some potential there? Sure. Is it even worth seriously thinking about? Not really.

So overall, I think you're right, as I said; it's just that your presentation seems to imply this is going to make sense really soon and nearly would today, when in fact I'm a tad skeptical it does anywhere but on the PS3, where it makes "OK" sense and has a few interesting possibilities (although some of those still allow you to see the graphics pipeline as a one-way thing; if your goal is proper GPU utilization via smart algorithms, the sky is the limit, definitely). But where things will really begin moving towards that paradigm is the next-next-gen console systems (PS4/XBoxC/etc.).

What's interesting is that these systems will most likely be based on the desktop D3D11 GPUs, imo, if we think about it in terms of the necessary timeframes. So there might be a few twists still to come that nobody has thought of yet, except perhaps the few architects already working on these products right now, as we speak and dream. I'd tend to believe we'll be very pleasantly surprised, but it's much too early for me to dare take a guess, no matter how much I'd like to! :)


Uttar
 
For the future of CELL<->GPU development (aka PS4), would geometry shaders on the GPU be redundant compared to a multiplicity of SPUs in a future iteration of CELL?

Just thinking aloud here, as the SPUs seem like very flexible GSs... well, by the time of the PS4, GSs would also be more flexible, which is why I see a redundancy there...
 