R500: A GP GPU? So what will it be used for...

Acert93

It has been mentioned that the R500/Xenos is a GPGPU, and that it can read from and write to main memory. And with a programmable Unified Shader architecture it sounds pretty flexible.

It came out in Fall of 2004 that there was a movement to use NV GPUs for advanced sound processing (example link). So it already seems that modern GPUs can do more than their primary design intends.

As the R500's core technology is bound to find its way into future GPUs (it seems destined for ATI's Longhorn part), it makes me wonder: What will a more advanced, and flexible, GPU mean for the market?

Any suggestions/ideas?


Don't shoot me, but something Ageia said stuck with me. They had mentioned that there was no doubt that ATI and NV could do something similar if they wanted, and even hinted (at least it seemed like a hint to me) that this may very well be the case.

Could a flexible programmable GPU, especially one where the ALUs are unified and general purpose, be used for physics? How well would it perform at this task? Marrying graphics with physics would seem to be a good match at some levels.
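
Just to make the idea concrete (a toy illustration of my own, not real shader code, and all the names are made up): a lot of game physics is the same tiny update applied to thousands of independent particles, which seems to be exactly the per-element pattern a wide array of shader ALUs is built for.

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

struct Particle {
    Vec3 pos;
    Vec3 vel;
};

// One explicit Euler step applied independently to every particle.
// Each element only reads and writes its own data, so the loop body
// is the kind of kernel that could run one particle per "vertex" or
// "pixel" across many ALUs at once.
void integrate(std::vector<Particle>& particles, const Vec3& gravity, float dt)
{
    for (std::size_t i = 0; i < particles.size(); ++i) {
        Particle& p = particles[i];

        p.vel.x += gravity.x * dt;
        p.vel.y += gravity.y * dt;
        p.vel.z += gravity.z * dt;

        p.pos.x += p.vel.x * dt;
        p.pos.y += p.vel.y * dt;
        p.pos.z += p.vel.z * dt;
    }
}
```

The tricky part would presumably be interactions between particles (collisions, constraints), where elements suddenly need to read and write each other's data rather than just their own.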

From a market standpoint I think it would be a great idea. Instead of a PPU which serves a very limited purpose, imagine a video card with 2 GPUs (which, with all the SLI/Crossfire talk these days, seems very doable). In games that have heavy physics one of the GPUs could do graphics rendering while the other does physics. In games and apps where heavy physics is not needed, both GPUs could focus on graphics processing. This idea sounds like a great use of silicon. But is it possible? Is this a new direction we could see NV and ATI take?

Another possibility is sound. Would GPUs make good sound cards for gaming?


So, any ideas what a technology like R500 could/would be used for outside the normal graphics processing tasks? It would seem the more tasks a GPU does well, the more important and vital it would be to the system. And more important parts often can ask for more money and get higher market penetration. I believe the days of spending heavily on a CPU over a GPU are over (I flipped my $400/$200 model in favor of GPUs about two years ago). Could the GPU be making more inroads on typical CPU tasks, further making the GPU an even more important part of any computer system?

The possibilities are interesting. So what is everyone else's take?

/Me ducks for mentioning the "bad word" in my post...
 
Theoretically Xenos could be a true vector co-processor, and any op that requires plenty of vector processing could potentially benefit from a processor such as Xenos. Exactly how useful it will be is up in the air at the moment, since all these things would still have to be structured within the confines of a vertex shader program. At the very least, Xenos or processors like it will probably make for some very nice science experiments and get the GPGPU crowd more than interested; whether it will be adopted for this type of processing in real time on the XBOX 360 is too early to tell.
 
Thanks Dave :D Don't let anyone rib you about your short replies, that was the exact type of explanation and information I was looking for. On the Xbox I would think (could be wrong) that most developers would use the GPU for graphics to keep their games graphically appealing and competitive (there may be a few strays here and there of course), but with this technology coming to the PC I see a lot of potential. Anyone can goof around on a PC, so that is pretty exciting stuff!

Btw, I will be sending you a dry cleaning bill for all the drool on my work shirts from the long anticipated wait for the Xenos article you have been working on ;)
 
Having done a fair amount of GPGPU work myself, I'd guess that Xenos/XB360 will have little impact on that crowd. For most of the problems encountered at the moment, the lack of random write and the API overhead are the killers. Assuming the hardware supports random write, these are both API problems, and I don't see them being resolved. Microsoft was very unreceptive to people's attempts last time around. Hacking the Xbox to play video is one thing; reverse engineering drivers to open a whole new API is another. Even if people managed to provide such a thing, it'd take a lot of time. And even if it got done in a reasonable timeframe, the fact that it's potentially violating at least some US laws would mean that no one would risk the effort to try to publish a paper on it.
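
To make the random-write point concrete (a toy CPU-side illustration of my own, nothing to do with the actual API): a pixel shader can gather from computed input addresses, but each invocation can only write to its own fixed output location. Scatter, where the destination index is computed, is the missing piece.

```cpp
#include <cstddef>
#include <vector>

// Gather: each output element reads from a computed input address.
// This is what a pixel shader can already do via dependent texture reads.
void gather(std::vector<float>& out,
            const std::vector<float>& in,
            const std::vector<std::size_t>& idx)
{
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = in[idx[i]];        // write location is fixed (i)
}

// Scatter: each input element writes to a computed output address.
// This is the "random write" that the hardware/API has to expose;
// without it you end up faking it with extra rendering passes.
void scatter(std::vector<float>& out,
             const std::vector<float>& in,
             const std::vector<std::size_t>& idx)
{
    for (std::size_t i = 0; i < in.size(); ++i)
        out[idx[i]] = in[i];        // write location is computed
}
```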

My guess is that the Cell-powered PC IBM is promising in 12 or however many months will probably appeal to people much more. Since most of the GPGPU crowd is in academia, they aren't really willing to invest the energy to hack a hardware platform just to net what is perceived as a constant-factor speedup over another method.

As far as GPUs for physics and sound, I'd venture to say the hardware is capable, but very sub-optimal. The problems encountered in each of those are vastly different from graphics once you get down to the usage patterns, which affect the circuitry and cache layout. I'd think it could do both, but you probably won't see either one in this hardware generation.
 
squarewithin said:
And even if it got done in a reasonable timeframe, the fact that it's potentially violating at least some US laws would mean that no one would risk the effort to try to publish a paper on it.

What if you don't live in the US though?
 
Well, the XB360 will have its own API, so if they want they will be able to get a very nice GPGPU, right? :?:

From the beginning I thought that they would offload a lot of work (kind of) from the SPUs of Cell, i.e., they would use a great GPGPU to match Cell, and we already have half.

squarewithin said:
And even if it got done in a reasonable timeframe, the fact that it's potentially violating at least some US laws would mean that no one would risk the effort to try to publish a paper on it.

I don't see why MS would not want to see people push their tech, and if they don't, why? (Unless that gave ammo to their rivals.)

And in the case of the XB360 I see even less reason why they would not want that, especially if they do it in a closed circuit of XB devs (better than those who had the XB360 spec).

Thanks
 
Pity you'll probably have to fork out $$$$$ to do anything, since you will have to buy development hardware to do all the coding on; as well, I'm sure the Xbox 360 retail version will only run signed code.

Yay for DRM.
 
Charmaka said:
What if you don't live in the US though?

I dunno, might happen, but I'd bet money not. It's just my feel for what papers are considered valid research. It's dicey enough stuff that people might just not want to deal with it.
 
pc999 said:
Well the XB360 will have their own API so if they want they will be able to get a very nice GPGPU, Right :?:

Depends on whether it's exposed. It might be limited through the API like it is currently. Like many features on the card, just because it's there doesn't mean that it's necessarily fast or usable.
 
Xenos has a function called "MEMEXPORT" that facilitates random access read/write to memory, which AFAIK is / will be exposed through the XBOX 360 API.
 
DaveBaumann said:
Xenos has a function called "MEMEXPORT" that facilitates random access read/write to memory, which AFAIK is / will be exposed through the XBOX 360 API.


We have the HW (ALUs), random access read/write to (main) memory, and support from the API...
So we only need a nice vertex shader program to get great GPGPU from Xenos :D. (In these moments I get a violent instinct against MS/ATI and their non-use of Fast14 tech :devilish: :devilish: :devilish: )
 
DaveBaumann said:
Xenos has a function called "MEMEXPORT" that facilitates random access read/write to memory, which AFAIK is / will be exposed through the XBOX 360 API.

Boy, it would sure be nice to know about this neat innovative technology. I wonder where I could read up on it some more ;)

I am not sure how much the Xbox 360 GPU will be exploited for these features, what excites me is the fact that the Xenos technology will be coming to the PC.

On that end, it was mentioned that it may not be as effective/efficient as a dedicated chip for a task like, say, physics.

But I would be curious to discover how effective it is. If it were 30% as efficient, per transistor, as a dedicated chip, I would say that is an excellent trade-off. I would take

2 GPUs

over

1 GPU + 1 PPU

The reason being that not all games and applications will have a use for the PPU. 2 GPUs may not be as powerful, but they would be more versatile. E.g. in games with meager physics both GPUs could be used a la SLI/CrossFire. In a workstation, a unified shader architecture would be very efficient at high-poly mesh work. In a physics-heavy game the 2nd GPU could be used as a physics processor. And in other vector-heavy apps (audio?) there may be other tasks it could excel at.

Kind of the dedicated chip vs. general purpose chip debate. But if GPU makers are considering putting a physics processor on some of their video cards, the possibility of a 2nd general purpose GPU that can further accelerate either the graphics OR the physics would be a great innovation in my book.

I guess over time, as the architectures of GPUs become more programmable and general purpose, we may see them used in new areas. Of course I am speculating, but it is an interesting development.

Well, time to change another shirt while I wait for Dave's killer Xenos article!

PS: Dave, are there any special features MEMEXPORT could be used for in the graphics department? I am not a 3D artist/programmer, but if you could output and save work you have already done and recall it later to be manipulated, could that not be a perk in some areas?

I think of some of the big hurdles ahead with lighting, shadowing, particles, etc... and wonder if something like this could help in an area like shadowing.
 
Acert93 said:
Thanks Dave :D Don't let anyone rib you about your short replies. . .

Hey, with love in our hearts. :D

Unless of course R520 has 32 pipes.







;)
 
The problem isn't so much in providing 'random' memory reads and writes as in providing coherent memory reads and writes. If the API doesn't require that what one pipe is writing into its cache shows up in another pipe's cache, there shouldn't be that much of a hardware problem. Just convert the texture cache into a true read/write cache and provide instructions addressable through linear addresses rather than 'texture coordinates'. I wonder what ATI has implemented and when it will be disclosed (if ever).
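
As a toy illustration of the addressing point (my own example, not anything ATI has disclosed): today's GPGPU code has to fold a flat array index into 2D texture coordinates and back by hand, which is exactly the bookkeeping that linear addressing would remove.

```cpp
#include <cstddef>
#include <utility>

// Pack a flat array index into the (x, y) texel address of a
// width-by-height texture used as storage, and unpack it again.
// Memory is only reachable through 'texture coordinates', so every
// GPGPU kernel carries this translation around with it.
std::pair<std::size_t, std::size_t>
index_to_texel(std::size_t index, std::size_t width)
{
    return { index % width, index / width };
}

std::size_t
texel_to_index(std::size_t x, std::size_t y, std::size_t width)
{
    return y * width + x;
}
```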
 
But there is no guarantee, in general, of the order in which such data will be written. Today, the only reliable way is to synchronize on frame end. No algorithm that reads random texel values as they are being computed in the same backbuffer is going to be reliable unless some kind of synchronization primitives are provided (e.g. 'wait for region x/y to complete'), and of course those will kill performance.

The other technique, of course, is to represent the data not as random access memory but as a random access FIFO, or tuple-space. Have pipelines compute values and stuff them (in any order) into a queue. Then have shaders able to 'take' any computed value from this queue. Of course, if you don't get the value you are looking for in a reasonable time, you're in trouble.
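
To spell out the tuple-space idea (a deliberately naive, single-threaded sketch; real hardware would back it with queues in memory and many producers/consumers, and the names here are made up):

```cpp
#include <deque>
#include <optional>

struct Result {
    int   key;      // which work item this value belongs to
    float value;    // the computed value
};

// Producers push results in whatever order they finish.
// Consumers "take" a result by key; if it is not there yet they
// either do other work or stall -- the failure case mentioned above.
class TupleQueue {
public:
    void put(Result r) { items_.push_back(r); }

    std::optional<Result> take(int key)
    {
        for (auto it = items_.begin(); it != items_.end(); ++it) {
            if (it->key == key) {
                Result r = *it;
                items_.erase(it);
                return r;
            }
        }
        return std::nullopt;   // not computed yet: caller must wait
    }

private:
    std::deque<Result> items_;
};
```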
 