An Anniversary approaches, but will there be flowers and candy?
Which anniversary you ask, and why does it deserve a thread on B3D, let alone GPGPU?
Why, the Anniversary of the Nvidia GeForce G80 GPU, of course. In six weeks, it will be one year old (on November 8th, to be precise).
Personally, I gave up a Radeon X1900XT the week G80 was released to go with Nvidia's new beast. Generally, I've not been disappointed with that decision, with one fairly major exception. And, yes, now we're finally to the GPGPU part, to wit:
Where the heck are our Folding@Home-capable Nvidia DX9 drivers? F@H, aside from being meant to help humanity cure various diseases, happens to be --by far-- the premier and most widely available GPGPU-capable application around. Why hasn't Nvidia done what it takes to get this working? How much longer are we going to have to wait? Have we reached the point where it is valid to ask whether Nvidia may NEVER provide drivers (or a client, their choice) that are Folding@Home capable for the G8x series?
Personally, I think at 46 weeks and counting I've been pretty patient on this question. My patience is starting to wear thin, however.
ShaidarHaran
29-Sep-2007, 19:59
I don't think G80 will ever fold, unless a 3rd party takes interest and develops drivers/modifies the FAH GPU client to do so. My understanding of the matter is that X19x0 is still a superior folder anyway, due to greater general GFLOP throughput in the shaders.
Would be nice to get HD2k folding as well, but no such luck so far.
The GPU client hasn't received any attention since intro because the man who was entirely in charge of it is no longer with Pande Group. Fortunately they do now have a team working on improving the GPU client, moving some of the improvements made to the PS3 client back to the GPU client.
silent_guy
29-Sep-2007, 20:19
My understanding of the matter is that X19x0 is still a superior folder anyway, due to greater general GFLOP throughput in the shaders.
For make benefit rest of us readers, would you care to elaborate your understanding about general GFLOP throughput into a more in depth explanation?
Tim Murray
29-Sep-2007, 21:00
What you're seeing is the problem caused by a lack of a cross-platform GPGPU API. If Peakstream had supported compiling to PTX and CTM, you would probably see a G8x, R5x0, and R6x0 client in one binary by this point. But if you're going to go through 3D APIs, it's a pisser to get everything working, much less keep it working with future driver updates. And then you could go through CUDA or CTM, although that limits you further to one vendor.
I wish somebody would write a PTX to CTM compiler so the CUDA toolchain (which is pretty reasonable) could be used with AMD cards (or at least R600, since you need some form of shared memory for PTX). I certainly don't think the new AMD API will be any better on this front, even if it's designed to compile things that can run on multicore CPUs or GPUs (which is a goal I don't even really agree with in the first place).
But if you're going to go through 3D APIs, it's a pisser to get everything working, much less keep it working with future driver updates.
Yeah, how about Nvidia provide ONE DX9 driver that meets the DX9 spec as a start. Or point the finger of shame at the F@H client for breaking the spec. As in the end it's that simple, isn't it --one of them is out of spec. Right or wrong?
I'm beginning to wonder if Nvidia has purposely made a decision to not let G80 work with a DX9 GPGPU path.
Tim Murray
29-Sep-2007, 22:53
Yeah, how about Nvidia provide ONE DX9 driver that meets the DX9 spec as a start. Or point the finger of shame at the F@H client for breaking the spec. As in the end it's that simple, isn't it --one of them is out of spec. Right or wrong?
The spec isn't as clear-cut as you'd like it to be. Remember, nobody has hardware that produces an identical image to refrast, simply because it doesn't matter for D3D9. Shoehorning a GPGPU app into that framework is borderline insane, and I wouldn't be surprised if a lot of the things that the F@H guys have to do to get it to run at all on R580 would cause it to break on G80 and vice-versa. You're basically looking at device-specific codepaths written in D3D9. Then again...
I'm beginning to wonder if Nvidia has purposely made a decision to not let G80 work with a DX9 GPGPU path.
... this wouldn't surprise me in the least.
No, I think Mike Houston would demur that they are after device-specific codepaths. This is the argument for using an industry-standard API in the first place.
I do understand the problems with using a gaming API for GPGPU, and I think long-term we'd all like to see a different industry standard API just for GPGPU. . .but at nearly a year I'm less willing to believe that this result is anything but Nvidia just not wanting to play in this space right now. . . and I think that a major pity.
mhouston
29-Sep-2007, 23:43
We are actually actively working on this. We just don't like to get hopes up too high, so we don't say much unless we have progress.
We have only a few things working reliably on R5XX and that is what is in the wild. We started a new push to get more science cores up, but ran into issues on all hardware. We have stripped the code down to a test harness for easier testing. We have found multiple issues. Some are related to FXC, the Microsoft DX compiler, and some are related to driver compiler wonkiness and other driver-related corner cases. We are doing our best to get through things, but as you can see it's a long process. I can tell you that ATI boards and Nvidia boards for the newer code paths produce different results from each other, and different from reference.
The main goal is to get the GB path, which has been working great for the FAH guys on the PS3, ported across and working on the GPU. It's more complex than what is currently shipping.
It would be great if there was an industry standard. There isn't, so we use Brook for FAH. However, we can target CTM with Brook since ATI/AMD was nice enough to work with us on a CTM backend. CUDA is a more difficult matter. If CUDA ran on multiple vendors as well as multi-core, it would be more attractive for development. Brook is not perfect either, but we have had success in the past and at least the basic stuff runs on all vendors, including painfully slowly on Intel IGP chips. (No, we will not release a client for Intel IGPs!)
I should also add that it's only recently that we have had simple enough tests to hand off to Nvidia and say WTF? We tried this in the past, but it was with the full client, which is massive, and we weren't able to get very far. With ATI, we got things working there, so it was a matter of making sure things didn't break and getting ATI to tune their compiler for performance a little more. There is nothing that is vendor specific, but it's hard to start with a large code base and trace through multiple levels of bugs.
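For readers wondering what "different results from each other, and different from reference" might look like in a stripped-down test, here is a minimal sketch of that kind of comparison. It is not code from the FAH client or its test harness: the function names, the backend labels, and the 1e-5 tolerance are all made up for illustration, and it assumes each code path's output has already been read back as a list of floats.

```python
def max_rel_error(result, reference, floor=1e-12):
    """Largest element-wise relative deviation of result from reference."""
    if len(result) != len(reference):
        raise ValueError("length mismatch")
    return max(
        (abs(r - ref) / max(abs(ref), floor) for r, ref in zip(result, reference)),
        default=0.0,
    )


def compare_backends(outputs, reference, tol=1e-5):
    """Map each (hypothetical) backend name to (max relative error, within tol?)."""
    report = {}
    for name, values in outputs.items():
        err = max_rel_error(values, reference)
        report[name] = (err, err <= tol)
    return report


# Made-up numbers, purely to show the shape of the report.
reference = [1.0, 2.0]
outputs = {"vendor_a": [1.0, 2.0], "vendor_b": [1.0, 2.1]}
report = compare_backends(outputs, reference)
```

The point of a small harness like this is that a failing case can be handed to a driver team as a handful of numbers rather than a full client.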
Thanks for the report, Mike. Regardless of what might be said of the past one way or the other, are you reasonably satisfied with the cooperation you're getting currently?
mhouston
30-Sep-2007, 01:55
One vendor has been easier to deal with than the other currently. Once we can get through making sure it's not fxc or Brook, we will reengage with Nvidia. To be fair, our success with working with ATI/AMD is likely because I knew people there well and was able to bypass a lot of red tape and also get people with influence behind the project. AMD has spent a lot of engineer time on helping with FAH and Brook. At Nvidia, it's hard to wrestle people interested in GPGPU away from wanting everything in CUDA. There is obvious interest at Nvidia in GPGPU since they have done CUDA, teach courses about CUDA, and have some molecular dynamics work with the NAMD guys. It's just getting support for DX9 currently or any support for Brook that is the problem.
Tim Murray
30-Sep-2007, 08:27
Any chance of a Brook to PTX compiler? The PTX spec has gotten a submarine release, it seems, because it's up on the ECE 498 page at UIUC (http://courses.ece.uiuc.edu/ece498/al1/mps/PTX_ISA_1.0.pdf).
Silent_Buddha
30-Sep-2007, 16:18
Any chance of a Brook to PTX compiler? The PTX spec has gotten a submarine release, it seems, because it's up on the ECE 498 page at UIUC (http://courses.ece.uiuc.edu/ece498/al1/mps/PTX_ISA_1.0.pdf).
I think that's more up to Nvidia than the FAH guys. From my understanding the reason that you can work with CTM through Brook is that ATI put in a not insignificant amount of work into helping to get it to work.
I'm not sure if Nvidia has the same dedication to Brook at this time.
Of course, I could be way off base, as I'm just a casual observer here.
It'd be great if both vendors' chips worked with FAH. It'd certainly be interesting to see how the architectures compare with a more scientific (real-world) workload.
Regards,
SB
mhouston
30-Sep-2007, 18:17
PTX is just their pseudo-assembly. We would still need to be able to compile from Brook to PTX, and then we would still have to go through the driver compiler as well, although from PTX that seems to be stable. On top of that, we would also need access to the runtime and scheduling layers that handle data movement on and off the board, the interpolants, and schedulers for running blocks and warps. The UIUC folks are doing some custom compiler work, but I don't believe they have any plans to look at Brook. Nvidia actually has an easier-to-use scatter engine than ATI has currently, so we could also look at exploring scatter in Brook. It's an easy grammar change, but it gets a little tricky to support in general since DirectX and OpenGL don't support it, so it would have to go through the vendors' GPGPU layers.
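For anyone unfamiliar with the term, the gather/scatter distinction mentioned above can be sketched in a few lines. This is plain Python for illustration only, not Brook, CUDA, or CTM code, and the function names are hypothetical. A gather reads from a computed address, which a DX9 pixel shader can do with a texture fetch; a scatter writes to a computed address, which a pixel shader cannot do because each fragment writes only to its own output position.

```python
def gather(src, idx):
    # Each output element i reads from a computed address: out[i] = src[idx[i]].
    return [src[j] for j in idx]


def scatter(src, idx, size, fill=0.0):
    # Each input element i writes to a computed address: out[idx[i]] = src[i].
    out = [fill] * size
    for value, j in zip(src, idx):
        out[j] = value
    return out


gathered = gather([10, 20, 30], [2, 0, 1])
scattered = scatter([10, 20, 30], [2, 0, 1], 3)
```

In this serial sketch, if two inputs target the same index the later write wins; on parallel hardware the outcome of colliding writes depends on the platform, which is part of why scatter is harder to support in general.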