The one and only Folding @ Home thread

ShaidarHaran

You need some pretty contrived examples for GT200b not to get crushed by Cypress in GPGPU workloads, and if you scaled Cypress to 55nm it would be roughly the same size.

Untrue. See Folding@Home, where GT200b stomps all over anything ATi has, including Hemlock. Math rate isn't the only factor in solving any problem, GPGPU or not.
 
A choice on ATi's part, and an ironic one given the fact that F@H for GPUs began on ATi hardware.

But the fact still stands, therefore it's not a good comparison. And why is this a choice on ATI's part? Why don't the F@H people (are they still at Stanford?) optimize the ATI client?
 
Untrue. See Folding@Home, where GT200b stomps all over anything ATi has, including Hemlock. Math rate isn't the only factor in solving any problem, GPGPU or not.

I think that ATI have the hardware in Cypress to compete. It's just the software that is lacking.
 
But the fact still stands, therefore it's not a good comparison. And why is this a choice on ATI's part? Why don't the F@H people (are they still at Stanford?) optimize the ATI client?

It is a good comparison, because it's a real-world example. ATi chooses not to work with Pande Group to help optimize the F@H client for their hardware. Nvidia does the opposite, as is often the case.
 
I think that ATI have the hardware in Cypress to compete. It's just the software that is lacking.

Perhaps so, but previous ATi hardware is deficient as the F@H client often has to redo calculations just to ensure correctness, something that doesn't occur when running on NV hardware.
 
I think that ATI have the hardware in Cypress to compete. It's just the software that is lacking.
The core for Radeon has not been updated from the Brook+ codebase, while the NVIDIA code has been updated once or twice with CUDA. There is no reason why OpenCL could not be employed to bring performance up significantly on AMD hardware.

We're discussing in the GPGPU forum some of the performance differences that can be gained by going from Brook+ to OpenCL, as the graph here highlights.

It is a good comparison, because it's a real-world example. ATi chooses not to work with Pande Group to help optimize the F@H client for their hardware. Nvidia does the opposite, as is often the case.

The core needs to move to OpenCL; we can't completely re-write their code. They have indicated that they will move in that direction, but haven't done so yet.
 
Perhaps so, but previous ATi hardware is deficient as the F@H client often has to redo calculations just to ensure correctness, something that doesn't occur when running on NV hardware.
Incorrect. That is something they decided to do due to the limitations of the early Brook+ codebase. Remember, the code for the GPU2 client was designed for R600-class hardware and has not moved since then.
 
The core for Radeon has not been updated from the Brook+ codebase, while the NVIDIA code has been updated once or twice with CUDA. There is no reason why OpenCL could not be employed to bring performance up significantly on AMD hardware.

We're discussing in the GPGPU forum some of the performance differences that can be gained by going from Brook+ to OpenCL, as the graph here highlights.

The core needs to move to OpenCL

Pande Group is working on an OpenCL client, and once again here, the NV client is ahead of the ATi client because NV is working with Pande Group to make sure the code runs, and optimally at that. Latest rumors in the F@H community put the NV OpenCL client available a good 6 months before the ATi client.

we can't completely re-write their code. They have indicated that they will move in that direction, but haven't done so yet.

Did NV need to "completely re-write" F@H to get such good performance? Whatever they did, they did it right, and willingly.
 
Incorrect. That is something they decided to do due to the limitations of the early Brook+ codebase. Remember, the code for the GPU2 client was designed for R600-class hardware and has not moved since then.

I'm having difficulty finding the link, but I read @ folding forum a quote (from 7im, I believe) that ATi hardware had to redo some calculations to ensure correctness and this was a big reason why it was slower than NV hardware.

Where's Mike Houston when you need him?
 
I'm having difficulty finding the link, but I read @ folding forum a quote (from 7im, I believe) that ATi hardware had to redo some calculations to ensure correctness and this was a big reason why it was slower than NV hardware.
And that's not a "hardware" issue; from a hardware perspective NVIDIA's stuff is no different in this respect.

To Mintmaster's point, though, this is a very contrived case due to the differences in the software base being used.
 
The core needs to move to OpenCL; we can't completely re-write their code. They have indicated that they will move in that direction, but haven't done so yet.
Doing some F@H myself, that's not exactly what I'd have liked to hear as one of your customers.
 
And that's not a "hardware" issue; from a hardware perspective NVIDIA's stuff is no different in this respect.

Just going by what I read on folding forum, which seemed to indicate a hardware deficiency.

To Mintmaster's point, though, this is a very contrived case due to the differences in the software base being used.

Well as contrived as it may be, it's one of the biggest factors for me when purchasing a GPU, or GPUs as is the case since I own three high-end PCs with dedicated graphics cards. Further to the point, it is why, when I purchased a high-end graphics card in February '09, I went with a GTX 285 rather than anything ATi had on the market, and again in January and February of this year why I purchased two GTX 275s to upgrade my other two machines. As long as ATi continues to have inferior developer relations when compared to Nvidia, and as long as F@H is not important to them, I will continue to purchase NV hardware.
 
Doing some F@H myself, that's not exactly what I'd have liked to hear as one of your customers.
Sorry? It's their code, not ours! We don't re-write games for developers, do we!? We have dev rel, we assist and can help optimise, and there is no difference between game and stream apps there.
 
Well as contrived as it may be, it's one of the biggest factors for me when purchasing a GPU

Well it's a good thing for wavey that this market consists of less than 10 people. :p

Untrue. See Folding@Home, where GT200b stomps all over anything ATi has, including Hemlock

But this is because F@H places artificial limitations on AMD's client, not because of hardware.

Doing some F@H myself, that's not exactly what I'd have liked to hear as one of your customers.

Why should AMD have to hold developers' hands on every little thing? It's up to the developers to write the bulk of the code. AMD is there to give advice and provide minor help on optimization.
 
I'm having difficulty finding the link, but I read @ folding forum a quote (from 7im, I believe) that ATi hardware had to redo some calculations to ensure correctness and this was a big reason why it was slower than NV hardware.
First, the CUDA and the Brook clients use different code bases and different algorithms, and they calculate different classes of proteins, so it's hard to make good comparisons on that basis.

Furthermore, the ATI version of Folding was developed with an ancient Brook release without support for local memory. AFAIK that is the reason it actually does twice the number of calculations, as it is cheaper to redo them than to store the results somewhere in memory and load them again later. Newer Brook releases support the local memory of RV770 GPUs, but Stanford never bothered to update their code. The scaling from RV670 -> RV770 -> RV870 is extremely bad (virtually non-existent), not exactly a sign of optimal and forward-looking coding.

I guess this should not become a discussion of how AMD's devrel department works or should be working, so I won't say anything to those points.
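
As a rough illustration of the recompute-versus-store trade-off described above, here is a minimal Python sketch. It is not the F@H code and does not run on a GPU; the toy 1-D force law and the names `pair_force`, `forces_recompute` and `forces_shared` are all hypothetical, made up for this example. It only shows why a kernel that cannot cheaply share intermediate results may end up evaluating each pair interaction twice.

```python
import itertools

def pair_force(xi, xj):
    """Toy 1-D pairwise interaction (hypothetical): force on i due to j.
    It is antisymmetric, so the force on j due to i is just the negative."""
    d = xi - xj
    return d / (abs(d) ** 3 + 1e-9)

def forces_recompute(xs):
    """Every particle evaluates all of its partners itself.
    No shared intermediate storage is needed, but each pair is
    evaluated twice: N*(N-1) evaluations in total."""
    evals = 0
    out = []
    for i, xi in enumerate(xs):
        total = 0.0
        for j, xj in enumerate(xs):
            if i != j:
                total += pair_force(xi, xj)
                evals += 1
        out.append(total)
    return out, evals

def forces_shared(xs):
    """Each pair is evaluated once and the result is applied to both
    particles. This needs somewhere to store and share the result,
    but takes only N*(N-1)/2 evaluations."""
    evals = 0
    out = [0.0] * len(xs)
    for i, j in itertools.combinations(range(len(xs)), 2):
        f = pair_force(xs[i], xs[j])
        evals += 1
        out[i] += f
        out[j] -= f
    return out, evals
```

Calling both functions on the same list of positions gives the same forces up to floating-point rounding, with the first doing twice as many pair evaluations as the second. Whether the extra arithmetic or the extra memory traffic costs more depends on the hardware and the programming model, which is the point being argued here.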
 
Why should AMD have to hold developers' hands on every little thing? It's up to the developers to write the bulk of the code. AMD is there to give advice and provide minor help on optimization.

True, but everything AMD produces is just a useless hunk without software. So maybe they should hold some more hands :)
 
I knew someone was going to bring this up.

Well it is the most popular GPGPU application...

That's because the workloads and algorithms aren't the same. I don't even think they fold the same types of proteins.

There are client-specific work units but these don't differentiate between NV and ATi hardware, the same work unit runs on both. The algorithm may be different, but the workload is not, neither is the output.

I'm pretty sure that they don't just share the code with ATI/NVidia and let them write a client.

Both got a GPU2 client written for them once. ATI got it written during the R600 era, which is architecturally very different from Cypress, and NVidia got it written during the G80 era, which is very similar to GT200. There were some tweaks to support new hardware since then, but that's it.

GT200b is nearly twice as fast as G80, and GF100 is about 50% faster than GT200b, despite not having any optimizations in the client. ATi hardware only gains performance in F@H through clockspeed increases. Cypress is no faster than RV770 @ the same clockspeed, despite having twice the ALUs and better register/cache architecture for GPGPU.

Whatever the reason, ATi hardware does not scale in F@H, and NV hardware does.
 
Well it's a good thing for wavey that this market consists of less than 10 people. :p

:LOL: if it's only ten people then you've got 20% of them right here between Carsten and myself ;)

But this is because F@H places artificial limitations on AMD's client, not because of hardware.

I don't believe this is a fair characterization. I don't think Pande Group does less work on ATi GPUs because they want to.

Why should AMD have to hold developers' hands on every little thing? It's up to the developers to write the bulk of the code. AMD is there to give advice and provide minor help on optimization.

They shouldn't have to, but there's a reason most popular PC games have TWIMTBP logos in them and tend to run better on NV hardware.
 