#52 |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
I think I can pedant myself out of this!
|
|
|
|
|
|
#53 |
|
Junior Member
Join Date: Jan 2007
Posts: 22
|
|
|
|
|
|
|
#54 | |
|
Member
Join Date: Sep 2005
Posts: 206
|
FH
Quote:
|
|
|
|
|
|
|
#55 |
|
AndyTX
Join Date: May 2004
Location: British Columbia, Canada
Posts: 1,840
|
I spoke to some AMD people at SC07 (come by the RapidMind booth if you're there!):
- the card is indeed R670-based
- graphics parts to come soon
- the graphics parts will also support double precision
- the FireGL offering will basically be a superset of this card (i.e. R670, 2GB RAM but with display outputs naturally)
- double precision is 50% speed, but no fused MAD (seems in line with the ~40% figure quoted earlier in this thread)
So maybe some hat eating/video recording to come yet? Anyways I can't stand 100% behind these facts since the AMD people may even have been in error, but they are probably fairly reliable, and nothing too crazy. |
|
|
|
|
|
#56 | |
|
...
Join Date: Feb 2002
Location: Cleveland
Posts: 4,220
|
Quote:
__________________
IBSL: 2835, 6541, 8531, 9299, 20484, 86985, 87130 FBSL: 7221, 9255, 15892, 20484 |
|
|
|
|
|
|
#57 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,947
|
The "5th" special function scalar is actually not involved in DP calculations.
|
|
|
|
|
|
#58 | |
|
AndyTX
Join Date: May 2004
Location: British Columbia, Canada
Posts: 1,840
|
I don't know much about the hardware, but I think 50% DP seems to imply a fairly full utilization of the silicon in both single and double cases, no?
Quote:
Now I'm wondering whether transcendentals are also 50% and if so, are they accurate to more bits in DP, or the same as SP? |
|
|
|
|
|
|
#59 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
That puts RV670's DP throughput at less than 25% of the peak FLOPS figure given for the chip in the AMD PDF.
My math earlier was based on peak marketing numbers, which I took to be the FMADD peak. Actually, the 5th special function ALU being left out leaves my earlier math slightly optimistic when it comes to price/performance and performance/watt in the redundancy case.
__________________
Dreaming of a .065 micron etch-a-sketch. |
|
|
|
|
|
#60 |
|
Regular
|
See this document
http://www.cs.berkeley.edu/~samw/pro...ore/sc2007.pdf for an example of how "useless" x86 (particularly Intel) is, and why "peak" is such a meaningless concept. The applications covered, such as Proteins, FEM and circuit simulation look like reasonable evaluation candidates for GPGPU workloads...
Jawed |
|
|
|
|
|
#61 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
I've held off on discussing the difference between peak and sustained performance because it would have bloated my initial post with a discussion that is too complex to resolve for a general case.
Because at the time there was no confirmation of RV670's DP output, I only had AMD's peak SP numbers with which to compare to Intel's DP flops per cycle. I did not think it fair to compare x86 sustained real-world flops to a GPU's unverified peak DP based off its overly optimistic SP FMADD peak. It's not like we can't think up workloads where GPUs fall on their faces and cases where CPUs get near peak (or cases where the R600 has faltered in comparison to another GPU). It's not something I wanted to derail the thread with.
My initial point is that a GPU is a clear price/performance winner at SP, even if it is running the same work on two cards. I then went on to point out that at the (slightly optimistic) estimate of 25% DP throughput, that same tandem arrangement is not so clear a win. Even if the individual card is an order of magnitude better at SP performance, cutting overall performance of the GPU by quartering its performance and then halving it again by doing tandem work removes that order of magnitude, particularly at the price point given.
Now, if it were a system of gamer RV670s, the price/performance is blasted out of the park, given it would cost an order of magnitude less.
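The quartering-then-halving arithmetic above can be put in a short sketch. The function name is made up for illustration, and the 10x starting advantage is only a stand-in for "an order of magnitude", not a measured figure.

```python
def remaining_advantage(sp_advantage, dp_fraction, redundant_copies):
    """Advantage left over a CPU baseline after DP and redundancy penalties.

    sp_advantage:     hypothetical SP speedup over the baseline (e.g. 10.0)
    dp_fraction:      DP throughput as a fraction of SP peak (e.g. 0.25)
    redundant_copies: how many cards run the same work unit (e.g. 2)
    """
    if redundant_copies < 1:
        raise ValueError("redundant_copies must be at least 1")
    return sp_advantage * dp_fraction / redundant_copies


if __name__ == "__main__":
    # 10x is an assumed stand-in for "an order of magnitude".
    print(remaining_advantage(10.0, 0.25, 2))  # 1.25
```

Under those assumptions the two penalties together divide the advantage by eight, which is most of the order of magnitude.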
__________________
Dreaming of a .065 micron etch-a-sketch. |
|
|
|
|
|
#62 |
|
Mostly Harmless
|
Assuming the bit about DP being exposed on the consumer version as well turns out to be true, I wonder what F@H will do with an X2? Treat it like two separate cards that you have to run separate instances on?
/me starts to get moderately interested in an X2
__________________
"We'll thrash them --absolutely thrash them."--Richard Huddy on Larrabee "Our multi-decade old 3D graphics rendering architecture that's based on a rasterization approach is no longer scalable and suitable for the demands of the future." --Pat Gelsinger, Intel ". . .its taking us longer than we would have liked to get a [Crossfire game] profiling system out there" --Terry Makedon, ATI, July 2006 "Christ, this is Beyond3D; just get rid of any f**ker talking about patterned chihuahuas! Can the dog write GLSL? No. Then it can f**k off." --Da Boss |
|
|
|
|
|
#63 |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
At the risk of eating more hats, I'd be shocked if DP is available on consumer boards. Otherwise, why would you buy a FireStream? There may well be 1GB HD 3870s, and you'd be able to buy, what... at least 7 or 8 of those for the price of one FireStream?
|
|
|
|
|
|
#64 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,947
|
It's between 20% and 40%, depending on the mix of operations.
|
|
|
|
|
|
#65 |
|
Regular
|
Dunno how you come to that conclusion ... hell I don't know how you let someone get away with saying "50% speed" to you without saying "Huh?". It's ambiguous to the point of not really meaning anything.
|
|
|
|
|
|
#66 | |
|
Member
Join Date: May 2004
Location: Somewhere, IN USA
Posts: 313
|
Quote:
Plus, with folding and other GPGPU apps, which may want to use DP, it doesn't do a whole lot of good if it's not available on any of the consumer cards. Besides, HDR with FP64 is the next best thing!
|
|
|
|
|
|
|
#67 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
With those numbers, the lower bound would make doubling up on FireStream cards for redundant execution a weak proposition, while the upper bound would make the scheme incrementally price/performance-competitive at the suggested price.
I'm liking the idea of buying 8 (4 if I'm not buying another node to host the cards) HD3870s, if they are DP-capable as well. At consumer card prices, a redundant execution scheme would be a worthwhile investment, unless FireStream comes with some very nifty order-of-magnitude exclusives. Or I could just run 8 of them full-tilt if I couldn't care less about data integrity.
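As a rough sketch of the two options for 8 cards, the snippet below compares running every card on its own work against pairing cards on the same work unit, using the 20% and 40% bounds quoted earlier. The function name and the choice to express throughput in units of one card's SP peak are assumptions made for illustration. No prices or absolute FLOPS figures are implied.

```python
def unique_dp_work(cards, dp_fraction, copies_per_work_unit=1):
    """Aggregate DP throughput of unique work, in units of one card's SP peak.

    cards:                number of cards in the system
    dp_fraction:          DP throughput as a fraction of SP peak (0.20 to 0.40 here)
    copies_per_work_unit: cards running each work unit (2 means redundant pairs)
    """
    if copies_per_work_unit < 1 or cards % copies_per_work_unit != 0:
        raise ValueError("cards must divide evenly into redundant groups")
    return (cards // copies_per_work_unit) * dp_fraction


if __name__ == "__main__":
    for label, fraction in (("lower bound", 0.20), ("upper bound", 0.40)):
        full_tilt = unique_dp_work(8, fraction)
        redundant = unique_dp_work(8, fraction, copies_per_work_unit=2)
        print(f"{label}: full-tilt {full_tilt:.2f}, redundant pairs {redundant:.2f}")
```

Pairing always halves the unique work, so whether redundancy is worth it comes down to the card price, which this sketch leaves out.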
__________________
Dreaming of a .065 micron etch-a-sketch. |
|
|
|
|
|
#68 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,947
|
You have to take efficiency for the particular task you are looking at into account as well. We already have 3rd party measurements for DP FFT calculations that are higher throughput than that of the theoretical rates of the CPUs you are talking about.
|
|
|
|
|
|
#69 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
I'm not disputing that GPUs can be more efficient.
In the case where we don't buy 2 cards to run the exact same work unit, I'd say it's an across-the-board win for anything GPUs don't fall on their face running. It's only in DP with redundant cards that there is more argument. I agree that FireStream can be a good buy, just not so amazingly good at that price point that buying multiples to run redundant work units isn't a waste. Other factors such as rack space and the cost of surrounding hardware also go up when a system doubles the number of cards to run the same amount of work, though I've tried to simplify matters to avoid bogging down on the large number of possible configurations.
__________________
Dreaming of a .065 micron etch-a-sketch. |
|
|
|
|
|
#70 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,947
|
Excuse me, why are we buying redundant cards?
|
|
|
|
|
|
#71 |
|
Nutella Nutellae
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
|
cosmic rays ftw.
__________________
[twitter] More samples, we need more samples! [Dean Calver] The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way |
|
|
|
|
|
#72 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
That was brought up earlier in the thread when dealing with very large clusters. Error rates as they are for DRAM on a system with terabytes of RAM would be on the order of one error about every five minutes.
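To show how the window between errors scales with memory size, here is a minimal sketch that back-solves a per-byte error rate from the "one error about every five minutes" figure and applies it to a smaller memory. The 4 TB cluster size is a hypothetical stand-in for "terabytes", the function names are made up, and the model assumes errors arrive independently at a constant rate per byte. It is not a measured DRAM error rate.

```python
def implied_rate(capacity_bytes, interval_seconds):
    """Errors per byte per hour implied by one error every interval_seconds."""
    return 3600.0 / (interval_seconds * capacity_bytes)


def seconds_between_errors(capacity_bytes, errors_per_byte_hour):
    """Mean time between errors for a memory of the given size."""
    expected_errors_per_hour = capacity_bytes * errors_per_byte_hour
    return 3600.0 / expected_errors_per_hour


if __name__ == "__main__":
    cluster_bytes = 4e12   # hypothetical: "terabytes" taken as 4 TB
    card_bytes = 2e9       # the 2GB card mentioned earlier in the thread
    rate = implied_rate(cluster_bytes, interval_seconds=300)
    hours = seconds_between_errors(card_bytes, rate) / 3600.0
    print(f"single 2GB card: roughly one error every {hours:.0f} hours")
```

Under this model the interval grows in direct proportion as the memory shrinks, so a single card sees errors far less often than a large cluster does.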
__________________
Dreaming of a .065 micron etch-a-sketch.
Last edited by 3dilettante; 14-Nov-2007 at 20:44. Reason: added quote for clarity |
|
|
|
|
|
#73 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,947
|
I would, then, think about the applications for a single add-in card product such as has been announced here. Solutions for large clusters are going to be different.
|
|
|
|
|
|
#74 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
|
If such a demarcation exists or will exist, it would make sense.
Smaller systems would have a wider window between errors and would be less bogged down by the overhead of repeated calculation. I did not see a statement to the effect that the FireStream 9170 was a single add-in card product. Considering the rise of cheap clusters in HPC, I interpreted the product's being touted for HPC to indicate it would match that portion of the target market.
__________________
Dreaming of a .065 micron etch-a-sketch. |
|
|
|
|
|
#75 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,947
|
|
|
|
|