AMD: R9xx Speculation

Have you spotted this or not yet? :LOL:
Charlie says: :oops:





http://semiaccurate.com/forums/showpost.php?p=69973&postcount=182
http://semiaccurate.com/forums/showthread.php?t=3249&page=19



Now, I am really excited and happy. Charlie, thank you for making me so happy. :p

4 NI SP are as fast as 5 EG, but they are 25% smaller.

So that's 25%+ perf/mm^2 over EG. If the uncore is 25% more efficient, you get another 25%+ over EG. That's 50%.. not 100%.
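A rough sketch of that arithmetic, for anyone who wants to play with the numbers. Everything here is an assumption for illustration: the function names are made up, "25% more efficient uncore" is read as "same work in 1/1.25 of the area", and the 50/50 shader/uncore area split is a placeholder, not a known figure for any chip.

```python
def perf_per_area_gain(relative_perf: float, relative_area: float) -> float:
    """Fractional perf/mm^2 gain over the baseline (0.25 means +25%)."""
    return relative_perf / relative_area - 1.0


def whole_chip_gain(shader_fraction: float,
                    shader_area_scale: float,
                    uncore_area_scale: float) -> float:
    """Perf/mm^2 gain for the whole die at equal performance.

    shader_fraction is the share of the ORIGINAL die spent on shaders
    (hypothetical input; the thread gives no figure for it).
    """
    new_area = (shader_fraction * shader_area_scale
                + (1.0 - shader_fraction) * uncore_area_scale)
    return perf_per_area_gain(1.0, new_area)


# Shader claim from the post: same speed in 75% of the area.
shader_gain = perf_per_area_gain(1.0, 0.75)

# Whole chip, assuming a 50/50 split and an uncore that needs 1/1.25 the area.
chip_gain = whole_chip_gain(0.5, 0.75, 1.0 / 1.25)

print(f"shader block alone: {shader_gain:.1%}")
print(f"whole chip (assumed split): {chip_gain:.1%}")
```

Under these assumptions the shader block alone comes out a bit above 25%, which fits the "25%+" above, and the whole-chip figure stays well short of 100% whichever way the split is chosen.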

I think that the biggest increase in performance will be from double precision.. they will probably go for 1/2 of single precision performance. That means we could get at least 1.5 Teraflops in DP.. 3x Fermi's performance. 0_0
 
Perhaps he is in the process of preparing his article, or is writing it right now. :D
Don't get your hopes up. A 100% increase in performance on the same process will never happen; even 60% sounds pretty far-fetched. These are the kind of vaguely written statements directed at ill-informed people to make the writer sound deeply informed.

A change in the arrangement of shader clusters will impose heavy modifications on the front end just to cope with the change, let alone increase performance.

Shaders are not the only logic responsible for increasing performance. There are other factors too: texturing, fill rate, Z-sampling, interpolation. Improvements in these areas require die area. If you cut corners there, the chip will be unbalanced and you will quickly add bottlenecks, possibly neutralizing any advantage you have in the shaders.

Of course, there is the possibility of increasing efficiency, but first you have to assume that Evergreen is terribly inefficient for them to have any significant efficiency gains over the last generation. Improving efficiency isn't some easy and magical process; it is usually restricted by numerous factors, and usually results in small performance gains (5~10%).

We are all expecting a 30~50% performance improvement for NI over EG (by using 4D shaders and increasing core & TU count), but with only a 10~20% die increase? That is outright ridiculous and far-fetched. This is not wonderland, this is the real world.
 
It may be possible to achieve large gains in subsets of the workload.
Cypress was profiled to have some seriously painful conflict problems in GDS when using tessellation.

The throughput of tessellation could be improved 2-3x, and that would just put it up to "not surprisingly low".
 
I think those leaks (I mean the Heaven benchmark, Crysis, 3DMark Vantage) that were dismissed as fakes are real, even if they show signs of being faked. Probably on purpose. ;)
 
I think that the biggest increase in performance will be from double precision.. they will probably go for 1/2 of single precision performance. That means we could get at least 1.5 Teraflops in DP.. 3x Fermi's performance. 0_0
I highly doubt that. For 2 generations now AMD deemed it completely unnecessary to have DP at all for everything but the fastest chip, despite the apparently small die size this would cost. I can't see why they'd suddenly go for max DP performance.
However, going from 5D to 4D shaders would indeed imply (since the T unit didn't participate in DP calculations) that instead of 1/5 (mul/mad) or 2/5 (add) performance for DP you'd get 1/4 or 1/2, respectively.
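To make those ratios concrete, here is a small sketch. It takes the per-clock DP issue rates straight from the post above (one DP mul/mad or two DP adds per VLIW unit per clock) and assumes a hypothetical 4-wide unit keeps the same DP issue rates; the function and variable names are mine, not anything from AMD documentation.

```python
from fractions import Fraction


def dp_to_sp_ratio(sp_lanes: int, dp_ops_per_clock: int) -> Fraction:
    """DP throughput as a fraction of SP throughput for one VLIW unit."""
    return Fraction(dp_ops_per_clock, sp_lanes)


# 5-wide unit: the T lane counts toward SP throughput but not toward DP.
vliw5 = {"mul/mad": dp_to_sp_ratio(5, 1), "add": dp_to_sp_ratio(5, 2)}

# Hypothetical 4-wide unit with the same DP issue rates (assumption).
vliw4 = {"mul/mad": dp_to_sp_ratio(4, 1), "add": dp_to_sp_ratio(4, 2)}

for op in ("mul/mad", "add"):
    print(f"{op}: 5D = {vliw5[op]}, 4D = {vliw4[op]}")
```

The ratio improves only because the SP denominator shrinks; the DP work done per unit per clock is unchanged in this model.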
 
[Image: vzwv9y.jpg]


http://www.3dcenter.org/news/7-Tage

Again with the renaming crap?
 
I highly doubt that. For 2 generations now AMD deemed it completely unnecessary to have DP at all for everything but the fastest chip, despite the apparently small die size this would cost. I can't see why they'd suddenly go for max DP performance.

Personally, 4:1 sp:dp is just fine. 2:1 is tempting, but is expensive in terms of hw. Look at Fermi for a great example of wastage to reach the holy grail of 2:1 sp:dp.
 
We are all expecting a 30~50% performance improvement for NI over EG (by using 4D shaders and increasing core & TU count), but with only a 10~20% die increase? That is outright ridiculous and far-fetched. This is not wonderland, this is the real world.

Sir,

RV770 (800 SP, 256 mm², 55 nm) and its friend RV670 (320 SP, 192 mm², 55 nm, at 75% of the die size) would like a word with you.

Why not expect a Juniper-sized die with Cypress performance? If they can do it on 40nm, let them show it on 40nm.
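For reference, the comparison works out as below. The SP counts and die sizes are the ones quoted in this post; the dictionary layout and variable names are just for illustration, and raw SP count per mm² is of course only a crude proxy for performance per mm².

```python
# Figures as quoted in the post above (both chips on 55 nm).
chips = {
    "RV670": {"sp": 320, "area_mm2": 192},
    "RV770": {"sp": 800, "area_mm2": 256},
}

# Stream processors per square millimetre of die.
density = {name: c["sp"] / c["area_mm2"] for name, c in chips.items()}

sp_ratio = chips["RV770"]["sp"] / chips["RV670"]["sp"]
area_ratio = chips["RV770"]["area_mm2"] / chips["RV670"]["area_mm2"]
density_ratio = density["RV770"] / density["RV670"]

print(f"SP count: {sp_ratio:.2f}x")
print(f"Die area: {area_ratio:.2f}x")
print(f"SP per mm^2: {density_ratio:.3f}x")
```

So on the same process, the quoted numbers give 2.5x the SPs for about a third more die area.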
 
I think that the biggest increase in performance will be from double precision.. they will probably go for 1/2 of single precision performance. That means we could get at least 1.5 Teraflops in DP.. 3x Fermi's performance. 0_0

Why, as a customer of a mid end discrete graphics card, would I want die space/money/power budget to be spent on DP FLOPs?
I respect efficiency, not waste.
 
If that was bad planning, then ATI/AMD have been doing bad wafer allocation planning at TSMC going all the way back to the 90's. :p


You are still missing the point.

Tell me if I have the story wrong:

1. AMD has very good yields but not enough wafers. There's absolutely nothing they could do about the situation after they'd made the deal. But given TSMC's 40nm problems, AMD could've been a little more skeptical, and they should've been. Am I not right? I'm not talking about overbooking or anything like that. But a little headroom wouldn't hurt much, especially when all the odds are against TSMC's ambitious plan, right?
2. nVIDIA has secured many more wafers, perhaps more than they needed, and couldn't put them to good use -- that's another story. But they could still re-tape GF100 to make the yield better, couldn't they?

So, my take is AMD has supply problems while nVIDIA is the one with production problems.
 
You are still missing the point.
2. nVIDIA has secured many more wafers, perhaps more than they needed, and couldn't put them to good use -- that's another story. But they could still re-tape GF100 to make the yield better, couldn't they?

You sure? AMD is selling more chips than nv. How are they managing that with far fewer wafers?
 
Why, as a customer of a mid end discrete graphics card, would I want die space/money/power budget to be spent on DP FLOPs?
I respect efficiency, not waste.

Like for Evergreen.. no DP on Juniper, but DP on Cypress. It's useful for the FirePro and the professional market. Charlie has said many times that an NI stream processor is slightly bigger than an EG stream processor.
Now it's like this.
[Image: CypressSP.jpg]


It could be for Cayman:
4 32 bit FP MAD per clock
2 64 bit FP MAD per clock
2 32 bit FP MAD SFU per clock
..

Now, that would be like 2 fatter "SP" capable of handling 2 32bit MAD/1 64bit MAD/1 32bit SFU MAD each per clock.. aka x-t | y-t

:D
 
Like for Evergreen.. no DP on Juniper, but DP on Cypress. It's useful for the FirePro and the professional market.
Actually, when I get off these forums I am the FirePro and professional markets, and to be honest, I don't give a rat's ass about the DP FP performance of my graphics cards there either.
Here though, I'm just another overaged game player, and I want AMD to provide me the best possible gameplay value, at a power consumption that preferably can be handled silently. Carrying excess baggage simply gets in the way of that.
 