AMD: Pirate Islands (R* 3** series) Speculation/Rumor Thread

If the temperature target is what it turns out to be, it's a little interesting to see AMD flip its temperature target to pre-290 levels.
There are signs with Fury X that power consumption ramps a fair amount even at the modest temperatures of the water-cooled setup, so perhaps Fiji is too big for the temperature/performance inflection point to sit where Hawaii had it. I'd imagine the DRAM would appreciate running cooler than 95 C, but papers discussing stacked DRAM give a ceiling of 85 C before a higher refresh rate is needed. Various DRAM datasheets also cap their ability to distinctly measure temperature at 85 C.
 
That's not much difference in target temperature between the Fury and Fury X: 75 C versus 65 C. It'll be interesting to see the power draw on the regular Fury. Same clocks but 10% fewer CUs.

Really tempted by the Fury X, just due to noise level at load (not a single high end/performance/enthusiast level card has acceptable acoustics to me nowadays). But that price. It's hard for me to justify a purchase of anything over 500 USD.

Regards,
SB
 
All I can think of is how hard the Fury cards' prices are going to plummet as soon as cards with 6-8GB of HBM2 are out there.
 
That's not much difference in target temperature between the Fury and Fury X: 75 C versus 65 C. It'll be interesting to see the power draw on the regular Fury. Same clocks but 10% fewer CUs.

Really tempted by the Fury X, just due to noise level at load (not a single high end/performance/enthusiast level card has acceptable acoustics to me nowadays). But that price. It's hard for me to justify a purchase of anything over 500 USD.

Regards,
SB

I wouldn't be shocked if the non-X Fury ended up drawing as much (or maybe slightly more) power due to higher leakage, because of the higher operating temperature.
 
I wouldn't be shocked if the non-X Fury ended up drawing as much (or maybe slightly more) power due to higher leakage, because of the higher operating temperature.

Sure, but it's also using 10% fewer CUs. I'm thinking the power draw is going to be fairly similar, with Fury potentially consuming more up to a point, after which the Fury X will have the higher power draw as it won't need to throttle to maintain its 65 C temperature target. For example, using Furmark as a pathological case, I don't think it will be possible for the Fury to draw nearly as much power as the Fury X, as it will have to throttle to hold 75 C in that benchmark. That will also extend to use cases where the ALUs are utilized extensively, but likely won't manifest in most games.

It's also possible that with the air cooler not being as efficient at removing heat, it'll hit a power/temperature wall sooner, meaning less power draw and lower clocks at 75 C compared to the Fury X at 65 C. This could mean it consumes less power in most scenarios, with a much larger performance deficit than you'd expect from a card with the same clocks and 10% fewer CUs, as it won't reach max clocks as often.

Hence why I'm curious to see how it'll actually compare.

Regards,
SB
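SB's throttling argument can be sketched as a toy model. Every constant below is invented purely for illustration (the per-CU dynamic power, the leakage coefficients, the clocks); the only point it demonstrates is that a hotter-running part pays a leakage penalty that partially offsets having fewer CUs:

```python
# Toy model of the Fury vs Fury X power argument. All constants are
# invented for illustration; none are real Fiji numbers.

def total_power(cus, clock_ghz, temp_c,
                dyn_per_cu=2.0, leak_base=30.0, leak_per_deg=0.6):
    """Dynamic power scales with CU count and clock; leakage grows
    linearly with die temperature above 25 C."""
    dynamic = cus * clock_ghz * dyn_per_cu
    leakage = leak_base + leak_per_deg * (temp_c - 25)
    return dynamic + leakage

# Hypothetical configurations: water-cooled full part at a 65 C
# target, air-cooled cut-down part (~10% fewer CUs) at 75 C.
fury_x = total_power(cus=64, clock_ghz=1.05, temp_c=65)
fury = total_power(cus=58, clock_ghz=1.05, temp_c=75)
print(f"toy Fury X: {fury_x:.0f} W, toy Fury: {fury:.0f} W")
```

With these made-up numbers the cut-down hot part lands within a few percent of the full cool part, which is the "fairly similar power draw" scenario described above.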
 
All I can think of is how hard the Fury cards' prices are going to plummet as soon as cards with 6-8GB of HBM2 are out there.

True, but we're a good while away from that yet. As far as I'm aware the first HBM2 cards won't show up until next year, which means Fury has well over half a year before any better memory solution appears.

I do wonder what the first HBM2 card will come with, though. It's a much more versatile standard than HBM, so I expect we'll see all kinds of setups. For example, a single stack should be more than sufficient for mid-range Pascal parts with 256GB/s at 4GB capacity (more bandwidth than the GTX 980), while a dual stack can probably cover the high end with 512GB/s at 8GB (a greater than 50% increase over Titan X). Or maybe they'll prefer 3 stacks at 768GB/s while keeping RAM capacity at a more practical 6GB (I guess it depends on how the cost of the additional stack compares with the higher memory capacity). I certainly have my doubts over whether we'll see 8GB 4-stack parts at 1TB/s in the first wave of GPUs, although I wouldn't put it past AMD to go down that route with Arctic Islands - they obviously need the bandwidth more, at least with GCN1.2.
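The stack arithmetic in that post is easy to check mechanically. The sketch below assumes the widely quoted first-generation figure of 256GB/s per HBM2 stack (an assumption, not a confirmed spec); capacity per stack depends on how high the DRAM dies are stacked, so it's left as a parameter:

```python
# Bandwidth is assumed fixed at 256 GB/s per HBM2 stack (early
# public claim); capacity per stack varies with stack height.
def hbm2_config(stacks, gb_per_stack=4):
    """Return (total bandwidth in GB/s, total capacity in GB)."""
    return stacks * 256, stacks * gb_per_stack

print(hbm2_config(1))     # mid-range single stack: (256, 4)
print(hbm2_config(2))     # high-end dual stack: (512, 8)
print(hbm2_config(3, 2))  # 3 shorter stacks: (768, 6)
print(hbm2_config(4, 2))  # 4 stacks: (1024, 8)
```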
 
Theoretical fillrate is precisely 0% faster than Hawaii per clock. Other things in Fiji are also limited in their advantage over Hawaii.
Haven't we been told time and again how all-important compute is? (Or was it just because "more compute" was easiest to add?) Additionally, Fiji is supposedly a 4K machine (ask AMD about it!), so you might think they also scaled the relevant aspects for that job, which they obviously didn't. Hence my assertion that we're missing performance.

If you want to evaluate scaling, compare with the HD7970, where almost every parameter of Fiji is twice that of Tahiti per clock. Or against Tonga/Antigua. That's why I mentioned my post in the review thread where 2 games do scale as expected, at least at Hardware.fr. Most games don't.
2 games out of all the internet. Well, I can add (and in fact did in the post you quoted), that it scales as expected in most directed tests (geometry, pixel fill, compute) as well. So, overall we're still missing the major part of 45% (minus 2 games out of... 1000?) performance.

Why? No good answer. Driver? CPU overhead? Geometry bottlenecks? API overhead? etc. In my opinion, once we have a per-game answer to that question we can get somewhere.
I am aware of all that, thanks. Sadly, my day only has 24 hours, I have 2 hours commute, a real life and still need sleep despite my best efforts to cancel that with caffeine.

On that page the same person has posted results for 2x HD7970

http://luxmark.info/node/417

and 3x HD7970

http://luxmark.info/node/639

Those results don't indicate linear scaling with GPU count.

Have you tried underclocking Fury X to observe variations in performance on this test? Prolly easier to measure differences with underclocking than overclocking, since there's more range to play with!
On the scaling: https://forum.beyond3d.com/threads/luxmark-v3-0beta2.56400/page-3#post-1857415
It should be obvious that multi-gpu scaling can reach intra-gpu scaling in very few circumstances.
No, I haven't tried underclocking - but you're right, it should be much easier to find out limits. It's what I do with integrated graphics most of the time.
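Scaling efficiency from submissions like those luxmark.info links can be computed as score over (single-GPU score × GPU count). The scores below are placeholders for illustration only, not the actual submitted numbers:

```python
def scaling_efficiency(single_score, multi_score, n_gpus):
    """Fraction of ideal linear scaling a multi-GPU score achieves."""
    return multi_score / (single_score * n_gpus)

# Placeholder LuxMark-style scores, purely illustrative.
single = 1000
print(f"2x GPUs: {scaling_efficiency(single, 1800, 2):.0%} of linear")
print(f"3x GPUs: {scaling_efficiency(single, 2400, 3):.0%} of linear")
```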

--

How do you know that the first HBM2 will not be with AMD Arctic Islands ?

We don't.
 
There's the obvious power and size advantages, to say nothing of the 14% higher bandwidth. I'm unsure, though, how a single HBM2 stack compares to a 256-bit GDDR5 interface in terms of implementation cost. Presumably 1 HBM2 stack is vastly easier/cheaper to implement than the 4 stacks on Fury.
 
There's the obvious power and size advantages, to say nothing of the 14% higher bandwidth. I'm unsure, though, how a single HBM2 stack compares to a 256-bit GDDR5 interface in terms of implementation cost. Presumably 1 HBM2 stack is vastly easier/cheaper to implement than the 4 stacks on Fury.

I don't know what the cost-benefit trade-off looks like exactly, and for desktop parts it may not be worth it, but I suspect that for mobile ones it is. Since IHVs use the same parts for both markets, and the mobile market commands higher margins (perhaps higher volume for low/mid-range GPUs as well?) HBM2 might make a lot of sense.

With just one memory stack, a graphics card would have a very small footprint, which OEMs are bound to like.
 
How do you know that the first HBM2 will not be with AMD Arctic Islands ?
Well, doh, JHH told the internetz that AMD is their biach, testing HBM for NVidia and Pascal is going to be awesome and have HBM2, look at this picture of Pascal. Those aren't HBM2 modules but you kidz are stoopid, you'll think they are.
 
I didn't say they wouldn't. Although at this stage there has been more information released about Pascal than Arctic Islands so I'm inclined to believe Nvidia will move first on that generation.

True, although AMD has generally been good at hiding future GPU details until near release (Fiji being somewhat of an exception because of its relatively drawn-out process from tape-out to release).
 
Haven't we been told time and again how all-important compute is? (Or was it just because "more compute" was easiest to add?) Additionally, Fiji is supposedly a 4K machine (ask AMD about it!), so you might think they also scaled the relevant aspects for that job, which they obviously didn't. Hence my assertion that we're missing performance.
I have no idea why AMD keeps pushing compute, when it's not been the route to success since R600.

There's two possibilities here: AMD just adds compute, because that's what AMD does. Or, AMD's compute is seriously broken in games (not tests), with utilisation that gets worse the more CUs are added to a shader engine and/or the more CUs there are. So AMD piles on extra CUs to get small increments in game performance.

2 games out of all the internet. Well, I can add (and in fact did in the post you quoted), that it scales as expected in most directed tests (geometry, pixel fill, compute) as well. So, overall we're still missing the major part of 45% (minus 2 games out of... 1000?) performance.
Actually, I expect that 900 of those 1000 games do scale. But reviews are centred upon the AAA games which cause CPUs and GPUs grief.

Directed tests are clearly not telling us much that's useful.

Hence my suggestion to play with the graphics options in games to identify the features that kill performance scaling.

I am aware of all that, thanks. Sadly, my day only has 24 hours, I have 2 hours commute, a real life and still need sleep despite my best efforts to cancel that with caffeine.
Perhaps start with a single game. BF4 appears to have dozens of options.

Though I think Lauritzen identified a serious bottleneck in GCN with the tiled lighting scheme that BF4 uses, so BF4 might not be a good candidate because that's known as "broken", and pretty fundamental to the game.

On the scaling: https://forum.beyond3d.com/threads/luxmark-v3-0beta2.56400/page-3#post-1857415
It should be obvious that multi-gpu scaling can reach intra-gpu scaling in very few circumstances.
Someone with more than 1 of: HD4870, HD5870, HD7970 and Fury X needs to post numbers for Hotel. Then we can have a discussion. Fury X should be ~8x faster than HD4870, ~4x faster than HD5870 and ~2x faster than Tahiti if the benchmark is single GPU scalable in the way that you assert.
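Those ~8x/~4x/~2x expectations line up with peak FP32 throughput at reference clocks. The figures below are the standard published numbers; the exact ratios come out slightly under the rounded values in the post:

```python
# Peak single-precision TFLOPS at reference clocks (public figures).
peak_tflops = {
    "HD4870": 1.2,   # 800 SPs @ 750 MHz
    "HD5870": 2.72,  # 1600 SPs @ 850 MHz
    "HD7970": 3.79,  # 2048 SPs @ 925 MHz
    "Fury X": 8.6,   # 4096 SPs @ 1050 MHz
}
for gpu in ("HD4870", "HD5870", "HD7970"):
    ratio = peak_tflops["Fury X"] / peak_tflops[gpu]
    print(f"Fury X vs {gpu}: {ratio:.1f}x peak FP32")
```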
 
There's the obvious power and size advantages, to say nothing of the 14% higher bandwidth. I'm unsure, though, how a single HBM2 stack compares to a 256-bit GDDR5 interface in terms of implementation cost. Presumably 1 HBM2 stack is vastly easier/cheaper to implement than the 4 stacks on Fury.
I think cost will still trump everything, with HBM migrating very slowly from the top to the bottom.

Weren't there some GDDR5 road maps with 8Gbps? That'd solve the 14%. The power for 14/16nm will already be quite a bit lower than 28nm, so that's totally manageable as well.

And, while obviously a nice secondary feature, I still think that performance and price are more important. There are some pretty small GTX 970s on the market, so the difference isn't that large.
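Both the 14% figure and the 8Gbps point fall out of simple bus arithmetic, if we assume the comparison is one 256GB/s HBM2 stack (the commonly quoted first-gen figure) against a 256-bit GDDR5 bus at the current 7Gbps top speed:

```python
def gddr5_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """GDDR5 bandwidth in GB/s: bus width (bits) * per-pin rate / 8."""
    return bus_width_bits * data_rate_gbps / 8

hbm2_stack_gbs = 256  # assumed first-gen per-stack figure

print(gddr5_bandwidth_gbs(256, 7))  # 224.0 GB/s at today's 7 Gbps
print(f"{hbm2_stack_gbs / gddr5_bandwidth_gbs(256, 7) - 1:.0%}")  # 14%
print(gddr5_bandwidth_gbs(256, 8))  # 256.0 GB/s: 8 Gbps closes the gap
```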
 
Weren't there some GDDR5 road maps with 8Gbps? That'd solve the 14%.

Possibly, but what are the chances of higher clocked HBM2 as well?

The power for 14/16nm will already be quite a bit lower than 28nm, so that's totally manageable as well.

True, core power draw will be lower, but any power saving is always beneficial; granted, at the low and mid range that might not be a big deal for the desktop parts.

And, while obviously a nice secondary feature, I still think that performance and price are more important.

But do we know that a 256bit GDDR5 interface (not exactly low end) is cheaper than a single HBM2 stack?
 
Really tempted by the Fury X, just due to noise level at load (not a single high end/performance/enthusiast level card has acceptable acoustics to me nowadays). But that price. It's hard for me to justify a purchase of anything over 500 USD.

Well, I wonder how many you have listened to, and in what sort of airflow setup. My EVGA GTX 980 ACX 2.0 is pretty much inaudible during load, and that's with the case fans being pretty much inaudible as well.
 