AMD: Pirate Islands (R* 3** series) Speculation/Rumor Thread

Babel-17 · Jun 24, 2015

http://www.overclock.net/t/1561804/vmod-amd-radeon-r9-fury-x-4gb-hbm-4096-bit-review if the above linked site is too hammered by traffic. Thanks for the heads-up! Googling the link brought up the hit.
Edit: Some numbers seem weird. In COD Advanced Warfare a roughly 10% overclock sends the GTX 980 around 40% faster. I guess unlocking power can make a difference but not that much. Tired, maybe I'm reading things wrong.

mhouston · Jun 24, 2015

DmitryKo said:
Yes, fixed-length encoding - probably pixel differences in an 8x8 block.

http://graphics.stanford.edu/~mhous...all/HoKo_compression_in_graphics_pipeline.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.60.8187&rep=rep1&type=pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.81.412&rep=rep1&type=pdf

Inefficient for what exactly - perceived quality, compression ratio, or decoding complexity?

Wow, there is a blast from the past...
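
For anyone who wants to picture the "fixed-length encoding of pixel differences in an 8x8 block" idea from the quote, here is a minimal Python sketch. It is not AMD's actual scheme, which isn't described in this thread. The single 8-bit channel, the min-value base, the 12-bit header, and the function names are all assumptions made up for illustration.

```python
RAW_BITS_PER_PIXEL = 8   # assumption: one 8-bit channel per pixel
HEADER_BITS = 8 + 4      # assumption: 8-bit base value + 4-bit width field


def encode_block(block):
    """Return (base, bits, deltas) for a block given as a list of rows."""
    pixels = [p for row in block for p in row]
    base = min(pixels)
    deltas = [p - base for p in pixels]
    bits = max(deltas).bit_length()  # one fixed width shared by every delta
    return base, bits, deltas


def decode_block(base, deltas, width=8):
    """Rebuild the rows of the block from the base value and the deltas."""
    pixels = [base + d for d in deltas]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def compressed_bits(bits, count=64):
    """Size of the encoded block: header plus one fixed-width delta per pixel."""
    return HEADER_BITS + bits * count


# A smooth 8x8 block: neighbouring values differ by at most 3.
smooth = [[100 + (r + c) % 4 for c in range(8)] for r in range(8)]
base, bits, deltas = encode_block(smooth)
assert decode_block(base, deltas) == smooth  # lossless round trip
print(compressed_bits(bits), "bits vs", 64 * RAW_BITS_PER_PIXEL, "raw")
```

In this toy model a smooth block shrinks a lot, while a block spanning the full 0..255 range needs 8-bit deltas and ends up slightly larger than the raw data, so a real design would presumably fall back to storing such blocks uncompressed.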

eastmen · Jun 24, 2015

Babel-17 said:
http://www.overclock.net/t/1561804/vmod-amd-radeon-r9-fury-x-4gb-hbm-4096-bit-review if the above linked site is too hammered by traffic. Thanks for the heads-up! Googling the link brought up the hit.
Edit: Some numbers seem weird. In COD Advanced Warfare a roughly 10% overclock sends the GTX 980 around 40% faster. I guess unlocking power can make a difference but not that much. Tired, maybe I'm reading things wrong.

- A 30% OC on a 980Ti gives you nearly 60% more performance.
- 980Ti OC is almost as fast as 980Ti SLI.
- Titan X OC is faster than 980Ti SLI.

The Fury gets slower when overclocked in GTA 5 ... by about 6 fps.
An overclocked Ti gains about 25 fps in The Witcher over its non-overclocked counterpart.

I'm going to go with it's full of shit.

Jawed · Jun 24, 2015

Alexko said:
Wouldn't you expect just the opposite? At higher definitions, there should be larger regions made up of relatively similar pixels. By "larger" I mean "containing more pixels".

My theory is that at higher resolutions there is too little tile re-use. Literally, pixels from more tiles are more likely to be in flight at any given time. So the compressor/decompressor becomes the bottleneck (more likely the former).

silent_guy · Jun 24, 2015

When does the embargo lift?

flopper · Jun 24, 2015

silent_guy said:
When does the embargo lift?

5 hours

revan · Jun 24, 2015

Forget about those vmod "benchies"... that site disappeared anyway...

Just to pass the time:
http://www.digitalstorm.com/unlocked/amd-fury-x-performance-benchmarks-idnum360/
http://www.digitalstorm.com/unlocked/amd-fury-x-crossfire-gaming-benchmarks-vs-sli-titan-x-idnum361/

... eighteen minutes from now

Deleted member 2197 · Jun 24, 2015

Yep, just passing time ...
http://s14.postimg.org/41t0i3z1t/900...a13d_image.jpg

CarstenS · Jun 24, 2015

I especially like how framerates are accurately depicted to the second digit behind the comma... This clearly is the deciding factor when making a purchase decision.

revan · Jun 24, 2015

CarstenS said:
I especially like how framerates are accurately depicted to the second digit behind the comma... This clearly is the deciding factor when making a purchase decision.

Lol... the double precision performance matters ...

BRiT · Jun 24, 2015

Please keep reviews in the review specific thread and keep this as speculation.

Alexko · Jun 24, 2015

The most striking piece of information provided by reviews is that performance didn't increase over Hawaii by nearly as much as theoretical figures suggested. You basically get 20-30%, depending on the display resolution, where you might have expected about 45% based on raw ALU throughput or memory bandwidth. The picture is a little blurred by the fact that some 290Xs are poorly cooled (reference cooler in quiet mode) while others use custom coolers but are also overclocked, etc., but basically, Fiji doesn't scale very well.

Now, it's true that it has exactly the same fillrate as Hawaii clock for clock, so that could potentially be a severe limitation, but I'm not sure it actually is. The front-end, after all, is also unchanged.
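
To put rough numbers on the scaling argument, here is a small Python sketch of the theoretical gains. The spec figures are the commonly published ones for the reference cards and are assumptions here, not something taken from this thread; the dictionary layout and function names are made up for illustration.

```python
# Commonly published reference specs; treat these as assumptions.
HAWAII = {"alus": 2816, "rops": 64, "clock_mhz": 1000, "bandwidth_gbs": 320}
FIJI = {"alus": 4096, "rops": 64, "clock_mhz": 1050, "bandwidth_gbs": 512}


def gain_percent(new, old):
    """Relative increase of new over old, in percent."""
    return (new / old - 1) * 100


def theoretical_gains(old, new):
    """Paper gains of one chip over another for a few headline metrics."""
    return {
        "alus_per_clock": gain_percent(new["alus"], old["alus"]),
        "alu_throughput": gain_percent(new["alus"] * new["clock_mhz"],
                                       old["alus"] * old["clock_mhz"]),
        "bandwidth": gain_percent(new["bandwidth_gbs"], old["bandwidth_gbs"]),
        "fillrate_per_clock": gain_percent(new["rops"], old["rops"]),
        "fillrate": gain_percent(new["rops"] * new["clock_mhz"],
                                 old["rops"] * old["clock_mhz"]),
    }


for name, value in theoretical_gains(HAWAII, FIJI).items():
    print(f"{name}: {value:+.1f}%")
```

With these assumed figures the per-clock ALU ratio lands near the 45% mentioned above, the clock bump and the bandwidth figure push the paper gains somewhat higher, and the per-clock fillrate gain is zero. None of this says which of those limits actually applies in games.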

silent_guy · Jun 24, 2015

The only thing that supports the argument that fill rate is more important than we think it is, is the fact that Nvidia doubled the amount of ROPs and thought it was worth spending the area. One way or the other, they must have been on to something, but I can't explain it.

CarstenS · Jun 24, 2015

Maybe it's not strictly fill rate, but that they need to get data in and out of L2 and have to go through ROPs?

fellix · Jun 24, 2015

I wonder if it would have been a better decision for AMD to re-balance Fiji's architecture, fit two more setup pipes and bump the ROP count to 96, at the expense of slightly fewer multiprocessors?
Looking at Tonga's die, a single setup pipe takes roughly the same area as a CU. So, four fewer CUs (64 -> 60) would balance quite well against two more setup pipes and eight additional ROP clusters, which would definitely utilize the HBM throughput more rationally and would be more "visible" for the purpose of high-resolution gaming/benchmarking.

3dilettante · Jun 24, 2015

fellix said:
Looking at Tonga's die, a single setup pipe takes roughly the same area as a CU.

Which block do you consider to be a setup pipe on that die shot?

I'm curious if someone has an idea of what operations could be making the Techreport's fillrate benchmarks act the way they do for Fiji.
It looks like there's a ceiling, and from all appearances that memory bus is eager to give way more bandwidth than those ROPs can use.
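
A back-of-the-envelope Python sketch of that imbalance: how much memory traffic the ROPs could generate at peak versus what the bus offers. The 64 ROPs, 1050 MHz and 512 GB/s figures are the commonly published Fury X specs and are assumptions here. The model also assumes one pixel per ROP per clock for every format, no colour compression and no caching, which real hardware does not guarantee.

```python
def fill_bandwidth_gbs(rops, clock_mhz, bytes_per_pixel, blend=False):
    """Peak colour traffic in GB/s if every ROP outputs one pixel per clock.

    Blending is modelled as one read plus one write per pixel.
    """
    pixels_per_second = rops * clock_mhz * 1e6
    bytes_moved = bytes_per_pixel * (2 if blend else 1)
    return pixels_per_second * bytes_moved / 1e9


MEMORY_BANDWIDTH_GBS = 512  # assumed Fury X figure

cases = {
    "RGBA8 write": (4, False),
    "RGBA8 blend": (4, True),
    "FP16 write": (8, False),
    "FP16 blend": (8, True),
}
for label, (bpp, blend) in cases.items():
    need = fill_bandwidth_gbs(64, 1050, bpp, blend)
    print(f"{label}: {need:.1f} GB/s of {MEMORY_BANDWIDTH_GBS} GB/s available")
```

Under these assumptions a plain 32-bit write test can only ask for about half of the available bandwidth, so a ceiling set by the ROPs rather than by memory would be consistent with the observation above. Wider formats and blending change that picture.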

gamervivek · Jun 24, 2015

Where exactly are you thinking that it would end up if there were no ceiling? Need to bring back that Vantage bench instead of the 'fancy beyond3d suite'. Maybe Anandtech would do it in their synthetics bench.

Ok they have updated their GPU benches but there is still no review out. Fury does better at vantage pixel fill than a 980Ti.

http://www.anandtech.com/bench/product/1496?vs=1513

fellix · Jun 24, 2015

3dilettante said:
Which block do you consider to be a setup pipe on that die shot?

I'm curious if someone has an idea of what operations could be making the Techreport's fillrate benchmarks act the way they do for Fiji.
It looks like there's a ceiling, and from all appearances that memory bus is eager to give way more bandwidth than those ROPs can use.

Green outline -- setup block;
Cyan outline -- a single CU;

3dilettante · Jun 24, 2015

I was considering the possibility that the block of what appears to be SRAM in the upper third of the green block was an L2 section.

mczak · Jun 24, 2015

silent_guy said:
The only thing that supports the argument that fill rate is more important than we think it is, is the fact that Nvidia doubled the amount of ROPs and thought it was worth spending the area. One way or the other, they must have been on to something, but I can't explain it.

Don't forget, Nvidia also has slow ROPs for certain operations (like fp16 or, worse, fp32 blend), which isn't the case for AMD. So doubling the number of ROPs essentially fixed that problem without making the ROPs themselves more complex. If the ROPs are reasonably cheap, making sure you are unlikely to ever be limited by them (rather than by bandwidth) might make sense. Nvidia also has the rasterizer throughput to make good use of them, which they didn't in previous chips: 4x16 pixels on gm204, and the same holds for the other gm2xx chips, so 6x16 on gm200 with 96 ROPs. AMD probably would not (4x32 pixels might not be all that helpful, and I don't get the impression that scaling up to 8x16 with the current gcn architecture would be particularly effective or feasible).
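
The rasterizer-versus-ROP balance can be sketched in a few lines of Python. The gm204 and gm200 rows use the figures from the post; the 64 ROPs on gm204, the Fiji row (4x16 with 64 ROPs) and the 96-ROP Fiji variant are assumptions for illustration. The model is deliberately crude: it ignores MSAA, blend rates and anything else that lets ROPs do more than one rasterized pixel's worth of work.

```python
def pixel_rate_per_clock(rasterizers, pixels_per_rasterizer, rops):
    """Return (usable pixels/clock, rasterizer pixels/clock) in a naive model
    where each ROP consumes at most one rasterized pixel per clock."""
    raster_rate = rasterizers * pixels_per_rasterizer
    return min(raster_rate, rops), raster_rate


CONFIGS = {
    "gm204": (4, 16, 64),
    "gm200": (6, 16, 96),
    "Fiji (assumed 4x16)": (4, 16, 64),
    "Fiji with 96 ROPs (hypothetical)": (4, 16, 96),
}
for name, (rasterizers, width, rops) in CONFIGS.items():
    usable, raster_rate = pixel_rate_per_clock(rasterizers, width, rops)
    print(f"{name}: raster {raster_rate}/clk, ROPs {rops}/clk, usable {usable}/clk")
```

In this model the hypothetical 96-ROP Fiji is still capped at 64 pixels per clock by the rasterizers, which is the point being made about the extra ROPs needing matching front-end throughput.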
