AMD: Pirate Islands (R* 3** series) Speculation/Rumor Thread

BTW:

[Image: iDMd7b5.png]


Despite everything, it's always good to see an IHV acknowledging that gamers aren't dead after all. :)

Surely that's a piss take created by someone. I'm thinking Pitcairn specifically but it applies nearly as well to the entire range.
 
a passive converter is $25. Active converters are going to be quite a bit more expensive...
Bizlink/Accell HDMI 1.4 converters (300 MHz clock) come both as "active" units with protocol conversion (Eyefinity certified) and as "passive" units with Dual-Mode Type 2 support (i.e. they work with DisplayPort++ ports that emit the HDMI signal directly from the video card), and they all sell for $25-30; both seem to be enabled by the latest ParadeTech ICs. The "1.1" and "1.2" versions appear to be feature-identical. They also have "passive" converters that should only work with Dual-Mode Type 2 DisplayPort++ ports.
I'd guess their upcoming HDMI 2.0 converters will be similarly inexpensive.

More in this forum post: http://forums.guru3d.com/showthread.php?p=5100304#post5100304
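As a rough sanity check on why the current 300 MHz parts stop where they do and why HDMI 2.0 needs a new adapter generation (the total timings below are the standard CEA-861 4K ones, used here as an assumption):

[code]
# Pixel clock = total horizontal x total vertical x refresh rate.
def pixel_clock_mhz(h_total, v_total, refresh_hz):
    return h_total * v_total * refresh_hz / 1e6

print(pixel_clock_mhz(4400, 2250, 30))  # ~297 MHz: fits a 300 MHz adapter (4K30)
print(pixel_clock_mhz(4400, 2250, 60))  # ~594 MHz: needs an HDMI 2.0 class adapter (4K60)
[/code]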
 
Wow, the performance boost in Witcher 3 with HairWorks on is no less than 50%:


[Image: 0enj2iX.gif (Witcher 3 HairWorks benchmark chart)]
I'm pretty sure that with the new drivers, this now pathetic-looking red line would be somewhere up there between the other two lines.

Interesting. If there were hardware changes, they should show up in all benchmarks; I wonder what the drivers are doing to get the increased tessellation performance.
It would be interesting to see whether this is only in Witcher 3 or in other non-GameWorks titles as well.
Maybe the drivers finally got some deserved love? I think Witcher 3 specifically tessellates Geralt's hair quite heavily. There has been a tweak around for a while that limits the tessellation factor to 8 or so in Catalyst Control Center in order to improve performance.
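As a rough illustration of why capping the factor helps (purely illustrative scaling assumptions, not HairWorks' actual math):

[code]
# Back-of-the-envelope for how the tessellation factor scales the workload.
# Assumed scaling only: isoline domains (hair strands) grow roughly linearly
# with the factor, surface patches roughly with its square.
for factor in (8, 16, 32, 64):
    print(f"factor {factor:2d}: ~{factor:2d}x segments per strand, "
          f"~{factor * factor:4d} tris per surface patch")
# A driver cap of 64 -> 8 would cut those by ~8x and ~64x respectively.
[/code]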
 
I'd guess their upcoming HDMI 2.0 converters will be similarly inexpensive.
But do those converters also enable HDCP 2.2 for 4K Blu-rays or streaming video? I don't care that much about the protocol my graphics card and display use to talk to each other; in fact I don't care at all, as long as I don't miss out on image quality (D-Sub 15), content (HDCP, sadly; I wish there were no DRM), or refresh rate (DP 1.2).
 
But do those converters also enable HDCP 2.2 for 4K Blu-rays or streaming video?
UHD Blu-ray is out of the question, as HDCP has to be supported throughout the complete signal decoding chain on the PC, including the Blu-ray media, stream decoding, and uncompressed video out. For that you will need DisplayPort 1.3 and updated Dual-Mode "passive" adapters that support HDMI 2.0 clocks.

Does video streaming really require HDCP, though?
 
I suppose I was approaching the contention issue in a design scenario where occupancy issues caused both the rasterizer and export bus to be underutilized in smaller CU counts. Raising the CU count could eventually reduce occupancy constraints to the point that enough wavefronts need enough export cycles that the underutilization goes away.
Fiji is basically four shader engines versus Tahiti's two. The rasteriser-to-shader-to-export-to-RBE-to-memory ratios are fundamentally the same in both when normalised per shader engine.

So the right kinds of tests with these two chips could reveal useful patterns - with the jokers being delta colour compression, MC granularity and bus architecture.

The ROPs came to my mind first because their execution process is linked to moving tiles in and out from memory, which would not scale with overclocking the core and would exert back pressure to wavefronts trying to export. My assumption, perhaps incorrect, was that the rasterizer's end of the process has a higher likelihood of sourcing its data from on-chip and so could benefit from higher GPU clocks relative to the more memory-heavy ROPs, leading to the rasterizer stage and the CU array both waiting with ready outputs for the ROPs to catch up and free up export buffers.
Yes, I see that mechanism.

Is it notably different from Tahiti in its behaviour? Will the jokers completely obfuscate the differences we find?
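To make the back-pressure mechanism above a bit more concrete, here is a minimal toy model; every rate in it is a made-up number for illustration, not a measurement of either chip:

[code]
# Toy model: the CUs/rasterizer produce export transactions at a rate tied to
# core clock, the ROPs drain them at a rate tied to memory bandwidth, and a
# bounded export buffer sits between. All rates are invented for illustration.
def simulate(core_clock_ghz, drain_rate, buffer_slots=16, cycles=10_000):
    produce_rate = 0.5 * core_clock_ghz  # exports offered per cycle
    buffered = produced = stalled = 0.0
    for _ in range(cycles):
        buffered -= min(buffered, drain_rate)              # ROPs retire work
        accepted = min(produce_rate, buffer_slots - buffered)
        buffered += accepted                               # exports that fit
        produced += accepted
        stalled += produce_rate - accepted                 # back-pressure on CUs
    return produced, stalled

for clk in (1.0, 1.2, 1.5):  # raising core clock against a fixed drain rate
    done, stalled = simulate(core_clock_ghz=clk, drain_rate=0.55)
    print(f"core {clk} GHz: exported {done:.0f}, stalled {stalled:.0f}")
[/code]

Past the point where the offered rate exceeds the drain rate, extra core clock only grows the stalled column, which is the kind of pattern the right tests on these two chips might expose.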
 
Isn't it really weird that Fiji didn't double the geometry engines of Tonga, since it's basically 2xTonga in almost everything else but the memory controller?

Doesn't this look like AMD is asking to get their ass handed to them in Gameworks titles with excessive geometry everywhere?
 
Isn't it really weird that Fiji didn't double the geometry engines of Tonga, since it's basically 2xTonga in almost everything else but the memory controller?

Doesn't this look like AMD is asking to get their ass handed to them in Gameworks titles with excessive geometry everywhere?


It could have been a great donga, but it's really just a bonga.

Dude.
 
Isn't it really weird that Fiji didn't double the geometry engines of Tonga, since it's basically 2xTonga in almost everything else but the memory controller?

Doesn't this look like AMD is asking to get their ass handed to them in Gameworks titles with excessive geometry everywhere?

According to Techreport, the geometry engines have at least been changed compared to Tonga to increase throughput:
http://techreport.com/review/28499/amd-radeon-fury-x-architecture-revealed/2

Perhaps GCN geometry engines are quite power and/or space intensive. It could help explain Tonga's disappointing fps/mm2 and fps/watt improvement over Tahiti.
AMD was already pushing the reticle limit with Fiji so something had to give.
 
According to Techreport, the geometry engines have at least been changed compared to Tonga to increase throughput:
http://techreport.com/review/28499/amd-radeon-fury-x-architecture-revealed/2

Perhaps GCN geometry engines are quite power and/or space intensive. It could help explain Tonga's disappointing fps/mm2 and fps/watt improvement over Tahiti.
AMD was already pushing the reticle limit with Fiji so something had to give.

Q: Is that firmware/driver-related?
 
Isn't it really weird that Fiji didn't double the geometry engines of Tonga, since it's basically 2xTonga in almost everything else but the memory controller?

Doesn't this look like AMD is asking to get their ass handed to them in Gameworks titles with excessive geometry everywhere?

Didn't the same happen with the 5870 and 5770? Something similar shows up in the DX12 overhead benchmarks, where cards from the same family end up at parity. The bigger problem, in light of Tonga's improvements and the drivers supposedly boosting the 390X (the HardOCP review shows impressive improvements over the 290X), would be the ROPs, where AMD is again at a deficit, as with the 7970 vs. GK100.

As for the 390X, at least one reviewer who is not AMD themselves (so much for transparency) shows it performing near the 980 Ti in a few games.

GTAV - 31.3 to 29

Evolve - 39.6 to 37.7

FC4 - 37.9 to 36.4

And over 980 for most.

http://nl.hardware.info/reviews/613...et-bestaande-chips-benchmarks-alien-isolation

After reading some of the other reviews, the 390X doesn't look half as bad; part of that is perhaps because the 290X used in those reviews was the throttling reference card.
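For scale, the gaps in those figures (reading the first number of each pair as the 980 Ti and the second as the 390X, in fps, which is how I take them):

[code]
# Percentage gaps from the numbers quoted above (first = 980 Ti, second = 390X).
results = {"GTA V": (31.3, 29.0), "Evolve": (39.6, 37.7), "FC4": (37.9, 36.4)}
for game, (ti, r390x) in results.items():
    print(f"{game}: 390X within {100 * (ti - r390x) / ti:.1f}% of the 980 Ti")
[/code]

That puts it within roughly 4-7% in those three titles.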
 
UHD Blu-ray is out of the question, as HDCP has to be supported throughout the complete signal decoding chain on the PC, including the Blu-ray media, stream decoding, and uncompressed video out. For that you will need DisplayPort 1.3 and updated Dual-Mode "passive" adapters that support HDMI 2.0 clocks.

Does video streaming really require HDCP, though?
https://help.netflix.com/en/node/6662
FWIW, I used a 7950 with Netflix with no problems over HDMI to my Sony XBR6. Oddly enough, I had an issue after switching to a GTX 970. It was puzzling, and all I can remember is that I took a shotgun approach, tried everything, and then it was working. I should have noted what did the trick at the time, since posting it would be useful for Google users looking for an answer.
Edit: It could have been almost anything, as I'd just reinstalled Windows and was still setting things up.
 
Didn't the same happen with the 5870 and 5770? Something similar shows up in the DX12 overhead benchmarks, where cards from the same family end up at parity. The bigger problem, in light of Tonga's improvements and the drivers supposedly boosting the 390X (the HardOCP review shows impressive improvements over the 290X), would be the ROPs, where AMD is again at a deficit, as with the 7970 vs. GK100.

As for the 390X, at least one reviewer who is not AMD themselves (so much for transparency) shows it performing near the 980 Ti in a few games.

GTAV - 31.3 to 29

Evolve - 39.6 to 37.7

FC4 - 37.9 to 36.4

And over 980 for most.

http://nl.hardware.info/reviews/613...et-bestaande-chips-benchmarks-alien-isolation

After reading some of the other reviews, the 390X doesn't look half as bad; part of that is perhaps because the 290X used in those reviews was the throttling reference card.
IIRC the 5770 had the same tessellation engine as the 5870 (it was a factor in my buying one). I think everything else, including the memory bus, was exactly half. Definitely not positive about that. :)
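From memory, so treat the figures below as approximate rather than authoritative, the halving looks like this, with the tessellator being the one unit that was not halved:

[code]
# HD 5870 (Cypress) vs HD 5770 (Juniper), figures from memory only.
specs = {
    #           SPs,  TMUs, ROPs, bus bits, tessellators
    "HD 5870": (1600,  80,   32,   256,      1),
    "HD 5770": ( 800,  40,   16,   128,      1),
}
labels = ("SPs", "TMUs", "ROPs", "bus width", "tessellators")
for label, big, small in zip(labels, specs["HD 5870"], specs["HD 5770"]):
    print(f"{label}: {big} vs {small} (ratio {big / small:g})")
[/code]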
 
Fiji is basically four shader engines versus Tahiti's two. The rasteriser-to-shader-to-export-to-RBE-to-memory ratios are fundamentally the same in both when normalised per shader engine.
Tahiti had a limited crossbar between each RBE and a few other memory channels in order to mate them with its mismatched bus width. Hawaii dispensed with it.
Besides the extra bandwidth, it was stated that the extra flexibility allowed ROPs to perform work if one of their associated memory controllers was busy.
http://techreport.com/review/22192/amd-radeon-hd-7970-graphics-processor/4

I'm curious whether Fiji would bring that back. The bus width is a power of two, but the ratio of ROPs to channels is lower than it has been since Tahiti. Granted, the apparently unused portion of the memory PHY on Tonga's die might be why that pattern held longer than it should have, but that also might mean Tonga has some kind of crossbar, or 16 dead ROPs, since turning an interface off doesn't wipe away the silicon.
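For a rough sense of that ratio (counting 32-bit GDDR5 channels and 128-bit HBM1 channels; the actual grouping into memory controllers differs per chip, so this is only a sketch of the trend):

[code]
# Rough ROP-to-memory-channel ratios. Channel widths are assumptions:
# 32-bit for GDDR5, 128-bit for HBM1 (4 stacks x 8 channels on Fiji).
chips = {
    #          ROPs, bus bits, bits per channel
    "Tahiti": (32,   384,      32),
    "Hawaii": (64,   512,      32),
    "Tonga":  (32,   256,      32),   # as shipped; the die may carry a wider PHY
    "Fiji":   (64,   4096,     128),
}
for name, (rops, bus, per_chan) in chips.items():
    channels = bus // per_chan
    print(f"{name}: {rops} ROPs / {channels} channels = {rops / channels:.2f}")
[/code]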

Is it notably different from Tahiti in its behaviour? Will the jokers completely obfuscate the differences we find?

I am curious where the compression hardware sits in the process. The least disruptive option would be to have it on the path between the ROPs and the memory controllers (possibly in the controllers themselves?), although what that does to a possible memory crossbar is unclear.
The compression and decompression process is mostly intended to save bus accesses, although that may not do much good for an individual tile export or import from the ROP caches, since those still have to be processed and an extra DRAM burst or two is a handful of cycles at most.
The downside is that the ROP caches would hold uncompressed data, so their hit rates would not improve. HBM's latency could be better than before, but the dominant factor is the DRAM arrays, which have not changed much.
So individual ROPs that work by burning bandwidth to hide latency, in a loop tuned to match their storage and processing capacity, would have the same capacity to hide latency on a potentially worse latency basis, despite the bandwidth being saved. Having more ROPs might mean more buffers and caches, and thus more aggregate latency hiding, although if the compression hardware is closer to the controllers, that could become a bottleneck.
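To put rough numbers on "an extra DRAM burst or two is a handful of cycles" versus the array latency that dominates, here is an assumed-figure illustration (none of these values come from AMD):

[code]
# Why compression saves bandwidth but not much per-tile latency.
# All figures are assumptions chosen for illustration.
tile_bytes = 8 * 8 * 4            # an 8x8 tile of 32-bit colour = 256 bytes
channel_bytes_per_clk = 32        # e.g. a 128-bit DDR channel moves 32 B/clk
array_latency_clks = 50           # DRAM array access latency, same clock domain

transfer_clks = tile_bytes / channel_bytes_per_clk    # 8 clocks on the bus
compressed_clks = transfer_clks / 2                   # with 2:1 compression

print(f"uncompressed tile: {array_latency_clks + transfer_clks:.0f} clocks "
      f"({transfer_clks:.0f} on the bus)")
print(f"2:1 compressed:    {array_latency_clks + compressed_clks:.0f} clocks "
      f"({compressed_clks:.0f} on the bus)")
# Bus occupancy halves, which helps aggregate bandwidth, but the per-tile
# latency barely moves because the array access dominates.
[/code]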
 
Counting geometry engines is not enough to explain the differences until AMD publishes the size of the parameter caches, and even the size of Tonga's L2; if Carrizo is any clue, it could be 2 MB, like the rumoured Fiji.
 