Alessio1989
Regular
Tonga XT VS Polaris 10 PRO: http://www.bitsandchips.it/9-hardware/7334-tonga-vs-polaris-sfida-clock-to-clock
Very interesting comparison - good to see it finally going live. I'm a bit sad though, because from this set of tests the overall improvements over Tonga seem rather small.
How often have we said this when AMD launched a new GPU? Maybe too often already.
I figured it's time to start diverging from the other mess of a thread, and this is an interesting subject.
Computerbase.de has made a comparison between a Tahiti, a Tonga and a Polaris 10 GPU, all with the same number of CUs enabled and the same clocks.
https://www.computerbase.de/2016-08...ormance/#abschnitt_gcn_1_3_und_4_im_vergleich
This is basically GCN1 vs. GCN3 vs. GCN4, in the form of an R9 280X, an R9 380X and an RX 470.
In some games the difference is pretty negligible (e.g. Dirt Rally, Talos Principle), but in others the performance boost is pretty huge.
In general, the games that have gotten the largest boosts from the architectural improvements are the GameWorks games, which tend to push geometry as far as Maxwell cards can go.
But there are games like Ashes of the Singularity that have sizable performance boosts too. Maybe the support for a larger number of compute queues in GCN3 and the HWS units in GCN4 are making a difference when async compute is being used.
I wonder what their source is for Tonga having a 512kB L2 cache. AMD never published this information at the time of Tonga's release.
Even the quadrupled L2 size with double the throughput still barely hints at any tangible performance benefit. And yes, I know, GCN still relies on its proprietary global data share for syncing on top of the dedicated color/depth caches. The faster primitive occlusion and tessellation also don't show much contribution, despite the roomier L2 holding the spillovers to global memory.
Did these all arrive at the same price points when entering the market?
Also, in today's market, are they competitive in price points -- err I mean if you were to buy them new?
Lol no worries. I do appreciate the OT. I was just wondering how they were priced relative to each other. This is a good showcase of architecture.
Both Tahiti and Tonga went through different PCBs, core clocks, memory clocks, etc. during their lifetimes.
Moreover, I think none of the cards in the article are running the clocks they were shipped with, so such a comparison wouldn't make much sense.
The reviewers clocked all cards the same in order to get the same compute and fillrate throughputs, in an effort to compare the architectures and not their position in the market.
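To put numbers on why equal clocks level the playing field, here is a rough sketch of the theoretical peaks. It assumes 32 CUs and 32 ROPs for all three cards (their retail configurations as far as I know) and a placeholder 1000 MHz clock, which is not necessarily the clock the reviewers picked. The function names are made up for illustration.

```python
# Theoretical peak rates for a GCN card. The 1000 MHz clock is a placeholder,
# not the value used in the article; 32 CUs / 32 ROPs is an assumption.
LANES_PER_CU = 64    # a GCN CU has 4 SIMDs x 16 lanes
FLOPS_PER_LANE = 2   # one fused multiply-add per lane per clock


def peak_gflops(cus, clock_mhz):
    """Peak single-precision throughput in GFLOPS."""
    return cus * LANES_PER_CU * FLOPS_PER_LANE * clock_mhz / 1000.0


def peak_gpixels(rops, clock_mhz):
    """Peak pixel fillrate in Gpixels/s, one pixel per ROP per clock."""
    return rops * clock_mhz / 1000.0


cards = {
    "R9 280X (GCN1)": (32, 32),
    "R9 380X (GCN3)": (32, 32),
    "RX 470 (GCN4)": (32, 32),
}

CLOCK_MHZ = 1000  # placeholder shared clock
for name, (cus, rops) in cards.items():
    print(name, peak_gflops(cus, CLOCK_MHZ), peak_gpixels(rops, CLOCK_MHZ))
```

With identical inputs, all three cards come out at the same paper figures, so whatever differences show up in the benchmarks have to come from the architecture (front end, caches, compression) rather than raw ALU or ROP counts.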
That said, I was hoping for this thread to be about discussing what architectural improvements between GCN1-4 have resulted in substantial differences such as the ones we're seeing with Witcher 3 and Ashes of the Singularity.
Well, ISA-wise nothing changed between Tonga and Polaris, just the front end.
The "good guess" isn't good enough. 1MB would be just as good a guess (why not say it's a doubled Bonaire instead of a Hawaii derivative). 512kB, 1MB, 768kB, 1.5MB (though supposedly for the latter two options part of it would be deactivated) are all options I've seen mentioned somewhere for Tonga - all are good guesses but only one is right...
It's a good guess, I think. Hawaii packs 1MB (8 memory controllers, 128kB partition), Tahiti came with 768kB (6 controllers), so it's logical that Tonga keeps the same amount of L2 SRAM per partition.
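The per-partition arithmetic behind the "good guess" is easy to check. A quick sketch follows, using only the Hawaii and Tahiti figures quoted above. The controller counts tried for Tonga (4 for the 256-bit bus it shipped with, 6 for a hypothetical full 384-bit die) are assumptions, and `guess_l2_kb` is a made-up helper, so this only shows which totals the guess implies, not which one is right.

```python
# L2 size per memory-controller partition, from the figures quoted in the thread.
known = {
    "Hawaii": (1024, 8),  # 1MB total L2, 8 memory controllers
    "Tahiti": (768, 6),   # 768kB total L2, 6 memory controllers
}

per_partition_kb = {
    name: total_kb // controllers
    for name, (total_kb, controllers) in known.items()
}
print(per_partition_kb)


def guess_l2_kb(controllers, kb_per_partition=128):
    """Total L2 if the chip keeps the same SRAM per partition (an assumption)."""
    return controllers * kb_per_partition


# Assumed controller counts for Tonga, not confirmed by AMD.
for controllers in (4, 6):
    print(controllers, "controllers ->", guess_l2_kb(controllers), "kB")
```

Both known chips work out to 128kB per partition, so 4 controllers would give 512kB and 6 would give 768kB. The 1MB and 1.5MB options would need 256kB per partition instead, which is why the per-partition figure alone can't settle it.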
Though I don't know if that's either good or depressing (or both). On one hand, the newer cards aren't hurting as much with GameWorks titles. On the other, AMD is spending R&D resources to counter GameWorks in their own hardware. This has to be frustrating as all multiplatform games are console ports coming from a more compute-centric GCN1/2 architecture in the first place.
Up next are expected to be the ROPs, where Nvidia are now ahead of them. As well as becoming more BW and power efficient, they would also appear to need more of them. They have been limited to a maximum of 16 per shader engine.
AMD's performance per ROP (and per unit of fillrate) is far ahead of NVidia, in games.
AMD's real problem is not doing tile-binned rasterisation. Doing that would have the side effect of "making the ROPs more efficient", but in truth wouldn't make any difference. Not doing work on a triangle you know will be overwritten is a win.
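As a toy illustration of the "not doing work on a triangle you know will be overwritten" point, here is a sketch that counts shaded pixels on a single 1D row of one tile. It is a deliberately simplified model under my own assumptions (opaque spans, constant depth per span, smaller z is nearer) and not how any real GPU implements binning. Both function names are hypothetical.

```python
INF = float("inf")


def immediate_shaded(spans, width):
    """Shade in submission order, with a depth test against what is already drawn."""
    depth = [INF] * width
    shaded = 0
    for x0, x1, z in spans:
        for x in range(x0, x1):
            if z < depth[x]:
                depth[x] = z
                shaded += 1  # this pixel may still be overwritten later
    return shaded


def binned_shaded(spans, width):
    """Resolve visibility for the whole bin first, then shade each covered pixel once."""
    depth = [INF] * width
    for x0, x1, z in spans:
        for x in range(x0, x1):
            depth[x] = min(depth[x], z)
    return sum(1 for d in depth if d != INF)


# Three overlapping spans (x0, x1, z) submitted back to front, the worst case.
back_to_front = [(0, 8, 3.0), (0, 8, 2.0), (2, 6, 1.0)]
print(immediate_shaded(back_to_front, 8))        # 20
print(binned_shaded(back_to_front, 8))           # 8
print(immediate_shaded(back_to_front[::-1], 8))  # 8
```

In this model the binned path does the same work regardless of submission order, while the immediate path only matches it when the geometry happens to arrive front to back.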
The irony is that with clustered geometry/occlusion algorithms, the need to do tile-binned rasterisation disappears. AMD could re-architect for this just in time for all the advanced engines to do a better job themselves.
Outside the things AMD once said nobody needed (more geometry and tessellation power), I think much of the improvement comes from more cache and better colour compression, so more effective bandwidth. It is quite interesting that the biggest gain comes from improved geometry power in the CB tests. I am more and more inclined to think that GCN was not a good architecture for DX11.