AMD: Pirate Islands (R* 3** series) Speculation/Rumor Thread

hm... I kind of just figured they took Tahiti and upgraded it to GCN1.2.

Tonga has 700M more transistors than Tahiti. That's roughly the equivalent of a full G92 GPU.
I'd have a hard time believing that color compression + Audio DSPs + 2 extra tessellation engines + 6 ACEs + other optimizations would take 700M transistors. Even more so considering how the crossbar was apparently taken away.
 
The crossbar wouldn't be needed if there were 48 ROPs. It existed in Tahiti because of the extra memory channels.
The most recent GCN revisions have been described as capping the number of render back ends per shader engine to 16, and it doesn't look like the shot of Tonga has a third engine.
Are there an extra 8 blocks for ROPs in the shot? I think those might be the row of paired rectangles (depth and color?) that line up alongside the CUs to the left and right of the CU array.
hm... zoom and enhance XD
http://i.imgur.com/Q5MVBsz.jpg
 
Tonga has 700M more transistors than Tahiti. That's roughly the equivalent of a full G92 GPU.
I'd have a hard time believing that color compression + Audio DSPs + 2 extra tessellation engines + 6 ACEs + other optimizations would take 700M transistors. Even more so considering how the crossbar was apparently taken away.
Most likely there is just a lot of different power control circuitry that may or may not be enabled. The move from Kaveri to Carrizo had a huge increase in transistor count too without increasing the core configuration significantly.
 
Yet I failed to mention that from Tahiti to Tonga the DP rate went from 1/4 to 1/16 of the SP rate (SP:DP of 4:1 to 16:1). Tonga has 1/4th the per-CU DP output of Tahiti.

Plus, Carrizo brought an embedded Southbridge, new CPU cores with 15% higher IPC, AVFS, a secure Processor with an ARM core + cache + ROM inside and there's a chance the chip needs hardware to comply with both DDR3 and DDR4 memory (the same chip will probably be used for Bristol Ridge).
I don't really think the Kaveri -> Carrizo transition is comparable.
 
The CPU cores are smaller than before. The memory controller isn't larger. The ARM core is insignificant. A billion more transistors just for memory bus and southbridge?
 
The most recent version of GCN did increase the complexity of the scalar unit's pipeline, since it supports scalar writes, and it brought various other updates that could offset the reduction in DP throughput at the CU level. That may apply to both the Tahiti/Tonga and Kaveri/Carrizo comparisons.
Carrizo's L2 caches are lower in capacity, but the x86 cores themselves should be relatively larger due to increases in L1 capacity and general tweaks to the pipeline.

One possible change is the introduction of HDL (high-density libraries) on the APU side. I am not clear whether a similar change in implementation has occurred on the discrete side, although it's been long enough that there should have been at least iterative changes in the implementation of the designs.
That might lead to a change in terms of how units are implemented with more dense layout.

Another maybe that can compound the HDL/implementation evolution factors is a question of counting methodology:
http://www.anandtech.com/show/7003/the-haswell-review-intel-core-i74770k-i54560k-tested/5
The two numbers for the most common Haswell configuration, Haswell GT2 4C, are 1.4 billion schematic transistors and 1.6 billion layout transistors. Why and what is the difference? The former count is the number of transistors in the schematic (hence the name), and is generally the number we go by when quoting transistor counts. Meanwhile the second number, the layout number, is the number of transistors used in the fabrication process itself. The difference comes from the fact that while the schematic will use one large transistor – being a logical diagram – production will actually use multiple transistors laid out in parallel for layout and process reasons.
This is a decent amount of error margin on top of design changes and new integrated hardware. It has not been clear all the time which count was being used, and HDL might increase the number of process transistors needed for a given schematic count.
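To put a rough number on that error margin, here is a minimal sketch in Python. The 1.4B/1.6B figures are the Haswell ones quoted above. The Tahiti and Tonga totals (about 4.3B and 5.0B) are commonly quoted figures that I am plugging in as assumptions, and the idea that the two chips were counted with different methods is purely hypothetical, as are the function names.

```python
# Haswell GT2 4C counts from the AnandTech excerpt quoted above.
HASWELL_SCHEMATIC = 1.4e9
HASWELL_LAYOUT = 1.6e9

# Assumption: commonly quoted totals, not stated in this thread.
TAHITI_COUNT = 4.3e9
TONGA_COUNT = 5.0e9


def layout_overhead(schematic: float, layout: float) -> float:
    """Fraction by which the layout count exceeds the schematic count."""
    return layout / schematic - 1.0


def delta_bounds(old_count: float, new_count: float, overhead: float):
    """Return (low, high) estimates of the transistor delta.

    high: both counts use the same methodology, so the delta is a plain
          subtraction.
    low:  hypothetical worst case where new_count is a layout count and
          old_count is a schematic count, so new_count is first scaled
          back by the overhead before subtracting.
    """
    high = new_count - old_count
    low = new_count / (1.0 + overhead) - old_count
    return low, high


overhead = layout_overhead(HASWELL_SCHEMATIC, HASWELL_LAYOUT)
low, high = delta_bounds(TAHITI_COUNT, TONGA_COUNT, overhead)

print(f"layout overhead on Haswell: {overhead:.1%}")
print(f"Tahiti -> Tonga delta: {low / 1e6:.0f}M to {high / 1e6:.0f}M")
```

With Haswell's roughly 14% overhead, a mixed-methodology comparison alone could swallow most of the 700M gap. That says nothing about whether AMD actually mixed methods, or whether a synthesised GPU would show the same overhead; it only shows how wide the error bars could get.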

On top of that, let's note that AMD has a history of widening the error bars further:
http://techreport.com/news/22100/amd-corrects-muffed-bulldozer-transistor-count
 
I'd have a hard time believing that color compression + Audio DSPs + 2 extra tessellation engines + 6 ACEs + other optimizations would take 700M transistors.
That's true. As for delta color compression AMD states "5-7% improvement on games for modest silicon area (0.2%)" for Carrizo.
 
Another maybe that can compound the HDL/implementation evolution factors is a question of counting methodology:
http://www.anandtech.com/show/7003/the-haswell-review-intel-core-i74770k-i54560k-tested/5

This is a decent amount of error margin on top of design changes and new integrated hardware. It has not been clear all the time which count was being used, and HDL might increase the number of process transistors needed for a given schematic count.
GPUs are fully synthesised. Is that description of schematic/layout at all relevant to a synthesised implementation?
 
The CPU cores are smaller than before. The memory controller isn't larger. The ARM core is insignificant. A billion more transistors just for memory bus and southbridge?

The CPU cores and the memory controller are smaller because they're denser, not because they have fewer transistors. We were discussing transistor count. Carrizo's area is actually very similar to Kaveri's. A Cortex A5 + cache + ROM + glue logic isn't that insignificant. The difference between Carrizo and Kaveri isn't 1 billion transistors, it's less than 700 million.
 
In what should be another indicator that Tonga will have its lifetime extended into 2016 and beyond, AMD just launched their GPU virtualization FirePro cards and they both use that chip.
One of the cards uses a dual-Tonga configuration. This probably won't come to the home market, but a ~$350 dual-Tonga card would certainly be a cheap and compact solution for VR.
 
Does anyone know what AMD is doing with the fully enabled chips for the 360 and 370? They still don't have a 360X or 370X, and both of the former cards are salvaged parts. Although the price segments for that market are very crowded, I would think AMD would rather not sell only cut-down chips if they have fully working ones.
 
An R9 370X desktop card does exist, but it is only being sold in China and Korea.
Although it is difficult to find anything other than the iMac using one, there are a few current mobile cards using a fully enabled Bonaire chip: R9 M380, R9 M385, R9 M385X, FirePro M6100.
 