Might be a repost, but I didn't see it
That would be highest res of the Tahiti die I've seen so far?
Though looks like they did less than perfect job scraping layers
Quite reliable source said that the $400 7870 & $280 7850 prices are fake
The latest beast in the graphics card industry
Tearing down a graphics card is not our normal bailiwick because images of the board are often available on any number of popular review sites like Tom’s Hardware and AnandTech. In this case, after spending some time looking at the card, we determined that there are some impressive silicon stories that would interest our readers in the semiconductor and electronics space.
For those more casually interested in teardowns, first a bit of back story on graphics cards. If you don’t pay much attention to this type of technology, these things are typically used by high-end gamers who consider speed and quality to be a visceral and competitive advantage when gaming. They take glory in the fact that they can now play games like Metro 2033 at 85 frames-per-second (fps) when before they were limited to 78 fps. By way of comparison, many games on a PlayStation 3 run with “de-tuned” graphics quality and are locked to 30 fps.
If you have been gaming on an iPad2, then this world, where products come packaged in a box featuring a picture of some kind of death knight mounted on a horse against a background of orange lightning bolts, might be weird and scary. There is no “less is more” mentality among the hard core. But the innovation is interesting too – so read on.
The silicon story
We’re going to skip over the story about the huge power requirements and the need for advanced cooling and get to the silicon story, where we will touch a bit on the massive processing capabilities in this latest AMD 7970 chip. But first, some of the design wins, starting with some of the peripheral chips cataloged:
Silicon Laboratories SL16010DC clock generator
CHiL Semiconductor power management controller
Fairchild NC7SZ74K8X D-type flip-flop
Fairchild FDMC8200 N-channel FET
OnSemiconductor MC74VHCT125H bus buffer
Coiltronics 1007R3-R15 inductors (6)
OnSemiconductor MC78M05CDTG voltage regulator
Programmable Microelectronics Corp Pm25LD010 serial flash memory
The Hynix DRAM
The core functionality is in the memory and processor. Surrounding the graphics chip are twelve 256 MB GDDR5 chips, for a total of 3 GB of graphics memory. The part number is the Hynix H5GQ2H24MFR GDDR5, which is a 2 Gb device rated at 6 Gbps per pin at 1.6 V. Twelve of them are used to give a 384 bit memory bus and memory bandwidth of 264 GB/s. The x-ray shows that they are single die 2 Gb chips, as opposed to 2 x 1 Gb, which would have been common a few months ago.
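The memory figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch: the chip count, density, bus width, bandwidth and 6 Gbps rating come from the text, while the bus-width-times-pin-rate formula and the function names are assumptions for illustration, not something Chipworks spells out.

```python
# Figures from the teardown text.
CHIPS = 12
DENSITY_GBIT = 2            # each chip is a 2 Gb device
BUS_BITS = 384              # total memory bus width
QUOTED_BANDWIDTH_GBS = 264  # GB/s, as quoted
RATED_PIN_GBPS = 6          # per-pin rating of the DRAM part


def capacity_gbytes(chips, density_gbit):
    """Total capacity in GB: chip count times density, 8 bits per byte."""
    return chips * density_gbit / 8


def bandwidth_gbytes_per_s(bus_bits, pin_rate_gbps):
    """Peak bandwidth in GB/s, assuming one data pin per bus bit."""
    return bus_bits * pin_rate_gbps / 8


def pin_rate_gbps(bus_bits, bandwidth_gbs):
    """Per-pin data rate in Gbps implied by a quoted bandwidth."""
    return bandwidth_gbs * 8 / bus_bits


print(capacity_gbytes(CHIPS, DENSITY_GBIT))                # 3.0 GB
print(BUS_BITS // CHIPS)                                   # 32 bits per chip
print(pin_rate_gbps(BUS_BITS, QUOTED_BANDWIDTH_GBS))       # 5.5 Gbps
print(bandwidth_gbytes_per_s(BUS_BITS, RATED_PIN_GBPS))    # 288.0 GB/s
```

If the formula holds, the quoted 264 GB/s corresponds to 5.5 Gbps per pin, a little under the 6 Gbps the chips are rated for.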
At the right we are showing the full die. Die size indicates that it is fabbed in Hynix’ 44-nm process, and it has the usual square block-layout format and double row of bond pads typical of a graphics DRAM, to give the higher data rates. If you are a DRAM manufacturer then you may want the full resolution version available for free here in the Chipworks Store (but you do need to go through a check out process).
Data rates are critical in a graphics board like this, so we have also shown a board image adjusted to show the tracks between the Radeon GPU and the memory chips; you can see the various routes laid out to keep the parasitic values equivalent for each separate chip so that they perform equally.
The AMD Radeon 7970
The AMD Radeon 7970 (Tahiti) is a flagship device by AMD because it is the first commercially available graphics processor fabricated at 28 nm by TSMC. At the right, we are showing the top metal (not too much to see there) and a polysilicon die photo, where you can make out the digital and analog blocks. For the higher resolution required to do layout analysis, you will need to visit the Chipworks Store and order the 35 MB version.
Using the latest 28 nm technology lets AMD squeeze 2048 shaders, organized in 32 compute units, onto the die, for a total of 4.3 billion transistors in 365 sq. mm. The chip is clocked at 925 MHz, giving a theoretical performance of ~3.8 TFLOPS, compared with ~2.7 of the previous generation Radeon 6970.
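The ~3.8 TFLOPS figure can be reproduced roughly as follows. This is a sketch under two assumptions that are not stated in the article: each compute unit holds 64 shaders (which gives 2048 across 32 compute units), and each shader retires one fused multiply-add, counted as 2 floating-point operations, per clock. The function name is made up for illustration.

```python
COMPUTE_UNITS = 32       # from the text
SHADERS_PER_CU = 64      # assumption: 64 shaders per compute unit
CLOCK_MHZ = 925          # from the text
FLOPS_PER_CLOCK = 2      # assumption: one fused multiply-add per shader per clock


def peak_tflops(shaders, clock_mhz, flops_per_clock=FLOPS_PER_CLOCK):
    """Theoretical single-precision peak in TFLOPS."""
    return shaders * flops_per_clock * clock_mhz * 1e6 / 1e12


shaders = COMPUTE_UNITS * SHADERS_PER_CU
print(shaders)                                    # 2048
print(round(peak_tflops(shaders, CLOCK_MHZ), 2))  # 3.79
```

This is a peak number only; it says nothing about sustained throughput in real workloads.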
The AMD 7970 – more than meets the eye
In order to deliver serious performance, AMD also had to consider some innovative packaging. The x-ray shows what looks like a 20 layer substrate. You can see from the board images above that the heat spreader is unusual, with cavities to allow direct contact to the cooler. The flip-chip solder ball connection to the die is also a little different – the SEM shot at right shows the extra-thick under-bump metal after the solder has been removed. Look out for a package report on this one!
ChipWorks said: At the right, we are showing the top metal (not too much to see there) and a polysilicon die photo, where you can make out the digital and analog blocks. For the higher resolution required to do layout analysis, you will need to visit the Chipworks Store and order the 35 MB version.
"The multiprocessors are obviously grouped in quads -- shared scalar and instruction caches, incl. some logic, are clearly visible between the TMUs and the compute units."
I doubt a bit that it is between the TMUs and the CUs. There is this typical box of the SIMD engines (where the logic is surrounded by the SRAM arrays for the reg files) on both sides of the shared stuff (which would mean each of these blocks is comprised of two SIMD engines). My guess for the TMUs would be that they are all oriented to the middle of the die, not the edges (and the structures there look a bit like the RV770 TMUs). The center of a CU would be a natural place for the scalar ALU, the LDS and the scheduling stuff in my opinion. Two SIMD engines are then located to either side of that. It appears a bit strange that AMD would also put the shared logic/caches for the CU groups in the middle of the CUs. But who knows? Maybe the close location of the scalar and instruction caches is necessary for the latency they were shooting for. I don't know.
jaredpace said: Might be a repost, but I didn't see it.
Finally!
Someone wanna draw coloured box outlines for what's what?