3D_world, you're out of your element. I hadn't had a good laugh for the day, so thanks for your laughable post.
Indeed... his nonsense posts brought a smile to my face....
Thanks, I am beginning to understand the advantages of the bottom logic of HMC as a standard part of the memory itself (defect mapping is a pretty good one, they could even do ECC for servers, and it could translate the interface to any future PHY). Is it plausible that the DRAM layers of HMC are practically the same thing as HBM, and cost the same to produce? I'm curious whether one has a pricing edge over the other depending on the application. I thought HMC would necessarily cost more, but if HMC allows defect mapping, the better yield for large-capacity chips would easily pay for the price of the bottom logic layer, and any logic in there would save the cost of having to implement it in the SoC (memory being so much higher volume than an SoC, any logic there would cost less).

HBM seems to be a more straightforward port of DRAM to a stacked interposer format.
The largest apparent difference that I can see is that there isn't a layer of logic at the bottom of the DRAM stack.
HMC provides controller logic and high-speed bidirectional links. A lot of other possibilities for topology, DRAM technology, and simplification of CPU and GPU memory controller logic can come with HMC.
It's possible the on-stack logic can effect some form of repair or defect compensation that HBM may not.
HBM seems to be okay if focusing on a limited subset of capabilities using existing DRAM tech, while HMC seems to offer more applicability, expandability, and future development.
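The defect-mapping/repair angle mentioned above can be pictured as a remap table living in the base logic die: accesses to rows that failed test get redirected to spare rows, invisibly to the host. A purely illustrative sketch of that idea follows; the class, row numbers, and spare-row pool are all hypothetical, not how any actual HMC logic layer works.

```python
# Hypothetical illustration of defect remapping in a base logic die:
# rows that failed test are transparently redirected to spare rows,
# which is one way a logic layer could raise the effective DRAM yield.
class BaseLogicRemap:
    def __init__(self, spare_rows):
        self.spare_rows = list(spare_rows)   # rows reserved for repair
        self.remap = {}                      # bad_row -> spare_row

    def mark_bad(self, row):
        """Record a defective row found during test and assign it a spare."""
        if row not in self.remap:
            if not self.spare_rows:
                raise RuntimeError("out of spare rows; stack would be scrapped")
            self.remap[row] = self.spare_rows.pop()

    def translate(self, row):
        # Every access already passes through the logic die, so the
        # redirection is invisible to the host memory controller.
        return self.remap.get(row, row)

ctrl = BaseLogicRemap(spare_rows=range(16_384, 16_400))
ctrl.mark_bad(1234)                  # row 1234 failed during test
print(ctrl.translate(1234))          # redirected to a spare row (16399)
print(ctrl.translate(42))            # healthy rows pass through unchanged
```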
Sesh Ramaswami, managing director at Applied Materials, showed a cost analysis which resulted in 300mm interposer wafer costs of $500-$650 / wafer. His cost analysis showed the major cost contributors are damascene processing (22%), front pad and backside bumping (20%), and TSV creation (14%).
[..]
Since one can produce ~286 200 mm² die on a 300 mm wafer, at $575 per wafer (his midpoint cost) this works out to roughly $2 per 200 mm² silicon interposer.
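A quick back-of-the-envelope check of those numbers (the die-per-wafer approximation and the way edge loss is estimated are my assumptions, not from the article):

```python
import math

# Sanity check of the interposer cost figures quoted above.
wafer_diameter_mm = 300.0
die_area_mm2 = 200.0
wafer_cost_usd = 575.0   # midpoint of the $500-$650 range

# Common gross-die-per-wafer approximation: wafer area / die area,
# minus a term for partial die lost around the edge.
gross_die = (math.pi * (wafer_diameter_mm / 2) ** 2) / die_area_mm2 \
            - (math.pi * wafer_diameter_mm) / math.sqrt(2 * die_area_mm2)
print(f"approx. gross die per wafer: {gross_die:.0f}")   # ~306, same ballpark as the ~286 cited

# Cost per interposer using the article's ~286 die/wafer figure.
print(f"cost per 200 mm^2 interposer: ${wafer_cost_usd / 286:.2f}")  # ~$2.01
```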
GPUs have had read-only caches for quite some time, and read/write caches are available from AMD and Nvidia.

Actually, what is the nature of the GPU? It is mostly SIMD computation logic: no cache, little branch prediction, very little instruction re-ordering. Thus it's exceptionally bad for anything except 3D math. For instance ray tracing - you've got to go through all the vertices in a scene to see what the bullet intersects with, for each bullet. Hard to do without a cache! The cache greatly accelerates it by storing it on the chip. Thus the Power7 is ideal, with up to 80 MB of eDRAM cache.
If anyone thought a 2.5D interposer solution would be cost prohibitive... I think it's looking much better today
http://www.electroiq.com/articles/ap/2012/12/lifting-the-veil-on-silicon-interposer-pricing.html
Ray tracers use space partitioning; I tried to find a very old one that didn't, and I failed.

For instance ray tracing - you've got to go through all the vertices in a scene to see what the bullet intersects with, for each bullet.
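To make the point concrete, here is a toy sketch of why a ray ends up testing only a handful of primitives rather than the whole scene. It builds a small bounding-volume hierarchy over random spheres (a BVH is strictly an object partition rather than a space partition, but the pruning idea is the same); the scene data, names, and structure are all made up for illustration, not taken from any real renderer.

```python
import random

def sphere_hit(origin, direction, center, radius):
    """Smallest positive t where the (roughly unit-length) ray hits the sphere, else None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))
    disc = b * b - (sum(x * x for x in oc) - radius * radius)
    if disc < 0:
        return None
    t = -b - disc ** 0.5
    return t if t > 0 else None

def box_hit(origin, direction, lo, hi):
    """Slab test against an axis-aligned box (demo ray has no zero direction component)."""
    tmin, tmax = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        t1, t2 = (l - o) / d, (h - o) / d
        tmin, tmax = max(tmin, min(t1, t2)), min(tmax, max(t1, t2))
    return tmin <= tmax

def build_bvh(spheres, axis=0):
    """Median-split BVH over (center, radius) spheres; leaves hold at most 2."""
    lo = [min(c[i] - r for c, r in spheres) for i in range(3)]
    hi = [max(c[i] + r for c, r in spheres) for i in range(3)]
    if len(spheres) <= 2:
        return {"box": (lo, hi), "leaf": spheres}
    spheres = sorted(spheres, key=lambda s: s[0][axis])
    mid = len(spheres) // 2
    return {"box": (lo, hi),
            "children": [build_bvh(spheres[:mid], (axis + 1) % 3),
                         build_bvh(spheres[mid:], (axis + 1) % 3)]}

def trace(node, origin, direction, tests):
    """Closest hit distance, skipping whole subtrees whose boxes the ray misses."""
    if not box_hit(origin, direction, *node["box"]):
        return None
    if "leaf" in node:
        hits = []
        for center, radius in node["leaf"]:
            tests[0] += 1
            t = sphere_hit(origin, direction, center, radius)
            if t is not None:
                hits.append(t)
        return min(hits) if hits else None
    hits = [t for child in node["children"]
            if (t := trace(child, origin, direction, tests)) is not None]
    return min(hits) if hits else None

random.seed(1)
scene = [([random.uniform(-50, 50) for _ in range(3)], 0.5) for _ in range(10_000)]
bvh = build_bvh(scene)
origin, direction = (0.0, 0.0, -100.0), (0.01, 0.01, 0.99995)  # roughly unit length

tests = [0]
print("closest hit t:", trace(bvh, origin, direction, tests))
print("sphere tests with BVH:", tests[0], "out of", len(scene), "spheres in the scene")
```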
HBM, which looks like a higher-end variant of a future Wide I/O using interposers and TSVs, doesn't make mention of a repartitioning of the internal DRAM arrays.
Thanks for the information. Would an HMC interposer make sense as a dedicated I/O processor chip, to make process shrinks of either the separate CPU/GPU or the APU simpler? I'm wondering about a good chip-interconnect layout for using HMC, and it seems to make sense from my limited perspective.
Have you never heard of GPGPU? nVidia GPUs that accelerate physics? Or all sorts of other things? I suggest you take a moment to catch up with the technology of the 3rd millennium.

The GPU - no cache, no ability to analyze for dependencies, no out-of-order instruction execution - thus making it an order of magnitude slower for any code except extremely simple SIMD (3D math).
I don't know what they intend to change with HBM compared to Wide I/O, if anything (I mean other than frequency, DDR pumping, and width). Right now each memory channel has its own complete 128-bit interface, including addressing, control, power/ground, etc. Wide I/O 1 is 4 channels per chip and HBM will be 8 channels, so 2 chips on an AMD GPU would mean 16 channels that can be accessed concurrently. I think they said the stacking simply adds banks that are accessed like bog-standard memory, so vertical layers only add capacity. Read/write concurrency is proportional to the number of channels, as they are managed individually by the SoC; I think the Cadence Wide-IO controller is literally one independent controller per channel.

Without knowing more about the burst lengths and how many channels make up that 1024 I/O number for HBM, it's not clear how much worse it is than HMC, and on what workloads.
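A quick tally of the channel counts and aggregate interface widths implied by those figures (assuming, as stated above, 128-bit channels, 4 per Wide I/O 1 chip and 8 per HBM stack; the function name is just for illustration):

```python
# Channel count and aggregate bus width under the figures discussed above.
CHANNEL_WIDTH_BITS = 128  # each channel has its own complete 128-bit interface

def stack_io(channels_per_stack: int, stacks: int = 1) -> dict:
    """Independently addressable channels and total I/O width for N stacks."""
    channels = channels_per_stack * stacks
    return {"channels": channels, "io_bits": channels * CHANNEL_WIDTH_BITS}

print("Wide I/O 1, one stack:", stack_io(4))      # 4 channels,  512 bits
print("HBM, one stack:       ", stack_io(8))      # 8 channels, 1024 bits (the '1024 IO' figure)
print("HBM, two stacks:      ", stack_io(8, 2))   # 16 channels, 2048 bits
```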
Here an AMD China employee uses some old news to hint at the next-gen Xbox GPU, and also drops a keyword about the next-gen Xbox CPU:
http://club.tgfcer.com/thread-6586999-1-1.html
Can anyone explain a bit of what he says?
Sadly, there are no good Chinese-to-English web translation services. With Japanese translations you can usually get the meaning, but with Chinese... not so much.
On GAF they are commenting that the OP of that thread hints at 8 Jaguar cores at 1.6 GHz + a 76xx GPU in the APU + a discrete GPU from the 8800 series. It seems he claims it is Durango, and that the 8800 is the new 8870 with almost 4 TFLOPs.
Well, that thread is only talking about Durango; he didn't mention the PS4.
How do I know that? Because I know Chinese... wait, I'm Chinese.