AMD may delay next-gen Vega, confirms Zen on track for Q1 2017
http://www.extremetech.com/computin...xt-gen-vega-confirms-zen-on-track-for-q1-2017
If the more recent leaked slides about AMD's HPC APU are legitimate, AMD's solution to implementing two sets of memory controllers on the same chip is not to do so at all, but instead to use an MCM hosting separate CPU and GPU silicon.
AMD's paying GloFo $100 million to fab with someone else?
http://www.marketwired.com/press-re...t-with-globalfoundries-nasdaq-amd-2154755.htm
The limitation would be the speed of the GMI link, so roughly 100GB/s based on some of the leaked docs a while back. That is possibly in addition to system memory bandwidth. Hard to tell what the maximum rate is, as the configurations were all CPU to GPU and the links only large enough to accommodate maximum system memory bandwidth. That's still a ton of bandwidth for a CPU to play with if it works. Problem being, it's not the CPU that likely needs the bandwidth. Most bandwidth-intensive tasks for a CPU using HSA would be better accelerated by the GPU portion.

I'd like to see that. Even 4GB of HBM would be a good start on a system that wouldn't need to use drivers to swap texture data over PCI-E. If the HPC APUs could use the HBM as a last level cache then everything would just fly, with fewer headaches than for a dGPU.
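To put rough numbers on the texture-swapping point, here's a quick back-of-envelope sketch in Python. The ~16 GB/s PCIe 3.0 x16 figure and the ~100 GB/s GMI figure are just the assumed values being thrown around in this thread, and the 2 GB working set is made up for illustration:

# Rough sketch: time to move a texture working set over PCIe 3.0 x16 versus
# the rumoured ~100 GB/s GMI link. All figures are thread assumptions, not
# confirmed specs, and latency/overhead is ignored.

PCIE3_X16_GBS = 16.0   # assumed effective PCIe 3.0 x16 bandwidth
GMI_GBS = 100.0        # assumed GMI link bandwidth from the leaked slides

def transfer_ms(size_gb, bandwidth_gbs):
    """Milliseconds to move size_gb at bandwidth_gbs."""
    return size_gb / bandwidth_gbs * 1000.0

working_set_gb = 2.0   # hypothetical texture working set spilling out of HBM
print("PCIe 3.0 x16: %.0f ms" % transfer_ms(working_set_gb, PCIE3_X16_GBS))
print("GMI link:     %.0f ms" % transfer_ms(working_set_gb, GMI_GBS))

That works out to roughly 125 ms versus 20 ms for the same hypothetical 2 GB swap, which is the difference the "fewer headaches than a dGPU" argument rests on.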
Price could be interesting, as they might be able to run a system without any DRAM. A Chromebox, for example, typically has 2/4/8GB configurations, so some costs might be offset by not including DIMMs or possibly even their sockets. That could make for some interesting configurations and performance.

Not if you want performance out of it at a reasonable price. It just doesn't work that way; you need both for a lower end and midrange laptop. Now, if you want to go higher end, a cut-down Greenland just doesn't fit the bill, does it?
Only one controller using GMI links for different chips, based on some leaked documents.

I'm assuming that AMD will already have to design two sets of memory controllers into the same chip for the HPC version, and solve any issues relating to CPU <-> GPU communication. Solutions should already have been produced and be working blocks on whichever process they're using for the HPC version. We're not talking about solving any new problems here, or even necessarily adding any new features.
Modest, but likely more than sufficient considering the typical CPU memory bandwidth. Overkill if the GPU handles acceleration for bandwidth-intensive tasks.

Zeppelin may not be designed to use an interposer, and the GMI bandwidth is relatively modest compared to what can be achieved with an actual interposer. I'm not sure the cheaper variants have better odds than the one destined for HPC.
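For a sense of scale on "modest versus interposer bandwidth", here's a quick comparison of the tiers being discussed. The GMI number comes from the leaks and the HBM2 figure is just the nominal per-stack spec, so treat all of these as assumptions:

# Bandwidth tiers being discussed in this thread. The GMI figure comes from
# leaked slides and the HBM2 figure is the nominal spec, so these are
# assumptions rather than confirmed numbers.

links_gbs = {
    "Dual-channel DDR4-2400 (2 x 8 B x 2400 MT/s)": 2 * 8 * 2.4,            # ~38.4 GB/s
    "Rumoured GMI link": 100.0,
    "One HBM2 stack on an interposer (1024-bit @ 2 Gbps)": 1024 / 8 * 2.0,   # 256 GB/s
    "Two HBM2 stacks": 2 * 1024 / 8 * 2.0,                                   # 512 GB/s
}

for name, gbs in links_gbs.items():
    print("%-55s %6.1f GB/s" % (name, gbs))

So GMI would sit well above a typical dual-channel DDR4 setup but well below what two HBM2 stacks on the interposer itself could deliver, which is the gap being pointed out here.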
Was there an agreement they had to use GF exclusively? If they can't produce Polaris fast enough to meet demand, meeting wafer requirements shouldn't be a huge concern. The article doesn't mention chip pricing either, so it's possible some of the costs are offset by cheaper production if there were yield/performance issues.

It does make one wonder if GloFo's 14 nm FF was significantly worse than Samsung's (despite sharing process tech) and/or TSMC's 16 nm FF, and that this prompted AMD to amend their previous agreement in order to use another foundry even if they have to pay money to do so.
The slide in question was found by Fudzilla, I believe, and it was released more recently than the HPC APU slide posted previously. At least the code names and general feature names have been corroborated so far.
http://www.fudzilla.com/news/processors/38402-amd-s-coherent-data-fabric-enables-100-gb-s
The slide had a Greenland GPU and 2 HBM2 stacks on an interposer. As drawn, it seemed to indicate that this interposer is put onto an MCM that mounts the 16-core Zeppelin CPU separately.
Multiple GMI links connect the CPU to the GPU over the MCM substrate.
Zeppelin may not be designed to use an interposer, and the GMI bandwidth is relatively modest compared to what can be achieved with an actual interposer. I'm not sure the cheaper variants have better odds than the one destined for HPC.
It seems plausible that the CPU's GMI and PCIe connectivity would use the same physical interface, although prior to the OpenCL driver strings people were using Vega interchangeably with Greenland.
A likely implementation for this would require that Greenland have a measurably wider PCIe/GMI bus than would be necessary for a consumer GPU, and with Vega listed separately it might not have that kind of capability.
I'm open to being pleasantly surprised, but it's possible that the engineering problems with the different memory types and CPU/GPU communication may not be addressed with Vega and consumer-space Zen. The HPC version may not be superior to the likely competition, but details are light.
The limitation would be the speed of the GMI link, so roughly 100GB/s based on some of the leaked docs a while back. That is possibly in addition to system memory bandwidth. Hard to tell what the maximum rate is, as the configurations were all CPU to GPU and the links only large enough to accommodate maximum system memory bandwidth. That's still a ton of bandwidth for a CPU to play with if it works. Problem being, it's not the CPU that likely needs the bandwidth. Most bandwidth-intensive tasks for a CPU using HSA would be better accelerated by the GPU portion.
Only one controller using GMI links for different chips, based on some leaked documents.
It might not even need that much for a consumer-level product. With a 100GB/s link to system memory, most textures could be stored there.

I suppose the key would be using the 8/16 GB of HBM2 wisely. Perhaps transferring in data ahead of time, or using it as an enormous cache?
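A minimal sketch of the "enormous cache" idea, purely for illustration: hot resources live in HBM, everything else falls back to system memory over the (assumed) 100 GB/s link. The 8 GB capacity and the resource names and sizes are made up, and real hardware would manage this in the memory controller rather than in software like this:

# Illustrative sketch only: HBM2 treated as a big last-level cache in front
# of system memory, with simple LRU eviction. Capacity and workload are
# hypothetical.

from collections import OrderedDict

class HbmCache:
    def __init__(self, capacity_gb):
        self.capacity = capacity_gb
        self.used = 0.0
        self.entries = OrderedDict()  # resource name -> size in GB, LRU order

    def access(self, name, size_gb):
        if name in self.entries:
            self.entries.move_to_end(name)                 # refresh LRU position
            return "hit (served from HBM2)"
        while self.used + size_gb > self.capacity and self.entries:
            _, evicted_size = self.entries.popitem(last=False)  # evict coldest
            self.used -= evicted_size
        self.entries[name] = size_gb
        self.used += size_gb
        return "miss (fetched over the CPU link)"

cache = HbmCache(capacity_gb=8.0)       # hypothetical 8 GB HBM2 pool
for frame in range(3):
    print(frame, cache.access("terrain_textures", 3.0))
    print(frame, cache.access("character_textures", 2.5))

After the first frame everything fits and stays resident, so only cold data ever has to cross the link, which is the "transfer ahead of time or cache it" point.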
So it looks like Greenland will be self-contained on an interposer. My (optimistic) outlook would be that if Greenland is available for an HPC part, it could be made available for a consumer part without any re-engineering needed.
Likely because of the memory model. There wouldn't necessarily need to be any HBM present. That may also be the case if the HBM is used like a cache, with a portion of system memory acting as the VRAM.

Hum.. if Greenland is self-contained, then why does the OpenCL driver recognize both Greenland, a Vega 10 and a Vega 11?
Could Greenland actually be the exact same graphics chip+interposer+HBM2 set as one of the Vegas, but the OpenCL driver recognizes two different GPUs because one has a 100GB/s link to the CPU (and main memory) and the other has the PCIe 3.0's 16GB/s? I reckon that for HSA that could make a large difference.
Hum.. if Greenland is self-contained, then why does the OpenCL driver recognize both Greenland, a Vega 10 and a Vega 11?
Could Greenland actually be the exact same graphics chip+interposer+HBM2 set as one of the Vegas, but the OpenCL driver recognizes two different GPUs because one has a 100GB/s link to the CPU (and main memory) and the other has the PCIe 3.0's 16GB/s? I reckon that for HSA that could make a large difference.
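As a rough illustration of why that link difference could matter for HSA-style sharing, here's how much data each link could shuttle per 60 Hz frame. Again, both bandwidth figures are just the assumed numbers from this thread:

# Per-frame data budget at 60 fps over each link. Bandwidths are the thread's
# assumed figures (PCIe 3.0 x16 vs the leaked ~100 GB/s GMI link), so the
# output is illustrative only.

FRAME_TIME_S = 1.0 / 60.0

for name, gbs in [("PCIe 3.0 x16 (~16 GB/s)", 16.0),
                  ("GMI link (~100 GB/s)", 100.0)]:
    per_frame_mb = gbs * FRAME_TIME_S * 1024.0
    print("%s: ~%.0f MB per frame" % (name, per_frame_mb))

That's roughly 270 MB per frame over PCIe versus about 1.7 GB per frame over the rumoured GMI link, before any latency or coherency considerations.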
Was there an agreement they had to use GF exclusively? If they can't produce Polaris fast enough to meet demand, meeting wafer requirements shouldn't be a huge concern. Article doesn't mention chip pricing either, so it's possible some of the costs are offset by cheaper production if there were yield/performance issues.
Going by the widely-reported LinkedIn profile details, 'Project Greenland' is Gfx IP9 i.e. post-Polaris.
That's one heck of a bypass considering what AMD spends on wafers. That compensation should also only apply to expected margins; they shouldn't need to compensate for full revenue. The scale of that payment seems in line with reasonable profits for GF for the remainder of the agreement. You'd think they could have just bought a 5-year supply of 14nm interposers that can't be screwed up. AMD spent what, $155M on fabs last year? Even with a ~40% margin that's 5 years of chips at the rate they've been purchasing them.

But it allows AMD to bypass GloFo with compensation per wafer.