Definitely, which points to 4 setup pipes and 64 ROPs in Hawaii. Hawaii = 2x Pitcairn-on-a-die? Looks more and more likely.
"Yet, if your competitor achieves nearly your speed with just 256-bit and beats you by 30% with the same 384-bit, you might consider that it's not the bandwidth that causes it."

You're missing the point. Just because a competitor can nearly match you with 256-bit doesn't mean that 384-bit or 512-bit is useless. They would be even faster with it. It's a matter of cost/benefit.
"Designing a 512-bit controller to get more memory sounds somehow wrong …"

That's not what I'm saying. The cost savings of a smaller bus aren't just in the GPU. You also have fewer chips, fewer traces, a simpler PCB, etc. But if the standard becomes 4-8GB, some of those savings are gone. Last gen, NVidia probably got substantial marginal savings from having 256-bit in its GTX 680 instead of 384-bit or 512-bit, but a 4-8GB 256-bit card is probably not much cheaper than a 4-8GB 512-bit card.
I believe they traded some of the ROPs for advanced GPGPU/HPC features.
"You're missing the point. Just because a competitor can nearly match you with 256-bit doesn't mean that 384-bit or 512-bit is useless. They would be even faster with it. It's a matter of cost/benefit."

My point was that having a competitor for comparison who shows you can achieve 30% more with 384-bit doesn't make it sound wise to increase the width by 33%. If you look at the die shots, which show the controller being a huge part that does not benefit from shrinks (if they'd want to migrate to 20nm in a tick-tock fashion as Intel does), the efficiency doesn't scale linearly.
"That's not what I'm saying. The cost savings of a smaller bus aren't just in the GPU. You also have fewer chips, fewer traces, a simpler PCB, etc. But if the standard becomes 4-8GB, some of those savings are gone. Last gen, NVidia probably got substantial marginal savings from having 256-bit in its GTX 680 instead of 384-bit or 512-bit, but a 4-8GB 256-bit card is probably not much cheaper than a 4-8GB 512-bit card."

I'd think the cheapest solution is probably right in the middle: 6GB with 384-bit.
"For example: a 10% performance boost for board cost going from $200 to $250 probably doesn't make sense, but a 10% boost for board cost going from $300 to $330 probably does."

Exactly, and that's why it would be surprising if they increased the controller to 512-bit. Adding some transistors to raise efficiency is probably cheaper than widening the controller, and the controller needs to be aligned to the external power demands.
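The cost/benefit point in the quoted example is just marginal-cost arithmetic; a trivial sketch with the numbers from the quote:

```python
# Same 10% performance boost, very different marginal board cost.
def extra_cost_pct(old_cost: float, new_cost: float) -> float:
    """Percentage increase in board cost."""
    return (new_cost - old_cost) / old_cost * 100

# Going from $200 to $250 is 25% extra cost for a 10% boost: a bad deal.
print(extra_cost_pct(200, 250))
# Going from $300 to $330 is only 10% extra cost for the same boost: a fair deal.
print(extra_cost_pct(300, 330))
```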
I'm not an expert in that area, but I was told that the IO part of a chip is the one that barely shrinks in die size and power consumption -> cost. That's why we had 384-bit GPUs when the last console generation arrived, and that's why today's fastest GPUs have 384-bit (which may change in a few days, of course).
Well, Dave was pretty strongly hinting that per 128-bit section, a memory controller for slower memory will be quite a bit smaller than a fast one. Someone calculated that Tahiti's memory controller is bigger than 2x Pitcairn's, so going with a slower 512-bit can save you die space compared to a fast 384-bit controller. With 384-bit they would have needed to be at the bleeding edge of memory speeds, possibly having to do a lot of extra work on the controller, whereas now they can go with slower chips and probably have an easier job with the controller.
Per 128-bit chunk the Pitcairn PHY is about half the size, and Tahiti has 3 of them.
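Taking the ~20mm² per 128-bit figure mentioned later in the thread and the "about half the size" ratio at face value, a quick back-of-envelope sketch (these are rough thread numbers, not official die measurements):

```python
# Rough die-area estimate using figures quoted in this thread.
tahiti_phy_per_128b = 20.0                        # mm^2, fast 128-bit PHY section
pitcairn_phy_per_128b = tahiti_phy_per_128b / 2   # "about half the size"

fast_384bit_phy = 3 * tahiti_phy_per_128b     # Tahiti: three 128-bit sections
slow_512bit_phy = 4 * pitcairn_phy_per_128b   # hypothetical: four slow sections

print(fast_384bit_phy)  # 60.0 mm^2
print(slow_512bit_phy)  # 40.0 mm^2 -- wider bus, yet smaller PHY area overall
```

Which is exactly the argument above: a slow 512-bit interface can cost less die area than a fast 384-bit one.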
I would imagine it is more complicated than just slapping 4x Pit's MCs on there
"Well, Dave was pretty strongly hinting that per 128-bit section, a memory controller for slower memory will be quite a bit smaller than a fast one. Someone calculated that Tahiti's memory controller is bigger than 2x Pitcairn's, so going with a slower 512-bit can save you die space compared to a fast 384-bit controller. With 384-bit they would have needed to be at the bleeding edge of memory speeds, possibly having to do a lot of extra work on the controller, whereas now they can go with slower chips and probably have an easier job with the controller."

The power efficiency increases, but the real-world efficiency decreases; I wonder at what ratio that balances (or rather, along what curve).
gkar1 said: "You really have no point because you're barking up the wrong metric. They achieved 30% more performance but they had to use a die with 60%+ more transistors. That hardly seems worth it if the leaked performance figures are true."

You're ignoring that they had achieved the same speed (in games) with a die size of 294mm² (vs. 365mm², I think) and 3.54 billion transistors (vs. 4.3 billion).
I would suggest they go straight to the $500-550 price tag instead of the suggested $650, and I would bet on then slowly decreasing it over the next weeks. This would keep the price relatively stable, give some people the choice to buy two of them for ~$1000 if they want, and most importantly, cast a very positive light on the product, better than if it is priced $150 higher...
"if they strip down the memory interface to be Pitcairn-alike (though I have no idea what compute features besides maybe ECC made the difference)"

Simply the speed. 5 Gbps vs. 6+ Gbps obviously makes a hell of a difference. Dave was talking about the PHYs (which take about 20mm² per 128-bit section on Tahiti); these parts don't even know about ECC (that is done higher up, in the actual DRAM controller, I suppose).
That does not make any sense; they would earn less. Also, I doubt the price will decrease much for at least the first couple of months.
"Simply the speed. 5 Gbps vs. 6+ Gbps obviously makes a hell of a difference."

That wasn't obvious to me, sorry :/. A <20% clock gain forced AMD to have a +50% bigger die area? Why did that make sense in the first place at all?
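For reference, the aggregate-bandwidth math behind the speed-vs-width trade-off (a simple sketch; the per-pin rates are the ones tossed around in this thread):

```python
# Aggregate bandwidth = (bus width in bytes) * (per-pin data rate in Gbps).
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    return bus_bits / 8 * gbps_per_pin

print(bandwidth_gb_s(512, 5.0))  # 320.0 GB/s with slower 5 Gbps chips
print(bandwidth_gb_s(384, 6.0))  # 288.0 GB/s
print(bandwidth_gb_s(384, 7.0))  # 336.0 GB/s, needs bleeding-edge memory
```

So a 512-bit bus at modest clocks buys roughly the bandwidth that a 384-bit bus only reaches at bleeding-edge clocks.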
"Dave was talking about the PHYs (which take about 20mm² per 128-bit section on Tahiti); these parts don't even know about ECC (that is done higher up, in the actual DRAM controller, I suppose)."

I thought the DRAM controller is actually the last stage between the chip and the memory, that it costs die space independently of any process shrink, and that this was why increasing bit width was rather expensive.
"Btw., I would actually expect a 512-bit interface with the same aggregate bandwidth as a faster 384-bit interface to be slightly more efficient/higher performance on average, even if it is just because of the higher number of open pages possible with this setup."

Yes, it would be more efficient if you utilized the address space equally, but you are also fragmenting the address/memory space. I'd think the granularity of the address space is 4k? (Is that hard-wired, or can AMD polish that with newer drivers?)
"That wasn't obvious to me, sorry :/. A <20% clock gain forced AMD to have a +50% bigger die area? Why did that make sense in the first place at all?"

Perhaps the underlying elements of Tahiti's memory controllers were reused from Cayman's design for time-to-market reasons, and thus haven't benefited from advancements to efficiency since then.
"I thought the DRAM controller is actually the last stage between the chip and the memory, that it costs die space independently of any process shrink, and that this was why increasing bit width was rather expensive."

The actual memory controller should shrink; the PHYs (the part usually marked as the memory interface on die shots) should not. The PHYs are responsible for creating the actual signals on the external pins. They need to be able to drive much more current, at much higher frequencies, than the internal on-die connections; that's why they need to be relatively large (and tend to grow with higher frequencies). But the PHYs are not en-/decoding the ECC information. As said, they don't really "know" what data or commands they are driving to the pins or receiving; they just "translate" signals between the lower-clocked memory controller and the high-speed external interface.
"Yes, it would be more efficient if you utilized the address space equally"

That's what one usually strives for, and why the address space is interleaved between the memory channels.
"but you are also fragmenting the address/memory space"

As said, it's interleaved between channels anyway.
"I'd think the granularity of the address space is 4k? (Is that hard-wired, or can AMD polish that with newer drivers?)"

The interleaving granularity is probably much smaller; it could be as small as a cache line (64 bytes) or a very small integer multiple of it.
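To illustrate the interleaving idea (purely a hypothetical sketch; the 256-byte granularity and 4 channels are assumed illustration values, not AMD's actual parameters):

```python
# Hypothetical round-robin channel interleaving: consecutive blocks of the
# physical address space map to successive memory channels.
def channel_of(addr: int, granularity: int = 256, channels: int = 4) -> int:
    # granularity and channel count are assumed values for illustration only
    return (addr // granularity) % channels

# Four consecutive 256-byte blocks spread across all four channels:
print([channel_of(a) for a in range(0, 1024, 256)])  # [0, 1, 2, 3]
```

With a small granularity like this, any sequential access stream automatically spreads its load over all channels, which is why fragmenting the address space matters less than it first appears.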
"Ok, if I and thousands of other people tell you that we are not going to buy anything unless it is at this exact price, how would you feel? Do you think it would impact your sales numbers, or do you charge straight ahead with your horns no matter what the clients demand?"

Nice strawman argument. Anyway, all these people you speak of can just wait 2-3 months until the price comes down.