AMD: Speculation, Rumors, and Discussion (Archive)

Interesting undervolted, stock, and overvolted numbers here for the 390.

Performance numbers for Unigine Valley:

69.3 fps, 2898 score, 38.2 min, 122.4 max @ said undervolt; stock speed.
71.2 fps, 2980 score, 37.6 min, 130.0 max @ stock voltages; stock speed.
77.9 fps, 3259 score, 38.2 min, 138.0 max @ said power bump; 1200 core, 1700 mem.
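
For scale, the deltas work out roughly as follows (a quick sketch using only the average-fps figures above; the power numbers are in the chart below):

# Rough percentage deltas from the Unigine Valley average fps quoted above.
stock = 71.2
undervolt = 69.3
overclock = 77.9  # 1200 core / 1700 mem with the power bump

print(f"undervolt vs stock: {100 * (undervolt - stock) / stock:+.1f}%")  # about -2.7%
print(f"overclock vs stock: {100 * (overclock - stock) / stock:+.1f}%")  # about +9.4%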

Power usage for the same:
[attached chart: psvmUni.jpg]

More here:

http://forums.anandtech.com/showpost.php?p=37723466&postcount=9
 
One thing to note about undervolting is that generally we would assume the manufacturers have a reason for setting the voltages where they do, given that they have the most comprehensive low-level tools for gauging the properties of the chip and its reliability.
Perhaps there is a complicating factor that makes their settings more conservative than they need to be, but I do not see a motivation for upping the voltage unless there were a case for increased reliability across a more exhaustive set of tests and a customer base with rigs of varying quality.
The 7970 GHz Edition came about with an admission that the original characterization and estimates were more conservative than necessary, so it isn't impossible for such a thing to happen.
 
The quality of the power supply on 6/8-pin connectors can vary quite a bit from PSU to PSU, right? This might be a significant part of the equation.
 
Ripple quality certainly plays a part in how the chips would perform, yes.
 
One possibility is that HBM1 would not be cheaper than HBM2. The limited disclosure of the planned future products shows that the sorely needed stack-height and density increases, and the somewhat less-needed performance increases, come with HBM2, making HBM1 less desirable for wider deployment.
HBM1 is already a capacity demerit for the high-end, and with the decision to bump GDDR5 GPUs in the 3xx range to double capacity, it is less likely that this can be walked back.

The descriptions of the HBM1 we know of indicate that the DRAM dies are 100um thick, which is noteworthy because it is twice as thick as stacked-memory research has projected; HBM1 may be built on a process that was never intended to be the final mass-deployment version of the tech. If HBM2 has the denser chips, higher stacks, and updated protocol, then it would have broader appeal and better economies of scale. At that point, just use a 4-Hi HBM2 stack, not that there seems to be a plan to make HBM in any form cheap enough next year for mass deployment.

GDDR5X, if it rolls out, poses more of a threat to HBM1--which was not a decisive victory against GDDR5.
 
I'd say the mere fact that you can get as much capacity as you need from a single stack of HBM2 (for all but the biggest GPUs) makes it a much better proposition than HBM1 for mainstream products.
 
GDDR5X, if it rolls out, poses more of a threat to HBM1--which was not a decisive victory against GDDR5.
True; a few months back when the reviews rolled out, my first thought was that the HBM1 results were somewhat underwhelming.
 
I'd say the mere fact that you can get as much capacity as you need from a single stack of HBM2 (for all but the biggest GPUs) makes it a much better proposition than HBM1 for mainstream products.

Exactly. I'm not sure of the costs, but 1 stack of HBM2 + interposer must be at worst comparable in price to 8 GDDR5 chips + the increased PCB complexity. You're also using less power, which must save costs elsewhere or improve perf/watt/$. Availability is the biggest question in my mind.

GDDR5X may make some sense in the upper mid-range to lower high-end, but I'm not sure about the pure midrange or the bleeding-edge upper end.
 
Yap, a mid-range card could perfectly house a single HBM2 stack of 4GB for 256GB/s.
I could hardly see that being more expensive than 4 chips of GDDR5X on a 128-bit bus, which will start at 10Gbps (according to those slides), so a maximum of 160GB/s for 2016.
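
To put numbers on that, here is a quick sketch of the bandwidth arithmetic (assuming the usual 1024-bit HBM interface per stack; the per-pin rates are the ones quoted above, not confirmed specs):

# bandwidth (GB/s) = bus width (bits) / 8 * per-pin data rate (Gbps)
def bandwidth_gbs(bus_width_bits, data_rate_gbps):
    return bus_width_bits / 8 * data_rate_gbps

print(bandwidth_gbs(1024, 2.0))   # one HBM2 stack, 1024-bit at 2Gbps/pin -> 256.0 GB/s
print(bandwidth_gbs(128, 10.0))   # four GDDR5X chips, 128-bit at 10Gbps  -> 160.0 GB/s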

IMO GDDR5X will only be used where the IHVs couldn't get their hands on enough HBM2 (e.g. nVidia's whole range except GP100) and in lower-end solutions where 2GB of VRAM is sufficient, so they can use just two GDDR5X 8Gbit chips (e.g. a Cape Verde or GM107 successor).
 
Making HBM2 should be a bit more expensive than making GDDR5 due to the more complex process, including the assembly, but ultimately it will probably be down to economies of scale more than anything else. If GPU and RAM IHVs commit to it quickly and ramp up volumes very fast, it could be quite cheap and competitive, especially if GDDR5 volumes go down at the same time, which you'd expect. But it's not clear that this is easily doable.
 
If GDDR5 volumes drop, one of the biggest reasons would be that GDDR5X supports GDDR5's speed bands and burst lengths as part of its goal of reusing as much of the GDDR5 ecosystem as possible.

There's something of a parallel between GDDR5/GDDR5X and HBM1/HBM2 in that both promise an eventual doubling of bit rate, generally use the older version's protocols, have a "legacy" mode to act like the older version, and double the burst length.
It might come down to a similar question about HBM1 versus HBM2, where there's a newer and faster version that does all the things the old one can do by trying half as hard.

GDDR5X seems to be a bit more restrictive in that the slides show it must use a burst length of 16, whereas HBM2's slides make it sound like BL 4 is optional. However, it may turn out that this option is in practical terms similar to how GDDR5X supports BL 8 when operating in GDDR5's range.
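
For context on what those burst lengths mean in practice, here is a rough access-granularity sketch (it assumes the standard channel widths, 32-bit for GDDR5/GDDR5X and 128-bit for HBM, and glosses over HBM2's pseudo-channel details):

# bytes transferred per access = channel width (bits) / 8 * burst length
def access_granularity_bytes(channel_width_bits, burst_length):
    return channel_width_bits // 8 * burst_length

print(access_granularity_bytes(32, 8))    # GDDR5:  BL8  on a 32-bit channel  -> 32 bytes
print(access_granularity_bytes(32, 16))   # GDDR5X: BL16 on a 32-bit channel  -> 64 bytes
print(access_granularity_bytes(128, 2))   # HBM1:   BL2  on a 128-bit channel -> 32 bytes
print(access_granularity_bytes(128, 4))   # HBM2:   BL4  on a 128-bit channel -> 64 bytes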
 
Yap, a mid-range card could perfectly house a single HBM2 stack of 4GB for 256GB/s.
I could hardly see that being more expensive than 4 chips of GDDR5X on a 128-bit bus, which will start at 10Gbps (according to those slides), so a maximum of 160GB/s for 2016.

IMO GDDR5X will only be used where the IHVs couldn't get their hands on enough HBM2 (e.g. nVidia's whole range except GP100) and in lower-end solutions where 2GB of VRAM is sufficient, so they can use just two GDDR5X 8Gbit chips (e.g. a Cape Verde or GM107 successor).

Isn't the memory bus for HBM different from GDDR, though? I doubt you'd want two different memory buses on the GPU just to support two different memory types. So I could see the lowest-end AMD card having GDDR5/X on a 128-bit bus. The "low" bin could just be 2GB-4GB GDDR5 while the "high" bin could get GDDR5X. Heck, with all three GPUs taped out, maybe they were planning on a 7GHz/128-bit bus for the high-end bin anyway, like what Nvidia does with the 960, and then GDDR5X comes along and just makes it better.
 
I didn't suggest that the same chip would support both memory types. Only that 2 GDDR5X chips (+ PCB area and complexity) can be cheaper than 1 HBM2 stack + interposer, so the bottom-end GPUs from both nVidia and AMD (on Pitcairn's performance level?) would probably have those.
 
With the introduction of GDDR5X, I'll be very surprised if we see HBM on even midrange before 2020.

I don't understand the way people are just waving aside the cost issues. There is no way that interposers, die thinning, and stacking will ever be cheaper than old-school PCB stuff. The former is inherently more complex. (And lower volume.)

It will come down in cost, but so will PCB technology. And GPU PCBs don't come close in complexity to cell phone boards (except for MXM-like stuff, I assume).
 
With the introduction of GDDR5X, I'll be very surprised if we see HBM on even midrange before 2020.

I don't understand the way people are just waving aside the cost issues. There is no way that interposers, die thinning, and stacking will ever be cheaper than old-school PCB stuff. The former is inherently more complex. (And lower volume.)

It will come down in cost, but so will PCB technology. And GPU PCBs don't come close in complexity to cell phone boards (except for MXM-like stuff, I assume).

The die-stacking part may not be so clear-cut for future density increases.
DDR4's point-to-point bus has led to die stacking in order to increase capacity.
https://www.chipworks.com/competiti...y-reports/recent-reports/samsung-3d-tsv-based

GDDR5X, being newer than die-stacked HMC, HBM, and DDR4, might have that option available as well, unless for some reason the interface can't abstract it away.
It might be that HBM's interposer would be the biggest stumbling block if GDDR5X starts to stack.
 
I've been wondering about something. AMD potentially has more to gain from the next node transition (next gen) than Nvidia, considering Nvidia has reduced the transistor count devoted to DP in order to boost SP performance, and AMD hasn't done the same, or at least not to nearly the same extent. There's also low-hanging fruit there for better color compression.

It makes me wonder whether AMD will continue to go with a lot of DP performance for the top end, or whether they'll follow in Nvidia's footsteps and reduce it in order to boost performance in games and other things that don't require DP.

Then again with a new node transition and more transistors to play with, will Nvidia go back to higher DP performance for the professional market?

Regards,
SB
 