I think this might mean that AMD only got silicon back 2 months ago. I've read somewhere that Polaris would show up in mid-2016.
Quote: "I think this might mean that AMD only got silicon back 2 months ago. I've read somewhere that Polaris would show up in mid-2016."

You're probably right, I misread it.
Quote: "The only solution to this tense nonsense is a fight to the death!"

Mortal Kombat - it's not about death, but life.
Quote: "I've read somewhere that Polaris would show up in mid-2016."

AMD's slides say mid 2016.
Quote: "AMD chose to end the FinFET curve where it did for some reason, [...]"

After the sweetspot the process is effectively in "runaway". The gradient on the FinFET curve at its top-most power is steeper (i.e. less power per unit of frequency) than the 28nm curve.
Quote: "[...] which might be something to look out for later."

I think it's probably fair to say that AMD has been running its 28nm GPUs further along the curve than NVidia. Any clawback that AMD benefits from with the higher sweetspot on FinFET is one that benefits NVidia at least equally. And, arguably, NVidia gains even more, as NVidia is running at typically 20% higher clocks (at least in portions of the GPU; who knows if clocks across the entire GPU are 1:1). The gentler gradient after the sweetspot might take NVidia to 25% higher clocks, for example.
Quote: "At least for CPUs that can push things further, the margin of improvement does reduce at least somewhat as the curves flatten out further."

CPUs can push further because practically none of the die is active at Fmax, especially in single-threaded work. Heavy multi-threaded AVX-512 usage leads to massive amounts of throttling in Intel CPUs.
Quote: "Somewhere around .5-.7, there's a point where AMD could spend 2x the transistors at 2/3 the overall Fmax, and given where 28nm reaches those speeds it might be a tie or a small win to FinFET in absolute power. That's about 4 units on the Y axis out of maybe a reasonable max of 6 for 28nm, where I think there is evidence from other GPUs that AMD has had trouble sustaining, or getting much benefit from, that portion."

2x the transistors costs real money (AMD profit margin) in terms of dies per wafer. Relative power costs AMD's users money. If AMD can persuade them to buy...
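As a rough sanity check of the trade being weighed here (a sketch only: throughput scaling linearly with transistors and clock, and dies-per-wafer scaling inversely with area, are idealised assumptions, not anything from AMD's slides):

```python
# Idealised "wide and slow" trade: 2x the transistors at 2/3 of Fmax.
# Assumes throughput ~ transistors x clock, which real GPUs only approximate.
baseline_perf = 1.0 * 1.0               # transistors x clock, normalised
wide_slow_perf = 2.0 * (2.0 / 3.0)      # 2x transistors at 2/3 Fmax

print(f"throughput vs baseline: {wide_slow_perf / baseline_perf:.2f}x")  # ~1.33x

# The cost objection: 2x transistors at a fixed density means ~2x the area,
# so roughly half the candidate dies per wafer (ignoring edge effects and
# yield, both of which punish big dies even harder).
perf_per_die_cost = wide_slow_perf * (1.0 / 2.0)
print(f"throughput per unit of die cost: {perf_per_die_cost:.2f}x")      # ~0.67x
```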
Quote: "AMD might be taking advantage of the frame cap, if it can drop the GPU into the portion of the FinFET curve below 28nm's minimum, which might be lost if the frame rate were allowed to vary more."

The frame cap should at least mean that both GPUs are running at their max frequency.
Quote: "Now, if 850MHz is the "base clock" then indeed, base clock to base clock, the 950 has a 21% clockspeed advantage. But if we're going by fastest clock speeds, that's a 40% clockspeed advantage for the 950, using AMD's card as the base."

AMD's specified engine clock is its boost clock. AMD clocks only go in one direction, down, and are never guaranteed.
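For reference, those percentages line up with the GTX 950's reference clocks (1024MHz base, 1188MHz boost) against the 850MHz figure, give or take rounding:

```python
# Reference GTX 950 clocks vs the 850 MHz engine clock AMD quoted.
amd_clock = 850                       # MHz
gtx950_base, gtx950_boost = 1024, 1188

print(f"base vs 850 MHz:  +{(gtx950_base / amd_clock - 1) * 100:.1f}%")   # +20.5%, the post's "21%"
print(f"boost vs 850 MHz: +{(gtx950_boost / amd_clock - 1) * 100:.1f}%")  # +39.8%, the post's "40%"
```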
Quote: "We also haven't a clue as to how many transistors are actually in the AMD card, as the only report so far is that the "card looks really small", and even if you could measure it with a tape measure, that wouldn't tell you the specific average transistor density used for this particular GPU. "Double the transistor density" is only a guideline for if you go straight for transistor density with no regard to clockspeed/power draw."

I strongly suspect there's rather more than double the transistor density. Spending transistors on better performance through architecture is completely normal in GPUs. Spending transistors to compensate for low clocks (witness 28nm) and still losing on performance per watt led to AMD comprehensively losing at 28nm (no matter how many 7970s were sold for bitcoin mining).
Quote: "Additionally, the sweetspot is at a higher frequency for FinFET, which is precisely what everyone wants and expects."

I have not disputed that it is higher. It's just that the improvement over the prior node at that power point is not of the same order of magnitude as earlier in the curve, and it starts to become less steep before the 1 unit mark. I consider this to be a power-limited scenario and so tend to make comparisons at the same power points, unless AMD decides to leave performance on the table.
Quote: "After the sweetspot the process is effectively in "runaway"."

I agree that that is after the sweetspot, but I wouldn't consider the point of a curve just short of runaway to be the ideal.
Quote: "Any clawback that AMD benefits from with the higher sweetspot on FinFET is one that benefits NVidia at least equally. And, arguably, NVidia gains even more, as NVidia is running at typically 20% higher clocks (at least in portions of the GPU; who knows if clocks across the entire GPU are 1:1). The gentler gradient after the sweetspot might take NVidia to 25% higher clocks, for example."

Quite possibly; I have only been referencing AMD's claims for the prior and upcoming process.
Quote: "CPUs can push further because practically none of the die is active at Fmax, especially in single-threaded work. Heavy multi-threaded AVX-512 usage leads to massive amounts of throttling in Intel CPUs."

And even they generally draw the line where the FinFET process would have become a significant regression versus if it were planar.
Quote: "2x the transistors costs real money (AMD profit margin) in terms of dies per wafer. Relative power costs AMD's users money. If AMD can persuade them to buy..."

Power in a power-limited regime determines performance and what parts of the market it can address, which is a significant factor in how many customers AMD can entice, and at what price relative to competitors and the now-discounted prior generations.
Quote: "The frame cap should at least mean that both GPUs are running at their max frequency."

The 850E figure AMD gives for clocks sounds like it's a case of a target that PowerTune is operating around.
Quote: "Maybe someone out there can put together some 28nm metrics for Battlefront Medium settings on this training mission in terms of performance per mm² and performance per watt. And be careful to use a mixture of Maxwell and Maxwell v2 GPUs to compare with Tonga."

I already posted some numbers here - https://forum.beyond3d.com/posts/1889628/
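The arithmetic being asked for is simple; the hard part is honest inputs. A sketch with placeholder inputs (60 fps is just the demo's cap; the die sizes and board powers are the commonly cited figures for these chips, not measurements from this test):

```python
# Performance per mm^2 and per watt - placeholder inputs, not measurements.
cards = {
    # name: (avg_fps, die_area_mm2, board_power_w)
    "Tonga (R9 380)":  (60.0, 359, 190),
    "GM206 (GTX 950)": (60.0, 227, 90),
}
for name, (fps, area, watts) in cards.items():
    print(f"{name}: {fps / area:.3f} fps/mm^2, {fps / watts:.3f} fps/W")
```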
I'm talking about the implications for enthusiast discrete: utterly miserable. 30% more performance from the node change, before architectural improvements, after 5 years is just horrible.
Ryan Shrout said:
"AMD’s Joe Macri stated, during our talks, that they expect this FinFET technology will bring a 50-60% power reduction at the same performance level OR a 25-30% performance increase at the same power. In theory then, if AMD decided to release a GPU with the same power consumption as the current Fury X, we might see a 25-30% performance advantage."
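The two halves of Macri's claim are roughly the same statement viewed from opposite ends, if you assume the usual first-order model where dynamic power goes as f·V² and voltage scales roughly with frequency, i.e. P ∝ f³ (an assumed model, not anything AMD published):

```python
# If P ~ f^3, "X% less power at the same performance" converts to a
# "more performance at the same power" figure as follows.
for power_reduction in (0.50, 0.60):
    iso_power_speedup = (1.0 - power_reduction) ** (-1.0 / 3.0)
    print(f"{power_reduction:.0%} power cut at iso-perf  ~  "
          f"+{(iso_power_speedup - 1) * 100:.0f}% perf at iso-power")
# 50% -> ~+26%, 60% -> ~+36%: the same ballpark as the quoted 25-30%.
```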
Quote: "I already posted some numbers here - https://forum.beyond3d.com/posts/1889628/"

Both systems were capped to 60 FPS with v-sync.
In my experience, the X-wing training mission has a higher framerate than most of the landscape missions, so the average framerate for a reference 950 is most likely around 94 fps.
Quote: "There's something wrong with the system description. It says "Core i7 4790k" with 4x4 DDR4 2600. That's just not possible, as Haswell and the Z97 boards don't support DDR4. Furthermore, the numbers don't add up. A supposedly lower-power system with the GTX 950 was found to consume close to 160W in Dragon Age Inquisition (same Frostbite 3 engine)."

The slide has an error in it with the memory; the video about it correctly says DDR3.
Quote: "Both systems were capped to 60 FPS with v-sync."

I know, but I wonder why some people here have already decided that the demonstrated Polaris GPU is a match for the GTX 950 in uncapped performance.
Quote: "I know, but I wonder why some people here have already decided that the demonstrated Polaris GPU is a match for the GTX 950 in uncapped performance."

What he is forgetting to take into account is the die size. If AMD just shrunk Fury X, theoretically, it would be <300mm² at <150W. If they wanted to make that <300mm² die use ~250-275W, they could just increase frequency by 30% and call it a day.

But has either Nvidia or AMD ever done that in the past? No, they typically try to find the sweet spot in regard to both metrics. Like if they find they can reduce power per transistor by ~40-45% but also increase frequency by ~10-15%, which would still allow them to yield a near doubling of transistor count:

~16b transistors, ~500-550mm², ~1.1-1.2GHz, <275W.

That should be good for close to 2x over Fury X performance.
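That napkin math checks out against Fury X's published figures (~8.9B transistors, 596mm², 1050MHz), if you grant the idealisations (a straight 2x density gain, and performance scaling linearly with transistors times clock):

```python
# Fury X baseline: published figures.
fury_transistors, fury_area_mm2, fury_clock_mhz = 8.9e9, 596, 1050

density_28nm = fury_transistors / fury_area_mm2
density_finfet = 2 * density_28nm            # assumed straight 2x density gain

big_die_transistors = 16e9                   # the ~16b figure above
big_die_clock_mhz = 1150                     # midpoint of the ~1.1-1.2 GHz guess

area = big_die_transistors / density_finfet
speedup = (big_die_transistors / fury_transistors) * (big_die_clock_mhz / fury_clock_mhz)

print(f"area: {area:.0f} mm^2")              # ~536 mm^2, inside the 500-550 range
print(f"idealised speedup: {speedup:.2f}x")  # ~1.97x, i.e. "close to 2x"
```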
Quote: "What he is forgetting to take into account is the die size. [...] That should be good for close to 2x over Fury X performance."

I think you're right and I've just over-reacted. What you've described is what has happened with node transitions historically.

Quote: "What he is forgetting to take into account is the die size. [...] That should be good for close to 2x over Fury X performance."
There are some problems here; specifically, with the new FinFET nodes you don't get both a 2x transistor-per-area shrink and an improvement in clockspeed for a given power usage. You could, theoretically, shrink a Fury/Titan X down to ~300mm², but you'd get roughly the same power usage as before. A tradeoff will need to be made, and only Nvidia and AMD know what tradeoffs they've chosen for the moment.
That would require zero power scaling from going to the FinFET node. Poor (but still non-zero) scaling in power per transistor was why 20nm was so disappointing, and why FinFET has been so eagerly awaited. They do get the density increase and improved power per transistor, due to the structural change of the devices.
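To put rough numbers on the disagreement: if you read Macri's "50-60% power reduction at the same performance" as roughly 0.4-0.5x power per transistor at unchanged clocks (an inference for illustration, not a figure AMD gave), the two scenarios above compare like this:

```python
# Power relative to the original 28nm part (normalised to 1.0).
for power_per_transistor in (0.40, 0.50):      # assumed FinFET scaling at iso-clock
    shrink_only = 1.0 * power_per_transistor   # same transistor count, ~half the area
    doubled_die = 2.0 * power_per_transistor   # 2x transistors, same clocks
    print(f"per-transistor power {power_per_transistor:.2f}x -> "
          f"straight shrink: {shrink_only:.2f}x, doubled die: {doubled_die:.2f}x")
# With non-zero scaling, a straight shrink lands at ~0.4-0.5x the original
# power, not "roughly the same"; only zero per-transistor scaling keeps it at ~1.0x.
```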