AMD: Navi Speculation, Rumours and Discussion [2017-2018]

Right now, I think Vega 10's biggest problem that AMD didn't see coming (or failed to achieve with GF) is that the chip just hits a huge efficiency wall above 1.4GHz.

Vega 10 has 40% more transistors than Fiji while keeping the same number of execution units, halving the number of memory channels and bringing relatively few changes other than RPM. AMD stated that most of the additional transistors were put there to drive the clocks higher, but Vega 10 only clocks 40% higher than Fiji.
It's like the chip is gaining almost nothing from the 28nm -> 14FF transition, because 1.4GHz is practically what a GM200 can hit within 300W.
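As a quick sanity check of those ratios, here is a sketch using commonly cited figures; the transistor counts and the Fury X clock are my own assumptions for illustration, not numbers from the post:

```python
# Rough check of the "40% more transistors, ~40% higher clocks" comparison.
# Figures are commonly cited numbers, assumed here for illustration.
fiji_transistors = 8.9e9      # Fiji (Fury X), ~8.9B transistors (assumed)
vega10_transistors = 12.5e9   # Vega 10, ~12.5B transistors (assumed)
fiji_clock = 1050             # Fury X clock, MHz (assumed)
vega10_clock = 1400           # the "wall" clock discussed above, MHz

print(f"Transistor ratio: {vega10_transistors / fiji_transistors:.2f}x")  # ~1.40x
print(f"Clock ratio:      {vega10_clock / fiji_clock:.2f}x")              # ~1.33x
```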

Then we have statements from @Nebuchadnezzar saying the 14FF process was never meant to be used with voltages anywhere near Vega's default values. And pre-launch statements from AMD officials saying Vega 64 would clock above 1700MHz (it never did, did it?).


It all points to AMD engineers planning on Vega 10 being able to clock comfortably at ~1650MHz with a ~220W TDP and 0.9-0.95V. And when the production units came back they realized the cards wouldn't clock above 1400MHz at that vcore, so they panicked, overvolted and pushed the clocks as much as they could within a sub-300W TDP, which ended up being the 1-1.1V that we saw in the final product.
The 2-3% performance that Vega 10 loses in power saving mode makes it lose a whopping 30% in efficiency, or around 80W. And it was all just to achieve GTX1080 parity, which is a bit stupid IMO because driver tweaks usually amount to much more than that.

I think you mean gain a whopping 30% in efficiency.
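For what it's worth, a quick back-of-the-envelope check of that trade-off does land near the 30% figure. The ~295W balanced-mode baseline below is my assumption; the post only gives the ~80W delta:

```python
# Rough check of the "2-3% performance for ~30% efficiency" trade-off.
baseline_power = 295.0                     # W, assumed balanced-mode draw
power_save_power = baseline_power - 80.0   # ~80 W saved per the post
relative_perf = 0.975                      # ~2.5% performance loss

baseline_eff = 1.0 / baseline_power               # perf per watt (normalized)
power_save_eff = relative_perf / power_save_power
print(f"Efficiency gain: {power_save_eff / baseline_eff - 1:.0%}")  # ~34%
```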
 
Anyone seen this?

https://wccftech.com/amd-new-major-gpu-architecture-to-succeed-gcn-by-2020-2021/
https://www.tweaktown.com/news/60760/amd-working-next-gen-gpu-navi-2020-2021/index.html

"Back in October I exclusively reported that AMD would be launching Navi during SIGGRAPH 2018, in July/August this year. At the time everyone else called me crazy and that it wasn't going to launch that early, but now I've got other sources that are backing up my claims and saying that I was right. The first Navi reveal will be a professional card and not a consumer-focused Radeon RX Navi at all, that card will launch in 2019. Then a story on WCCFTech popped up which said that "Navi will be the last GCN-based architecture" and that it will be "succeeded by brand new micro-architecture in 2020/2021 timeframe, execution of GCN successor being led by new RTG leadership". This is interesting, so I reached out to some sources to clarify a few things. First, we have no idea how much the next-gen GPU architecture (post-Navi) was influenced by now ex-RTG boss Raja Koduri (love you, Raja!), who has defected and left for CPU competitor Intel. My sources said that Vega was such a "disaster" (their words, and mine)
and that the RTG team "don't know where they're going forward" and that (gulp) Navi is going to be "just as bad" as Vega."

From what I've heard, the new µArch was the point where Raja Koduri took his jacket and went out to look for a place where he could create something more straightforward in terms of efficient gaming tech.

And well, a lot of decision-makers at AMD/RTG are happy with the cryptomining boom, leading to ... being too satisfied with the status quo. And that's dangerous for a company yelling against incrementalism.

Binning is being used, if activated by AMD in your driver, for specific apps.

Right now, I think Vega 10's biggest problem that AMD didn't see coming (or failed to achieve with GF) is that the chip just hits a huge efficiency wall above 1.4GHz.

Vega 10 has 40% more transistors than Fiji while keeping the same number of execution units, halving the number of memory channels and bringing relatively few changes other than RPM. AMD stated that most of the additional transistors were put there to drive the clocks higher, but Vega 10 only clocks 40% higher than Fiji.
It's like the chip is gaining almost nothing from the 28nm -> 14FF transition, because 1.4GHz is practically what a GM200 can hit within 300W.

Then we have statements from @Nebuchadnezzar saying the 14FF process was never meant to be used with voltages anywhere near Vega's default values. And pre-launch statements from AMD officials saying Vega 64 would clock above 1700MHz (it never did, did it?).


It all points to AMD engineers planning on Vega 10 being able to clock comfortably at ~1650MHz with a ~220W TDP and 0.9-0.95V. And when the production units came back they realized the cards wouldn't clock above 1400MHz at that vcore, so they panicked, overvolted and pushed the clocks as much as they could within a sub-300W TDP, which ended up being the 1-1.1V that we saw in the final product.
The 2-3% performance that Vega 10 loses in power saving mode makes it gain a whopping 30% in efficiency, or around 80W. And it was all just to achieve GTX1080 parity, which is a bit stupid IMO because driver tweaks usually amount to much more than that.

That's the direction Vega 64 was meant for,
similar to the big efficiency gain of Fury X vs. R9 Nano, which btw got really close to the GTX 980 in efficiency.

Sometimes I think maybe "clocking the shader cores lower" and "upping the front end" like Nvidia did several years ago could be a way to better efficiency. ... meh, I'm surely wrong.

HardOCP did some clock-for-clock Fury X vs. Vega 64 testing ... and well, nearly nothing changed.
GCN is GCN.
Some clock higher, some have additional compression techniques, but with the status quo in software, the really big changes through innovative features on top of that bring no advantages.

Maybe building for tomorrow with a focus on today's software would be smarter.
AMD made a few mis-speculations by going for the software from "The Day After Tomorrow".
 
From what I've heard, the new µArch was the point where Raja Koduri took his jacket and went out to look for a place where he could create something more straightforward in terms of efficient gaming tech.
If we are to use the rumored 2020/2021 time frame for the next generation, or AMD's ~2020 range from its presentations, the implication of a 4-5 year rule of thumb for a new architecture is that Koduri was checked out by ~2016, or even late 2015 at the upper end, unless AMD is shading some truth about the extent of the redesign or the roadmap's accuracy.
Granted, I could see why he was allowed to leave if the graphics group's effectiveness was declining that early and AMD's architectural cadence took a hit of a year or more.

The rest of the rumor concerning the designers not knowing where to go is difficult to substantiate, and it would be easy to misinterpret the context or visibility such rumor sources might have.
If it were to be given credence, that they have such a poor idea of what direction to take may have deeper meaning, since GCN's evolution had a fair amount of its direction and iteration rate influenced by its console customers. Semi-custom being moved into that bucket doesn't reduce the appearance of that link. What would it mean for the GPUs or the consoles if semi-custom projects weren't giving some idea of where RTG is supposed to go?

And well, a lot of decision-makers at AMD/RTG are happy with the cryptomining boom, leading to ... being too satisfied with the status quo. And that's dangerous for a company yelling against incrementalism.
If they are getting content with such a blip, I get the feeling their motivation prior to it wasn't further development in the art.
It's like being pleasantly surprised at extra catering profit on a restaurant that you were about to torch for the insurance money. Doesn't mean you can't still do it once the buffet's done.

Sometimes I think maybe "clocking the shader cores lower" and "upping the front end" like Nvidia did several years ago could be a way to better efficiency. ... meh, I'm surely wrong.
To some extent, AMD's direction was to tack additional specialized work onto its internal graphics pipeline, but it overlaid that work on the same hardware foundation. The front end got longer and the CUs conscripted by it more cramped, which may have pushed the generalist architecture in too many directions simultaneously.

Maybe building for tomorrow with a focus on today's software would be smarter.
AMD made a few mis-speculations by going for the software from "The Day After Tomorrow".
There's also follow-through and iteration. Imperfect predictions can be worked through if iterated on, and even if predicted well there's a need for a healthy number of actual points (implementations) in the approximation of the curve.
 
Yes, I know everything is very vague and surely almost no source has full sight of everything, but with some awareness one can maybe see tendencies.
 
I think a 7nm Vega would match GTX 1080 Ti performance, but AMD won't have much reason to bring it to the gamer market if new chips from Nvidia do the same without four stacks of HBM2. Nor would they do a bigger Vega chip with six shader engines, etc. Come to think of it, when they say scalability with Navi, do they mean scalability a la Nvidia's scaling of chips, where the front and back ends increase with the shaders? No need for an exotic Ryzen-like setup.
 
If they mean that the dies can be stuck together on top of an MCM or interposer, there would be no need to make a large, expensive Navi die.

2x 1080 performance, if they stuck two together, would make sense for the next-gen high end.

If they can't do that or performance isn't good, then they will be in trouble.
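As a rough illustration, how close a two-die card gets to "2x 1080" depends entirely on how well work scales across the dies; the scaling factors in this toy model are made-up assumptions, not anything AMD has stated:

```python
# Toy model of the multi-die idea: aggregate performance under an assumed
# inter-die scaling efficiency (a guess, purely for illustration).
def mcm_performance(single_die_perf: float, num_dies: int, scaling: float) -> float:
    """Each extra die adds `scaling` of a single die's worth of performance."""
    return single_die_perf * (1 + scaling * (num_dies - 1))

gtx1080_perf = 1.0  # normalize one mid-range die to "1080-class"
for scaling in (1.0, 0.9, 0.75):
    print(f"2 dies @ {scaling:.0%} scaling -> "
          f"{mcm_performance(gtx1080_perf, 2, scaling):.2f}x")
```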
 
Oh no, the first Navi GPUs are meant for their best-selling market segments, who knew???

I'd also not be surprised at the multi die thing being delayed. Vega clearly shows Koduri made the classic engineering mistake of "do all the new things all at once and it'll all work out great!" Yeah, no. Even if he'd gotten ten times the budget it still doesn't mean this sort of thing works. You can look at other classic engineering headaches like the Boeing 7E7, which had all the budget and time in the world, and still came in many years delayed and over budget. If Boeing had been in a more high pressure, time sensitive market like computer hardware it would've sunk the company.

I'm sure AMD is glad to see the back of him and to bring on people with more sane, sensible, and achievable goals. Hitting the clockspeeds Vega was supposed to hit without requiring a nuclear reactor, getting tiled rasterization to work so the backup silicon can be gotten rid of, moving to 7nm before Nvidia, and, say, getting hardware-level support for DirectX Raytracing are all things Navi would probably have to achieve to be a success. And if Nvidia is really going to take that long to get to 7nm, then there's no rush to get to a high-end Navi either. Especially not with whatever Vega 20 is supposed to be filling in for the ultra-high-end compute market.
 
If it's a new RV770, maybe it can be good even against nVidia... ?
Given AMD's GCN product history, it is more likely to be another Polaris - hyped, late and hot.

The MCM idea will surely be the next hype-train thing. People hoping for "gluing" 4 mid-range Navis together... not embracing the simple truth.
 
Yeah, for me Navi = MCM, but it seems that's wrong, so... but AMD said that in some slides, no?

No, they talked about scalability, which can mean everything. It can also mean that starting with Navi, everything in the chip will scale with the number of SMs, like NV is doing with the PolyMorph engines. Vega 64 with 64 CUs still looks bad in most cases against Vega 56 at the same clock.
 
Yeah, for me Navi = MCM, but it seems that's wrong, so... but AMD said that in some slides, no?
No, they did not state that. The only real basis for this MCM nonsense is a slide having a "Scalability" label next to Navi. A GPU can scale on almost any metric. Scaling the number of dies in a super-novel way doesn't really hold up against Occam's razor.
 