AMD: Speculation, Rumors, and Discussion (Archive)

Status
Not open for further replies.
Earlier discussions tried to sort out the usage of each entry:

https://forum.beyond3d.com/threads/...peculation-rumors-and-discussion.56719/page-9

There seem to be some tweaks, but it's going back to a minor bump in version from Tonga for everything but multimedia. Is DCE related to compute? That seems to be inherited from Carrizo. (edit: Or is it the display controller?)
Blocks that lack their own *new* tag, like the rasterizer and color blocks, could be what is housed under the GFX label.

DCE is the Display Controller Engine. I first saw it in the leaked PS4 specs; they never said what it was, but I found out later.
 
When looking at the IP levels, there seem to be increases for quite a few blocks.

So it really depends what the gfx IP level covers. If it's just the shader core and not things like ROPs and other essential blocks that are not part of the shaders, then they could still add those feature levels while keeping the shaders identical.

There definitely are changes in their codecs. So that concern is probably covered as well.

The GFX block is big - command processors, graphics & compute pipelines, shader core/ISA, CBs/DB (ROPs)... I think texture cache/filtering is in there too but not sure ATM.

3dilettante, the compute block you're thinking of might be MEC (MicroEngine Compute)? Each MEC block manages 4 "pipes", each supporting up to 8 "queues" (rings).
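
For what it's worth, the pipe/queue arithmetic above can be sketched in a few lines. This only uses the two figures from the post (4 pipes per MEC, up to 8 queues per pipe); the function name and the `num_mecs` parameter are made up for illustration and say nothing about how many MECs any given chip has.

```python
# Figures from the post above: 4 pipes per MEC, up to 8 queues per pipe.
PIPES_PER_MEC = 4
QUEUES_PER_PIPE = 8


def mec_queue_slots(num_mecs=1):
    """Enumerate (mec, pipe, queue) slots.

    num_mecs is a hypothetical parameter, not a statement about real hardware.
    """
    return [
        (mec, pipe, queue)
        for mec in range(num_mecs)
        for pipe in range(PIPES_PER_MEC)
        for queue in range(QUEUES_PER_PIPE)
    ]


slots = mec_queue_slots()
print(len(slots))  # 32 queue slots for a single MEC
```

So one MEC would top out at 32 queues under those numbers.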
 
ANOTHER LinkedIn leak? http://wccftech.com/amd-vega-10-4096-stream-processors/

Either AMD has been totally negligent in its NDA agreements and checking for leaks, or these are somehow fake? Either way it's odd for so many "leaks" from the same source to happen in a row, let alone for so many people in a row to make the same mistake of putting presumably NDA'd specifics of unreleased chips on their resume. Oh yes, "let's show that we happily break NDA stuff on our resume, via our resume! This is surely a good way to get hired."

But hell, who knows. Maybe it's true, all of it *Han Solo face.
 
So, ahem, AMD seems to be making mainstream and performance GPUs based upon Polaris that cut down ALUs, just in time for async compute games to come along and demand more compute :sleep:

Perhaps 4096 ALUs is the little Vega.

If it's a "new, efficient" architecture, perhaps we're talking extreme ALUs. Sorry, old old joke.
 
The GFX block is big - command processors, graphics & compute pipelines, shader core/ISA, CBs/DB (ROPs)... I think texture cache/filtering is in there too but not sure ATM.
Ok, thanks for clarifying.

That strengthens my view that Polaris will see perf/W improvements due to process (known) and low level improvements (things like sequential clock gating maybe?), and that Vega will see additional perf/W improvements due to HBM and architecture changes.

I don't think it's a bad way of doing things: it may have been the only way to get a 16nm chip out of the door in time. But I don't expect groundbreaking competitive perf/W either.

Or the v8 IP level for Polaris is just smoke and mirrors and Raja was actually telling the truth about it being a full new architecture... But then what is Vega?
 
Well, they did mention their IPC is going to be better for Polaris. I would think that would be the same for Vega, so maybe 4096 is what they went for?
 
So, ahem, AMD seems to be making mainstream and performance GPUs based upon Polaris that cut down ALUs, just in time for async compute games to come along and demand more compute :sleep:

Perhaps 4096 ALUs is the little Vega.

If it's a "new, efficient" architecture, perhaps we're talking extreme ALUs. Sorry, old old joke.


18 billion transistors for 4096 ALUs? Something doesn't match here...
 
ANOTHER LinkedIn leak? http://wccftech.com/amd-vega-10-4096-stream-processors/

Either AMD has been totally negligent in its NDA agreements and checking for leaks, or these are somehow fake? Either way it's odd for so many "leaks" from the same source to happen in a row, let alone for so many people in a row to make the same mistake of putting presumably NDA'd specifics of unreleased chips on their resume. Oh yes, "let's show that we happily break NDA stuff on our resume, via our resume! This is surely a good way to get hired."

But hell, who knows. Maybe it's true, all of it *Han Solo face.

It's the same LinkedIn leak from pcgh, except videocardz feel that the Greenland SoC might be a Vega chip, which is not a far-fetched assumption to make. 4096 is a nice round number and will very likely show up on a Vega chip. It's likelier to be Vega 11 (assuming the Vega stack has the same numbering as Polaris) than Vega 10, unless AMD are cutting down shaders to improve other parts. 2304 for Polaris 10, 3000-odd for Vega 11 and 4096 for Vega 10.
It also has gfx IP 9.0, and I'm guessing that means feature level 12_1 support.

All this while AMD's open source driver drops any mention of shader counts, leaving them in the BIOS only, so as to keep them secret.

https://semiaccurate.com/forums/showpost.php?p=258461&postcount=514

Though some folks think they can divine the stock clocks from the driver.

http://forums.overclockers.co.uk/showpost.php?p=29320384&postcount=2146
 
Well, they did mention their IPC is going to be better for Polaris. I would think that would be the same for Vega, so maybe 4096 is what they went for?
I think somewhere along the line AMD talked about instruction caching having an effect on IPC.

I don't buy it. There would need to be multiple massive shaders/kernels trying to run on a set of CUs simultaneously to get even close to exhausting instruction cache. I believe 4 CUs share an instruction cache of 32KB.
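
As a rough sanity check on the instruction cache point, here is a back-of-the-envelope sketch. The 32KB figure is the one quoted above; the 4-byte and 8-byte instruction sizes are my assumption about GCN encodings (32-bit and 64-bit forms), and the function name is purely illustrative.

```python
ICACHE_BYTES = 32 * 1024     # figure from the post: 32KB shared by 4 CUs
INSTRUCTION_SIZES = (4, 8)   # assumption: 32-bit and 64-bit GCN encodings


def icache_capacity(cache_bytes=ICACHE_BYTES):
    """Return {instruction_size_in_bytes: instructions that fit in the cache}."""
    return {size: cache_bytes // size for size in INSTRUCTION_SIZES}


print(icache_capacity())  # {4: 8192, 8: 4096}
```

Under those assumptions the cache holds somewhere between about 4,000 and 8,000 instructions, which is why it would take several large shaders resident at once to thrash it.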

The other hits on IPC are taken branches (with whole hardware thread coherence - not much point talking about divergence) and waits.

Branches are already very low cost in GCN.

Waits are a whole other ballgame. The only time waits are really a problem is when GPR allocation is very high. Peculiarly, there isn't much register allocation margin between 10 wavefronts per SIMD (the maximum) and, say, 6, where intense branching and waiting will cause ALU stalls.

The register file is simply too small for complex shaders given the current way GCN works and the stupidity of the compiler, which always maximises register allocation in favour of issuing fewer instructions.
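
To put rough numbers on the register margin, here is a minimal sketch of the occupancy arithmetic. The limit of 10 wavefronts per SIMD is from the post; the budget of 256 VGPRs per lane and the allocation granularity of 4 are my assumptions about GCN, and the sketch ignores SGPR and LDS limits entirely. Function names are illustrative only.

```python
VGPRS_PER_LANE = 256  # assumption: per-SIMD VGPR budget for each lane
MAX_WAVES = 10        # from the post: 10 wavefronts per SIMD is the maximum
GRANULE = 4           # assumption: VGPRs are allocated in blocks of 4


def waves_per_simd(vgprs_per_thread):
    """Wavefronts that fit on one SIMD for a given per-thread VGPR count."""
    if not 1 <= vgprs_per_thread <= VGPRS_PER_LANE:
        raise ValueError("VGPR count must be between 1 and 256")
    allocated = -(-vgprs_per_thread // GRANULE) * GRANULE  # round up to granule
    return min(MAX_WAVES, VGPRS_PER_LANE // allocated)


def max_vgprs_for(waves):
    """Largest per-thread VGPR count that still allows `waves` wavefronts."""
    return (VGPRS_PER_LANE // waves) // GRANULE * GRANULE


for waves in (10, 6, 1):
    print(waves, max_vgprs_for(waves))
# 10 24
# 6 40
# 1 256
```

With these assumptions a shader keeps 10 wavefronts up to 24 VGPRs and is already down to 6 at 40, so the margin is only 16 registers. Anything above 128 VGPRs leaves a single wavefront per SIMD.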
 
The register file is simply too small for complex shaders given the current way GCN works and the stupidity of the compiler, which always maximises register allocation in favour of issuing fewer instructions.
Where did you hear this? I'd like to read about it. Also why doesn't AMD just fix the compiler?
 
Many years writing shaders. Modern compilers are CPU-centric. They have no concept of the trade-off between threads in flight and register allocation. AMD's compiler will allocate registers until only 1 hardware thread can be in flight.
 
Why hasn't AMD tried to fix this behaviour?
Edit: basically, why are they choosing to be dependent on new hardware?
Also, thanks.
 