Next Generation Hardware Speculation with a Technical Spin [pre E3 2019]

Zen 1 with better/wider AVX units? Like a Zen and Zen 2 hybrid? Anyway, it'll be a big upgrade compared to what we have.

But I'm thinking more of a 6-core/12-thread CPU, just because of yields. Or even 7-core CPUs. Like, ok, it's an 8-core CPU, with 7 cores enabled (for yields), and 1 core/2 threads dedicated to background OS tasks.
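Purely as an illustration of what "dedicating" a core to background tasks could mean, here's a minimal sketch using Linux CPU affinity as a stand-in for whatever a console OS would actually expose. The core count, the reserved core index, and the use of sched_setaffinity() are all assumptions made for the example, not anything known about the actual hardware or SDK:

```c
/* Hypothetical sketch: keep a process's threads off one core that is
 * reserved for background OS tasks. The core count, the reserved core
 * index, and the use of Linux's sched_setaffinity() are illustrative
 * assumptions, not how a console SDK would actually expose this. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

#define NUM_CORES     7   /* assumed: 8-core die with 7 cores enabled  */
#define RESERVED_CORE 6   /* assumed: last enabled core left to the OS */

int main(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);

    /* Allow this process on every enabled core except the reserved one. */
    for (int core = 0; core < NUM_CORES; ++core) {
        if (core != RESERVED_CORE)
            CPU_SET(core, &mask);
    }

    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    printf("Game threads restricted to cores 0-%d; core %d left for OS tasks.\n",
           RESERVED_CORE - 1, RESERVED_CORE);
    return 0;
}
```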
 
Quoting from the message:

“IF the scale is right...”

When there is an IF there is doubt, so no one is claiming it is correct. The point was just to show that AMD expects big gains, and an illustrative picture, if that is truly what it is (and we only have your opinion on that, although I could very well agree), could very well point to that, even if not accurate.

But personally, I would not state that the image is a mere illustration without any intent of precision (even if only a prediction at the time), because I do not know that for a fact.

I would also not say that the chart being old makes it stupid. If it is misleading about what AMD expects from Navi (or at least expected), then it would be stupid. But the mere fact of being old... just makes it old.
I think the intention of the graphic may be more related to the context in which it came up: as marketing, and possibly more for investors than gamers.
It was intended to make people believe something--that AMD's graphics products had a timely path to significant progress and competitiveness.
Its lack of clarity about what it was measuring, what specific products represented the data points, and debatable placement on the Y and X axes does not point to an intent to be accurate or informative.

Even at the time, attempts to analyze the implications of the chart and be charitable in predicting what data points AMD was using without saying did not really give a healthy picture versus the competition.
The timing for 28nm products could likely be counting some of the less impressive 28nm chips like the early Tonga desktop cards (definitely not the HBM-based Fury Nano). The Polaris data point doesn't seem to have been the initially troubled rollout of Polaris 10. I've seen some speculation that this needs to use a mobile Polaris SKU of some kind to get enough Y distance between the nodes. That's not comparing products with similar use cases, which I wouldn't say is evidence of an intent to inform.
Even at the time the image came out, it was getting kind of rough reconciling the launch dates in reality versus the graphic, and it's basically meaningless on that axis now.

Overall, if the idea is to lead people to make a favorable prediction, that might be more what AMD wanted. It didn't offer enough to do much else, and what we've seen since has left little in evidence beyond the wishful thinking of the time.


Is this possible for the Next Gen?

[Attached image: 2S0FRWZ.png, a block diagram of a proposed next-gen GPU layout]


Thanks for your time.
It's a pretty simple set of boxes and labels, which doesn't say much about what they do or what can be expected of them.
One little thing is that the L1 I$ just has arrows feeding into it, which given the rules set down by the diagram doesn't seem to indicate it can do anything.

By asking if this is possible for a future design, it's low-detail and it can be argued that it's not impossible--so long as the definition and behaviors of the various blocks are changed in undisclosed ways. There's some interpretation that could be dreamed up that isn't outright broken, probably.
What has been omitted, like a decent portion of the geometry front end that would have been alongside the DSBR and primitive assembler, the workload distributor, shader launch hardware, etc., could be present but not drawn. Or is there an unspoken claim of software replacement?
The ratio of front-end hardware to compute is notably high per GCX versus what's been in most shader engines. It's not outright impossible, but from drivers and other programming discussions I think the minimum amount of geometry wavefront allocation would likely seriously impede pixel and compute progress.

There's a bidirectional arrow between each CU and its L1D, which is fine I guess, though I'm not sure what having that arrow between a CU and its own subunit tells us. Unless there's an unspoken claim about separating the L1 from the CU.

The RBE's in a strange place, going by what we know they do in current architectures. That doesn't rule out an unspoken claim about changing the behavior significantly. However, where it lies is an odd position. Current designs have the RBE on a dedicated data path that CUs arbitrate access to and put data on from their vector registers. Some patents that may concern themselves with a next-gen GPU still have the RBEs on some kind of vector export path. In this diagram, the L1s are in the way. It is possible to dream up a way to make this work, but we have little basis for it.

The diagram has the caches and RBE feeding into what seems to be an IF block, which again isn't impossible if a lot of changes were made to everything involved, but that's speculation with little basis and probably a host of problems.
I suppose that IF block is part of the data fabric block in the lower part of the diagram, which goes to the L2, which goes to the memory controller.
I would say that if any of these elements are similar to what we know now, there would likely be serious problems.
One issue is that while this diagram doesn't need to be accurate as to the exact widths of data paths or links, there's a significant number of connections in the upper half versus what becomes almost a flow chart in the bottom half.
I think that if we took the arrows literally, there'd be dozens of caches going into a fabric that then has one link to an L2.

I'd say that if we're extrapolating from current designs, the IF is not in a good place. The L1s are not coherent with one another, so a good portion of the fabric's capability is not wanted between the L1s and L2. The RBE is not coherent at all, and in Vega plugs into the L2 rather than into a coherent fabric. In this diagram, the bandwidths involved are unclear, though it seems like there would be serious contention for bandwidth within a GCX and a straw for data to get out of the L2.
The front-end hardware in the GCX may generate data relevant to other GCX blocks, unless there's an unspoken claim about how the front-end communications work in this design, and they're now broadcasting into that nest of L1s and data fabric blocks. The L2 sort of loses its place as a coherency point if the fabric is on the wrong side of it, and now there's no fabric linking the L2 to anything else like other hardware blocks or the system at large.
IF in Vega links up a set of memory controllers and some number of L2 stops, and it's already noticeable in how much area it takes up. Linking up as many clients as there are CUs, RBEs, and other caches significantly overloads what we'd consider a known implementation of IF, unless the clients see much more starvation than they do currently.
Vega 10's IF section is not overwhelmingly large, but in comparing Vega 10 to earlier non-IF GPUs it seems like it takes up a lot more area versus what linked the L2s to the memory controllers before. Vega 20 has a lot of non-GPU area, and possibly an area all around the die that's just the IF mesh. This proposal may make the IF take up a lot of area, and barring some proposed change in the behavior of the sub-units, many of them do not want to make use of much of its features.
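To put a rough number on the contention point above, here's a back-of-the-envelope sketch. Every figure in it (CUs per GCX, per-L1 bandwidth, the width of the single fabric-to-L2 link) is an assumption invented for illustration, not something taken from the diagram or any real part:

```c
/* Back-of-the-envelope sketch of the contention argument above.
 * Every figure is an invented assumption, not a spec of any real
 * or rumored part. */
#include <stdio.h>

int main(void)
{
    const int    cus_per_gcx    = 10;    /* assumed CUs per GCX block         */
    const int    gcx_count      = 4;     /* assumed GCX blocks on the chip    */
    const double l1_bw_gbs      = 64.0;  /* assumed per-CU L1 bandwidth, GB/s */
    const double l2_link_bw_gbs = 512.0; /* assumed single fabric-to-L2 link  */

    /* If every L1 really has its own arrow into the fabric, the aggregate
     * demand that single L2 link could see is roughly: */
    double aggregate_l1_bw = cus_per_gcx * gcx_count * l1_bw_gbs;

    printf("Aggregate L1 demand : %.0f GB/s\n", aggregate_l1_bw);
    printf("Single L2 link      : %.0f GB/s\n", l2_link_bw_gbs);
    printf("Oversubscription    : %.1fx\n", aggregate_l1_bw / l2_link_bw_gbs);

    /* Even with generous L1 hit rates, dozens of clients behind one link
     * is the "straw" described above. */
    return 0;
}
```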
 
I believe you to be correct, 3dilettante.
But I believe that the purpose of the chart, even if inaccurate, was to show investors that AMD would bet on power consumption reduction, and that each future iteration would have big gains in that regard.
I believe the whole discussion about the chart has to do with the pixel measurement. Well, I just happened to find that image, so I used it since it showed what I was saying. Prediction, mere illustration, schematic, or whatever, AMD tried to show that performance per watt would be a bet for the future and that we can expect power gains to be big.
They were on Polaris, they were on Vega, and we have data to believe they will be on Navi, like the recent patches on Linux that reveal preparation for a new System Management Unit to be found on future ASICs, responsible for power management tasks. Since Navi is not even released, this must be for it!
Besides, we also had reports that some engineers from the Ryzen team were diverted to Navi to implement changes in power consumption, and lots of rumours, like the one from Chiphell, that claim Navi's power consumption is set to be surprising.
Truth or not, power consumption reductions in Navi seem set to be superior to the ones offered by the mere node shrink. And if the node shrink allows, as claimed by TSMC, something between 40 and 60%... we can expect more than that.
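As a toy calculation of why "more than the node" is plausible: the 50% node figure below just sits inside the 40-60% range TSMC is quoted as claiming above, and the 15% architectural figure is a made-up placeholder, not an estimate of Navi:

```c
/* Toy calculation: composing a process-node power reduction with an
 * architectural one. The 50% node figure sits inside TSMC's claimed
 * 40-60% range quoted above; the 15% architectural figure is a
 * made-up placeholder, not an estimate of Navi. */
#include <stdio.h>

int main(void)
{
    double node_power_factor = 1.0 - 0.50;  /* node: 50% lower power at iso-performance */
    double arch_power_factor = 1.0 - 0.15;  /* hypothetical uarch/power-management gain */

    double combined = node_power_factor * arch_power_factor;

    printf("Node alone          : %.1f%% of previous power\n", node_power_factor * 100.0);
    printf("Node + architecture : %.1f%% of previous power\n", combined * 100.0);
    /* 0.50 * 0.85 = 0.425, i.e. a ~57.5% total reduction, which is more than
     * the node shrink alone: the point being made above. */
    return 0;
}
```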
 

Benchmark results vs Vega 64. Mountain of salt and all of that.

https://compubench.com/compare.jsp?benchmark=compu15d&did1=71874520&os1=Windows&api1=cl&hwtype1=dGPU&hwname1=AMD+66AF:F1&D2=AMD+Radeon(TM)+RX+Vega


Observations from reddit:

Very interesting results.

Face detection is pure compute and it loses to V56 there badly, suggesting a low peak TFLOPS.

However, in other workloads that are graphics-oriented, it wins.

In the ocean surface simulation which is geometry heavy, it wins by a massive margin.

Dare we hope, Navi = graphics optimized uarch of GCN?


Ocean surface simulation is FFTs, which are ridiculously memory-bandwidth bound. Rather than Navi, I'd guess it's a new Vega 20 variant.
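For what it's worth, a rough arithmetic-intensity estimate shows why a large FFT pass tends to land on the bandwidth side of the roofline. The FLOP count uses the standard ~5*N*log2(N) approximation for a complex FFT; the GPU peak numbers are round placeholders rather than any particular card:

```c
/* Rough arithmetic-intensity estimate for one large complex FFT,
 * illustrating why such passes tend to be limited by memory bandwidth.
 * The GPU peak figures are round placeholders, not any specific card.
 * Compile with: cc fft_roofline.c -lm */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double n     = 1 << 20;            /* assumed 1M-point complex FFT          */
    const double flops = 5.0 * n * log2(n);  /* standard ~5*N*log2(N) estimate        */
    const double bytes = 2.0 * n * 8.0;      /* best case: read + write the whole set
                                                of 8-byte complex floats exactly once */

    const double peak_tflops = 10.0;   /* placeholder GPU compute, TFLOPS */
    const double peak_bw_gbs = 450.0;  /* placeholder GPU bandwidth, GB/s */

    double intensity = flops / bytes;                              /* FLOPs per byte */
    double ridge     = (peak_tflops * 1e12) / (peak_bw_gbs * 1e9); /* roofline ridge */

    printf("FFT arithmetic intensity : %.1f FLOPs/byte\n", intensity);
    printf("Roofline ridge point     : %.1f FLOPs/byte\n", ridge);
    printf("Kernel is likely %s bound\n", intensity < ridge ? "bandwidth" : "compute");
    return 0;
}
```

Even in the best case of touching the data only once, the intensity stays well under the ridge point, so adding raw bandwidth helps this test more than adding ALUs.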
 
I believe Microsoft and Sony are working towards the same thing, and as pointed out may end up with similar specs while going a different route. That different route is probably because each signs exclusivity for their design with AMD. And I think both Sony and Microsoft are holding back, putting more into the consoles' power and capabilities than into cooling or cost of manufacturing/design. Even older controllers will work... both will be a SFF frag box.

I also think AMD's Infinity Fabric (IF) will take on a bigger role in the upcoming consoles, as I think it will in all of AMD's APUs. And perhaps AMD's chiplet design will allow multiple paths for future "upscaling" of their consoles, allowing Sony and MS to offer tiered versions of the next-gen consoles.



Speaking loosely, but for illustrative purposes:
What if tensor cores become the "next thing" for console-style games, in which gamers need/consume more of them? Couldn't AMD just spin off a tensor-farm chip, add it as a chiplet, and use IF to tie the tensor farm into the GPU?
 
Zen 2 confirmed for PS5?

https://reviews.llvm.org/D58343
Reviewers: RKSimon, craig.topper
Commits:
rL354897: [X86] AMD znver2 enablement
rGe172d7008d0c: [X86] AMD znver2 enablement

Summary: This patch enables the following:

  1. AMD family 17h "znver2" tune flag (-march, -mcpu).
  2. ISAs that are enabled for "znver2" architecture.
  3. For the time being, it uses the znver1 scheduler model.
  4. Tests are updated.

Here's the source in case anyone wants to verify Zen2 is just a copy of Zen1 features for now:

https://llvm.org/doxygen/Host_8cpp_source.html
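For anyone curious what that Host.cpp detection boils down to, here's a simplified sketch of the CPUID family/model decode it relies on. This is illustrative only, not LLVM's actual code; as noted above, at the time of this patch the host detection still reports family 17h parts as plain "znver1":

```c
/* Simplified sketch, in the spirit of LLVM's Host.cpp, of the CPUID
 * family/model decode used for host CPU detection. Illustrative only:
 * the real code also checks the vendor string, feature bits, and many
 * other families, and at the time of the patch above it still reports
 * every family 17h part as "znver1". */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        puts("CPUID leaf 1 not supported");
        return 1;
    }

    unsigned base_family = (eax >> 8)  & 0xF;
    unsigned ext_family  = (eax >> 20) & 0xFF;
    unsigned base_model  = (eax >> 4)  & 0xF;
    unsigned ext_model   = (eax >> 16) & 0xF;

    /* On AMD, the extended fields are folded in when the base family is 0xF.
     * Family 0x17 (0xF + 0x8) covers the Zen-derived cores. */
    unsigned family = (base_family == 0xF) ? base_family + ext_family : base_family;
    unsigned model  = (base_family == 0xF) ? (ext_model << 4) | base_model : base_model;

    if (family == 0x17)
        printf("AMD family 17h, model 0x%X: a Zen-derived core\n", model);
    else
        printf("Family 0x%X, model 0x%X: not family 17h\n", family, model);

    return 0;
}
```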

Edit: added question mark.
 
Zen 2 confirmed for PS5.

https://reviews.llvm.org/D58343
[...]
How does this link happen? Or rather, how do we know it's confirmed via a compiler update? Just curious to follow the logic here. Seems like a rather interesting way to accidentally leak your bits.

I'm looking at the committed changes, and I don't see the 'ah ha' that makes this a PS5 confirmation.
 
How does this link happen? Or rather, how do we know it's confirmed via a compiler update? Just curious to follow the logic here. Seems like a rather interesting way to accidentally leak your bits.

I'm looking at the committed changes, and I don't see the 'ah ha' that makes this a PS5 confirmation.

Going by the committer, are they associated with Sony?

Scratch that, they are with AMD: https://reviews.llvm.org/p/GGanesh/, who is
https://www.researchgate.net/profile/Ganesh_Gopalasubramanian.
 
How does this link happen? Or rather, how do we know it's confirmed via a compiler update? Just curious to follow the logic here. Seems like a rather interesting way to accidentally leak your bits.

I'm looking at the committed changes, and I don't see the 'ah ha' that makes this a PS5 confirmation.
Simon Pilgrim (RKSimon) is a lead compiler engineer for Sony. He has dozens and dozens of commits to his name, and he’s a reviewer on a ton of stuff, including a lot of AVX512 commits.
 
Simon Pilgrim (RKSimon) is a lead compiler engineer for Sony.

But he's just a code reviewer for this change. If the LLVM process is similar to all the projects I've been involved in... At work I often review UI code and Oracle SQL, but I don't work in those groups. He's just looking for obvious mistakes or improvements to make, or asking why some esoteric things are done. The Internet is reading far too much into code reviews.
 
But he's just a code reviewer for this change. If the LLVM process is similar to all the projects I've been involved in... At work I often review UI code and Oracle SQL, but I don't work in those groups. He's just looking for obvious mistakes or improvements to make, or asking why some esoteric things are done. The Internet is reading far too much into code reviews.
He has numerous hits per day, at all times of day, including the weekend. I don't know how much more obvious it could be that it's a large part of his job. SN Systems specifically work on compiler optimizations all the time and often present at LLVM developer meetings.
 
He has numerous hits per day, at all times of day, including the weekend. I don't know how much more obvious it could be that it's a large part of his job. SN Systems specifically work on compiler optimizations all the time and often present at LLVM developer meetings.

Yeah, that's usually what senior engineers' schedules are like, just like my timeline at my place of work, and I still wouldn't say his reviewing of others' check-ins means it has anything to do with the PS5. The only obvious thing to me, and it should be to everyone, is that he's making sure those check-ins don't break the entire LLVM project for what Sony uses it for. That's all. Nothing more.
 
Yeah, that's usually what senior engineers' schedules are like, just like my timeline at my place of work, and I still wouldn't say his reviewing of others' check-ins means it has anything to do with the PS5. The only obvious thing to me, and it should be to everyone, is that he's making sure those check-ins don't break the entire LLVM project for what Sony uses it for. That's all. Nothing more.
Like all the AVX512 code they manage?

These very same types of LLVM commits were spotted and reported as confirmation that Sony was using a Zen derivative as of last year. Why didn't anyone challenge it when Phoronix first published it?
 
Simon Pilgrim (RKSimon) is a lead compiler engineer for Sony. He has dozens and dozens of commits to his name, and he’s a reviewer on a ton of stuff, including a lot of AVX512 commits.
Right, I get that part, but why commit back to the original? I can't see how this is related to Sony.
If you're making huge strides on Sony's compiler, and you only need to compile for 1 platform type, why send the changes back to the main branch so that their competitors can benefit from the same optimizations? It doesn't seem to make a lot of sense; there's only 1 way I know of to load your game onto a PlayStation console, and that's through SN Target, so I don't see the reasoning here to commit changes back.

Like usually there is a hard separation between your work and open source, and I've seldom seen a lot of the two overlap. How do we know he isn't just doing this because he likes to?
 
Right, I get that part, but why commit back to the original? I can't see how this is related to Sony.
If you're making huge strides on Sony's compiler, and you only need to compile for 1 platform type, why send the changes back to the main branch so that their competitors can benefit from the same optimizations? It doesn't seem to make a lot of sense; there's only 1 way I know of to load your game onto a PlayStation console, and that's through SN Target, so I don't see the reasoning here to commit changes back.

Like usually there is a hard separation between your work and open source, and I've seldom seen a lot of the two overlap. How do we know he isn't just doing this because he likes to?
Sony are heavily committed to LLVM and regularly present at its conferences. They're also active in contributing to llvm-mca and llvm-exegesis. They're doing it for the developer community that uses their platforms.

You can go back further to PS3 and see their activity with FreeBSD.

And now we’re suggesting he’s doing this for funsies on company time just because?
 
Sony are heavily committed to LLVM and regularly present at its conferences. They're also active in contributing to llvm-mca and llvm-exegesis. They're doing it for the developer community that uses their platforms.

You can go back further to PS3 and see their activity with FreeBSD.

And now we’re suggesting he’s doing this for funsies on company time just because?
Yea but they support more than 1 processor, they support as much as possible. Why does seeing Zen 2 make it a lock? There's a lot of other reasons why I think it's Zen 2, I'm just trying to see the reasoning here on why this is a lock confirm.

Basically, I'm asking is there no other possibility at all that could be a reason for them to commit Znver1 and Znver2? Does this _have_ to be linked to PS5 improvements?

A lot of developers will contribute to different projects they don't own, I wouldn't say for funsies, but because they like to contribute to those projects.

I mean straight from wikipedia:
LLVM is written in C++ and is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs written in arbitrary programming languages. Originally implemented for C and C++, the language-agnostic design of LLVM has since spawned a wide variety of front ends: languages with compilers that use LLVM include ActionScript, Ada, C#,[4][5][6] Common Lisp, Crystal, CUDA, D, Delphi, Fortran, Graphical G Programming Language,[7] Halide, Haskell, Java bytecode, Julia, Kotlin, Lua, Objective-C, OpenGL Shading Language, Pony,[8] Python, R, Ruby,[9] Rust, Scala,[10] Swift, and Xojo.

That's a lot of stuff LLVM can support here.

Hypothetical situation here, but let's just say, like MS, they decide to run servers full of EPYC and Zen 2. Are you saying, looking at the above languages, you wouldn't want to improve the compiler for Zen 2 and EPYC-type processors? Does a compiler team for Sony _only_ work on PlayStation hardware?
 
Yea but they support more than 1 processor, they support as much as possible. Why does seeing Zen 2 make it a lock? There's a lot of other reasons why I think it's Zen 2, I'm just trying to see the reasoning here on why this is a lock confirm.

[...]

Hypothetical situation here, but let's just say, like MS, they decide to run servers full of EPYC and Zen 2. Are you saying, looking at the above languages, you wouldn't want to improve the compiler for Zen 2 and EPYC-type processors?
Then you need to answer your own question. How does this benefit Sony?
 
Then you need to answer your own question. How does this benefit Sony?
I'm neither agreeing nor disagreeing with your point. Zen 2 is designed for 7nm by default, and we know next gen is 7nm. Otherwise you've got to pay the cost of shrinking Zen 1 to 7nm. That's a strong argument.

A lot of studios have tools that they use. A lot of tools can be coded in C#, Lua, whatever it may be. If you're using Ryzen processors because you have a deal with AMD and it's cheaper than running a massive Core i7 setup because of Threadripper, why not make compiler optimizations so that those tools run better? You're getting better performance for cheaper, and that can apply across all the studios that develop tools for Sony or for themselves. But those tools are being run on Zen processors, not necessarily on the console. You still get massive gains, because any time we can get faster iteration we get lower development costs.
 