Next Generation Hardware Speculation with a Technical Spin [2018]

It does all sound like PS5's custom Navi is based on KUMA of the new uArch, so no more being limited to GCN's 64 CU count, which is good. Does this not change the expectation of the teraflops we might reasonably end up getting? 14-15 TF used to be the higher end, but with KUMA this could be well within reach, I hope.
The bottleneck for expected TFs was never a CU count limit but the cost of silicon.
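A rough sketch of why die area drives cost so sharply: dies-per-wafer plus a simple Poisson yield model in Python. The wafer size and defect density below are placeholder assumptions for illustration, not published figures.

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Gross dies per wafer, standard approximation (ignores scribe lines / edge exclusion)."""
    r = wafer_diameter_mm / 2
    return math.floor(math.pi * r * r / die_area_mm2
                      - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2, defects_per_cm2=0.2):
    """Fraction of defect-free dies; 0.2/cm^2 is a made-up number, not a known 7nm figure."""
    return math.exp(-defects_per_cm2 * die_area_mm2 / 100)

for area in (250, 350, 450):
    good = dies_per_wafer(area) * poisson_yield(area)
    print(f"{area} mm^2 -> ~{good:.0f} good dies per 300 mm wafer")
```

Going from ~250 mm² to ~450 mm² roughly triples the cost per good die in this toy model, which is the whole point: area, not CU count, is what you pay for.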
 
True but 7nm would more or less help right?
To provide us some perspective the MI60 is
Process Size: 7 nm
Transistors: 13,230 million (nearly double the X1X's 7B at 16nm, but not quite)
Die Size: 331 mm²
GPU Clock: 1200 MHz
Boost Clock: 1800 MHz
Memory Clock: 1000 MHz, 2000 MHz effective

64 CUs -- 14.72 TF
Somewhere in there you're going to need redundant CUs, and you're going to need to fit a CPU, IO, audio blocks etc. in there as well; I'm not exactly well versed on how chiplets shuffled things around for their design, but that's a lot to fit on one chip. The current generation consoles are about 360 mm², so if the price points are to stay around $399-499, the silicon budget has to stay around that size too. That's not including memory, storage and cooling costs.
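For a sanity check, the ~14.7 TF figure falls straight out of the CU count and boost clock, assuming GCN-style 64 FP32 lanes per CU and 2 FLOPs per FMA (a back-of-envelope sketch, nothing more):

```python
def peak_fp32_tflops(cus, clock_mhz, lanes_per_cu=64):
    """Peak FP32: CUs * 64 lanes * 2 FLOPs per FMA * clock."""
    return cus * lanes_per_cu * 2 * clock_mhz * 1e6 / 1e12

print(peak_fp32_tflops(64, 1800))  # MI60 at boost: ~14.75 TF (close to the 14.72 quoted above)
print(peak_fp32_tflops(40, 1800))  # a hypothetical 40 CU part at the same clock: ~9.2 TF
```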
 
What's the chance of a 400 mm² APU?
I think it's a distinct possibility.

Also with how long AMD have been sitting on Navi I have expectations that it will have some nice features and improvements. Hopefully
 
Well it sure as hell ain't gonna be Navi 12 which is only 40CU, so what else is out there? Navi 20 custom it is.
This is why I almost begged people to stop calling PS4 and XB1 GPUs 7850's or "custom 7870" and whatever other similar names were used.
It won't be any PC GPU, just like PS4/Pro/XB1/XBX wasn't, and it's not going to be a "custom Navi XYZ" either.

Whatever CU counts AMD uses or will use in their Navi GPUs are irrelevant; Sony can use whatever amount they choose (within the limitations of the architecture, which in Navi's case may or may not be 64 CUs).
 
Demand is only one part of the equation.

The other, and arguably far more important part of the equation is how much does it cost in terms of silicon real estate to enable performant RT that will blow people away?

Right now, with NV's RT hardware the lower bound is a 2070 that offers performance impacted RT that some people find impressive and others less so. It's certainly doing RT faster than non-RT hardware, but is it doing it fast enough for games at a quality level that is an overall improvement in the game, versus just an improvement in specific areas of a game at the expense of other areas of a game?

In two years, will acceptable quality and performance for games be available in a hardware component (not even a console, just a component of a console) for under 500 USD? Again, that's just the hardware component, not even talking about a full-blown console at this point.

And additionally will it be flexible enough that developers can adapt and use it in their games, regardless of the types of games they are making?

I'd argue that it isn't very likely. At some point in the future we all hope RT will be viable in a console, but next generation is not likely to be it for reasons of cost versus performance versus quality versus flexibility.

More than happy to be proven wrong, of course. :) But I just don't see it, at this point in time.

Regards,
SB
The first batch of games is using HRT just as an add-on. A game designed with RT in mind would be a different thing. Developers are still in the learning phase; you won't see crazy optimized algorithms for the time being.

Regarding the flexibility, what types of games couldn't use it?
 
To provide us some perspective the MI60 is
Process Size: 7 nm
Transistors: 13,230 million (nearly double the X1X's 7B at 16nm, but not quite)
Die Size: 331 mm²
GPU Clock: 1200 MHz
Boost Clock: 1800 MHz
Memory Clock: 1000 MHz, 2000 MHz effective

64 CUs -- 14.72 TF
Somewhere in there you're going to need redundant CUs, and you're going to need to fit a CPU, IO, audio blocks etc. in there as well; I'm not exactly well versed on how chiplets shuffled things around for their design, but that's a lot to fit on one chip. The current generation consoles are about 360 mm², so if the price points are to stay around $399-499, the silicon budget has to stay around that size too. That's not including memory, storage and cooling costs.
This chip also has a PCIe interface and a 4 chip HBM2 interface. Those won’t scale the same on 7nm as the core logic, so this chip size is a ceiling rather than a target for comparison purposes.

______________________________

Anandtech interview with Mark Papermaster is up. Some choice quotes below.

IC: With the FP units now capable of doing 256-bit on their own, is there a frequency drop when 256-bit code is run, similar to when Intel runs AVX2?

MP: No, we don’t anticipate any frequency decrease. We leveraged 7nm. One of the things that 7nm enables us is scale in terms of cores and FP execution. It is a true doubling because we didn’t only double the pipeline width, but we also doubled the load-store and the data pipe into it.

IC: Now the Zen 2 core has two 256-bit FP pipes, can users perform AVX512-esque calculations?

MP: At the full launch we’ll share with you exact configurations and what customers want to deploy around that.

IC: We expect lower core counts of Rome to eventually exist. This could mean fewer cores enabled per chiplet or just fewer chiplets entirely?

MP: We didn’t announce details, but you can see the physical modularity given the layout shared today. What you’re seeing today is eight 8-core clusters so that the easiest physical modularity is on a chiplet basis.

IC: With chiplets connected via IF on Rome, if a customer wanted a semi-custom design with different IP, such as a GPU or an AI block or an FPGA, would that be possible? (Say for example, a console?)

MP: Our semi-custom group is wide open to talk to customers to brainstorm! What excites me about the chiplet approach is that I think it’s going to disrupt the industry. It’s going to change the way the industry dreams of different configurations. Some might be right, and I can guarantee that someone will conjure up ten other ones that we didn’t think of! Honestly I think it is a disruptive force that is just nascent, just starting right now.

IC: With IF on 7nm, it offers 100 GB/s GPU to GPU connectivity. One of your competitors has something similar which allows both GPU-GPU and CPU-GPU connectivity. Currently with Rome, PCIe 4.0 has been announced from CPU to GPU but not IF. What has AMD’s analysis been on that CPU to GPU link?

MP: We haven’t announced applying the IF between the CPU and GPU and while it is certainly feasible, it is likely just dependent when workloads could truly leverage that protocol being applied, when the full coherency is required across both CPU and GPU. It is certainly feasible, but we haven’t announced it at this time.

My takeaways:
1. Seems to confirm an 8 core CCX with his wording.
2. Seems like some form of AVX512 support is probable; otherwise he would likely have denied it outright.
3. Seems like we can’t take a monolithic die for granted. We’ll just have to wait and see.
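If the "true doubling" described in the interview holds (two 256-bit FMA-capable pipes with matching load-store), per-core peak FP32 throughput works out to double Zen 1's. A quick sketch; the Zen 2 line is derived from the interview wording, not from a published spec:

```python
def fp32_flops_per_cycle(fma_pipes, simd_width_bits):
    """Per-core peak FP32 ops per cycle; an FMA counts as 2 FLOPs."""
    return fma_pipes * (simd_width_bits // 32) * 2

zen1 = fp32_flops_per_cycle(2, 128)  # 16 FLOPs/cycle (Zen 1, public figure)
zen2 = fp32_flops_per_cycle(2, 256)  # 32 FLOPs/cycle, if the pipes really are 256-bit wide
print(zen1, zen2)  # 16 32
```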
 
Demand is only one part of the equation.

The other, and arguably far more important part of the equation is how much does it cost in terms of silicon real estate to enable performant RT that will blow people away?
Not to be an incessant gnat, but imo, nothing is more important than demand.

No demand, no one should make the product. If you keep making a product when there is no demand, you're flopping your business.
For new products / features, marketing plays a large role to drive hype and demand for these products. Failure to do so will flop the product.
That being said, I think these products do more than rely on hype for sales; there are problems being solved here, in particular on the development side of things. While we can debate to death whether the performance is there yet for the lower-end RTX products, there's no doubt in my mind that developers see this as a win for them.

Right now, with NV's RT hardware the lower bound is a 2070 that offers performance impacted RT that some people find impressive and others less so. It's certainly doing RT faster than non-RT hardware, but is it doing it fast enough for games at a quality level that is an overall improvement in the game, versus just an improvement in specific areas of a game at the expense of other areas of a game?

In two years, will acceptable quality and performance for games be available in a hardware component (not even a console, just a component of a console) for under 500 USD? Again, that's just the hardware component, not even talking about a full-blown console at this point.

And additionally will it be flexible enough that developers can adapt and use it in their games, regardless of the types of games they are making?

I'd argue that it isn't very likely. At some point in the future we all hope RT will be viable in a console, but next generation is not likely to be it for reasons of cost versus performance versus quality versus flexibility.

More than happy to be proven wrong, of course. :) But I just don't see it, at this point in time.
How light interacts with the scene is arguably the most important part of graphics. You can't see it if there's no lighting ;). And what is reflected back to us is how we perceive vision in real life. So how ray tracing handles lighting is probably going to be closer to how we actually see than rasterization is.
Like you, I have no doubt in my mind that RT hardware performs ray tracing orders of magnitude faster than non-RT hardware.
So the silicon, imo, is justified in its usage; it's doing something that is not possible without it.

Whereas on the flip side, I do not agree with the idea that 30%-50% more silicon devoted to general compute would generate a graphical difference that something weaker could not achieve. As an OG Xbox One owner, I'll note that 30% of that silicon went towards ESRAM, not even anything special like ray tracing or ML, but memory. And I am confident that all Sony exclusives would be able to run on Xbox One at 900p. As fantastic as those titles are, and they are indeed fantastic, those teams are not able to convert that 30% additional power into something the Xbox One could not play or do. If the barrier between more compute and less compute is nothing more than a resolution drop, I'm okay with this.

We talk about compute-based solutions, SDFs, software-based solutions. That's great, and I know it will continue to evolve, and I applaud it. But that also means all those solutions would work on hardware dating back to the OG Xbox One, because compute is generic; the only limitation is compute power, and the load can be scaled using resolution.

But there is a third point that sits at the developer angle: a lot of developers don't use the latest and greatest technologies right away; they have to build and retrofit their existing engines to incorporate new technologies. Ubisoft has several engines: Anvil, Dunia, Snowdrop, and all of these have variants as well, Anvil 2.0, AnvilNext, AnvilNext 2.0. These engines are designed to build content that suits the type of game they are building. Thus all these techniques may work for some engines but not others. Some may need massive sprawling meshes, and others don't. But one thing I know is that they can incorporate DXR into all of these engines and it would work the way they want it to, because it doesn't have those hack limitations. Compute/rasterization does, and so they're spending tons and tons of time and labour figuring out which hack is going to get them what they need for the next scene/corridor, versus building DXR into the engine and letting DXR handle all their cases for them. That's not to say there won't be performance tuning, as there always will be for both.

It's unlikely there is enough power to incorporate every single RT solution there is, but that's okay because there is always a rasterization fallback. And where rasterization is doing a fantastic job for your game, you should let it stay rasterized and benefit from the speed. And where it's not doing so great, or you're having a difficult implementation issue (ie limitations due to choices) or where you really want to amp it up (say a cutscene) you have RT there for you.

There is at least the option, and that's what I'm seeing. But if next-gen consoles don't come with RT, then it's honestly going to be more of the same. Xbox is already equipped for GPU-driven rendering, and so will next gen be, I'm willing to bet, so I'm not really seeing a reason to discount the existing consoles if all the solutions are going to be even more compute-based.
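As a toy sketch of that "RT where it pays off, rasterization fallback everywhere else" idea from above; every effect name and millisecond cost here is made up purely for illustration:

```python
# Hypothetical per-effect frame-time costs in ms; upgrade effects to RT only while they fit the budget.
EFFECTS = {
    "reflections":       {"raster_ms": 1.0, "rt_ms": 3.5, "visual_gain": 3},
    "shadows":           {"raster_ms": 0.8, "rt_ms": 2.0, "visual_gain": 2},
    "ambient_occlusion": {"raster_ms": 0.6, "rt_ms": 1.5, "visual_gain": 1},
}

def plan_frame(budget_ms):
    """Start from the raster baseline, then switch effects to RT (biggest visual win first)
    for as long as the extra cost still fits in the frame budget."""
    spent = sum(e["raster_ms"] for e in EFFECTS.values())  # raster baseline cost
    plan = {}
    for name, e in sorted(EFFECTS.items(), key=lambda kv: -kv[1]["visual_gain"]):
        extra = e["rt_ms"] - e["raster_ms"]
        if spent + extra <= budget_ms:
            plan[name], spent = "ray traced", spent + extra
        else:
            plan[name] = "rasterized"
    return plan, spent

print(plan_frame(budget_ms=5.0))
```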

edit: https://www.gamewatcher.com/news/starfield-elder-scrolls-6-engine#
More justification for DXR here for me.
Even though the rumour originally was that they'd create a new engine for Starfield and Elder Scrolls VI, Bethesda have opted to stick with the same software they've been using, well, since Morrowind. Creation is also the Fallout 76 engine, and that's not necessarily an optimistic statement
---
"It allows us to create worlds really fast and the modders know it really well," says Bethesda's Todd Howard for Gamestar. "That lets us be efficient and we think it works best." Whether it's worth the jank, though, is up for us to decide as end consumers.
Can't mod your way through situational tailored hacks.
 
Anandtech interview with Mark Papermaster is up. Some choice quotes below.

My takeaways:
1. Seems to confirm an 8 core CCX with his wording.
2. Seems like some form of AVX512 support is probable; otherwise he would likely have denied it outright.
3. Seems like we can’t take a monolithic die for granted. We’ll just have to wait and see.
I don't see anything in there saying 8-core CCX and I seriously doubt they would go there.

Assuming it still uses a crossbar inside the CCX, 8 cores isn't really feasible as it would complicate the crossbar a LOT. Of course, if they went for an Intel-like mesh solution that could work, but 8 cores seems too few for that. A ring bus would probably be the only feasible solution inside the CCX if it had 8 cores.

Zeppelins were "8 core clusters" too, and they have 4-core CCXs.
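To put a number on the crossbar concern above: a full crossbar between n clients needs roughly n(n-1)/2 point-to-point links (ignoring the L3 slices that also hang off it), so doubling the core count is far more than double the wiring:

```python
def full_crossbar_links(clients):
    """Point-to-point links needed to fully connect n clients."""
    return clients * (clients - 1) // 2

print(full_crossbar_links(4), full_crossbar_links(8))  # 6 vs 28: ~4.7x the wiring
```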
 
This chip also has a PCIe interface and a 4 chip HBM2 interface. Those won’t scale the same on 7nm as the core logic, so this chip size is a ceiling rather than a target for comparison purposes.
Need to fix your font colours! It's all black on black on dark mode
 
To provide us some perspective the MI60 is
Process Size: 7 nm
Transistors: 13,230 million (nearly double the X1X's 7B at 16nm, but not quite)
Die Size: 331 mm²
GPU Clock: 1200 MHz
Boost Clock: 1800 MHz
Memory Clock: 1000 MHz, 2000 MHz effective

64 CUs -- 14.72 TF
Somewhere in there you're going to need redundant CUs, and you're going to need to fit a CPU, IO, audio blocks etc. in there as well; I'm not exactly well versed on how chiplets shuffled things around for their design, but that's a lot to fit on one chip. The current generation consoles are about 360 mm², so if the price points are to stay around $399-499, the silicon budget has to stay around that size too. That's not including memory, storage and cooling costs.

The MI60, though, is slightly problematic for comparisons, because it's designed for a very different market than consoles and comes with a lot of extra baggage that the next-gen consoles won't need, like FP64 support, and it's mostly a port of a 14nm design. A new 7nm design from AMD can certainly put those TFs at FP32 precision into a smaller die. The real question, imo, is how much they can shave off the power consumption, because a 300W GPU isn't going to happen. Personally I wouldn't expect more than 12 TF unless AMD has really found something impressive, and I see that more as an upper limit than a lower bound.
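Flipping the usual peak-FP32 arithmetic around gives the clocks a 12 TF part would need at different CU counts; pure back-of-envelope, assuming 64 FP32 lanes per CU:

```python
def clock_for_tflops(target_tf, cus, lanes_per_cu=64):
    """GPU clock (MHz) needed to reach a peak-FP32 target with a given CU count."""
    return target_tf * 1e12 / (cus * lanes_per_cu * 2) / 1e6

for cus in (48, 56, 64):
    print(f"{cus} CUs -> ~{clock_for_tflops(12, cus):.0f} MHz for 12 TF")
```

Roughly: 64 CUs need about 1465 MHz, 56 CUs about 1675 MHz, 48 CUs about 1955 MHz, which is where the power-consumption question really bites.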
 
I almost begged people to stop calling PS4 and XB1 GPUs 7850's or "custom 7870"... It won't be any PC GPU, just like PS4/Pro/XB1/XBX wasn't
Though these are custom dies, it's just convenient, for the sake of comparison, to assume that Xbox One and PS4 are based on a GCN2 Bonaire core comparable to a Radeon HD 7700/7800, and Xbox One X and PS4 Pro are based on a GCN4 Ellesmere core with Radeon RX 470/480 performance levels.

ain't gonna be Navi 12 which is only 40CU
I'm fine with any number as long as Navi is able to pull 12 TFLOPs.
 
Though these are custom dies, it's just convenient, for the sake of comparison, to assume that Xbox One and PS4 are based on a GCN2 Bonaire core comparable to a Radeon HD 7700/7800, and Xbox One X and PS4 Pro are based on a GCN4 Ellesmere core with Radeon RX 470/480 performance levels.
But it leads to posts like the one you quoted, too, where people think that it has to be based on a specific PC GPU.
 
https://wccftech.com/amds-david-wan...-dxr-until-its-offered-in-all-product-ranges/
This is unfortunate for my pro-RT stance on consoles. This will pare back the probabilities significantly.
If the R&D is not done by now, I'm not sure it could be delivered in time for launch.

“For the time being, AMD will definitely respond to Direct Raytracing,” for the moment we will focus on promoting the speed-up of offline CG production environments centered on AMD’s Radeon ProRender, which is offered free of charge ….. utilization of ray tracing games will not proceed unless we can offer ray tracing in all product ranges from low end to high end,” – David Wang, AMD in an interview to 4Gamer

I just went from 45% probable, to < 10%.
 
To provide us some perspective the MI60 is
Process Size: 7 nm
Transistors: 13,230 million (nearly double the X1X's 7B at 16nm, but not quite)
Die Size: 331 mm²
GPU Clock: 1200 MHz
Boost Clock: 1800 MHz
Memory Clock: 1000 MHz, 2000 MHz effective

64 CUs -- 14.72 TF
Somewhere in there you're going to need redundant CUs, and you're going to need to fit a CPU, IO, audio blocks etc. in there as well; I'm not exactly well versed on how chiplets shuffled things around for their design, but that's a lot to fit on one chip. The current generation consoles are about 360 mm², so if the price points are to stay around $399-499, the silicon budget has to stay around that size too. That's not including memory, storage and cooling costs.

Should we consider the possibility that Sony or MS could eat more initial losses for much better next-gen hardware compared to the PS4 and One, but not as bad as Sony's losses in the early phases of the PS3?

PS4 estimated loss / h/w sold = few dollars.
PS3 estimated loss = 200-300 dollars.

Perhaps competition and window of respective releases might be a huge factor for this too.
 
Should we consider the possibility that Sony or MS could eat more initial losses for much better next-gen hardware compared to the PS4 and One, but not as bad as Sony's losses in the early phases of the PS3?

PS4 estimated loss / h/w sold = few dollars.
PS3 estimated loss = 200-300 dollars.

Perhaps competition and window of respective releases might be a huge factor for this too.
I doubt it. Power doesn't sell consoles in the same fashion it used to, and this will likely be even harder next generation. 4K and VR are already marketed, HDR is marketed. What else can you really market from a hardware perspective?

Content, services, functionality, etc. The more that they can get for less is the major driver. Leading in technical prowess helps with mindshare, but a smaller portion of the purchasing group is willing or set up to be able to afford it.
 
https://wccftech.com/amds-david-wan...-dxr-until-its-offered-in-all-product-ranges/
This is unfortunate for my pro-RT stance on consoles. This will pare back the probabilities significantly.
If the R&D is not done by now, I'm not sure it could be delivered in time for launch.



I just went from 45% probable, to < 10%.

utilization of ray tracing games will not proceed unless we can offer ray tracing in all product ranges from LOW END to high end,”

So it can still come with the release of the next-gen consoles, unless they're releasing early next year? *crosses fingers*

Maybe AMD's plan is to launch HRT/ML dedicated hardware across everything from laptops to desktops to APUs in a somewhat short timeframe; then they could have a good chance of being the preferred way to program HRT/ML for games?
 
I doubt it. Power doesn't sell consoles in the same fashion it used to, and this will likely be even harder next generation. 4K and VR are already marketed, HDR is marketed. What else can you really market from a hardware perspective?

Content, services, functionality, etc. The more that they can get for less is the major driver. Leading in technical prowess helps with mindshare, but a smaller portion of the purchasing group is willing or set up to be able to afford it.

Makes sense.

I bet both companies are watching whether the X gains traction, but I guess most would say it's a non-factor for the majority. I'm interested to see how much market the X gains if Sony gives no hints of the PS5 to whet the appetite of prospective console buyers over the next 12 months.

On a side note, I have a feeling this gen was more comparison-heavy than last gen, and it's going to be worse next gen.

PS4 being on top in power could be more incidental, as it's their first time in that position.
 