The next Nintendo hardware? [2021-09]

Yes, Xbox has one generation of easy(ier) native backwards compatibility, while the other generations require additional software tools along with VMs. VMs by themselves won't solve everything, but they can make things easier than if they weren't used.

I disagree on the scope of older compatible games; they're limited by legal restrictions more than technical ones.
Even Xbox One games run in a VM on Series consoles. They also run in a VM on the Xbox One itself. So can Series consoles really run One games natively if everything is virtualized?
 



The difference being that AMD has now turned over a new leaf. Ever since GCN, AMD has maintained binary compatibility for their GPUs just like they do for their x86 CPUs. Nvidia has yet to show that they're willing to do the same. Unless Nvidia drops their future plans for introducing incompatible architectures or assigns a dedicated hardware design team to specifically extend the Maxwell architecture, their partner's options for retaining binary compatibility remain limited ...

I see. But how hard would it be to make a hardware module inside the GPU chip that is hard-wired to translate Switch instructions to the Switch 2 instruction set? Isn't that pretty much what every modern x86 processor does internally, and it's said to take negligible area and power?
 



The difference being that AMD has now turned over a new leaf. Ever since GCN, AMD has maintained binary compatibility for their GPUs just like they do for their x86 CPUs. Nvidia has yet to show that they're willing to do the same. Unless Nvidia drops their future plans for introducing incompatible architectures or assigns a dedicated hardware design team to specifically extend the Maxwell architecture, their partner's options for retaining binary compatibility remain limited ...

I mean, I use an emulator for Switch.


They all seem to emulate it really well, at higher resolutions and better performance in some games.
 
The Cortex A78 is 250% faster than Jaguar?

Unless we're talking about 3GHz A78 vs. 1.6GHz Jaguar, I'm pressing X to doubt.
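
For what it's worth, a quick back-of-envelope on that claim. The 3 GHz / 1.6 GHz clocks are the ones from the post above; the rest is just arithmetic, not a measurement of either core:

```python
# Back-of-envelope: how much per-clock (IPC) uplift the A78 would need over
# Jaguar for a given overall speedup, at the clocks mentioned above.
# The clock figures come from the post; everything else is illustrative.

a78_clock_ghz = 3.0      # hypothetical A78 clock from the post
jaguar_clock_ghz = 1.6   # PS4-era Jaguar clock from the post

clock_ratio = a78_clock_ghz / jaguar_clock_ghz   # 1.875x from frequency alone

# "250% faster" could mean 2.5x the performance, or 1 + 2.5 = 3.5x.
for label, overall_speedup in [("2.5x (250% of)", 2.5), ("3.5x (250% faster)", 3.5)]:
    ipc_ratio_needed = overall_speedup / clock_ratio
    print(f"{label}: needs ~{ipc_ratio_needed:.2f}x the per-clock performance")

# Output:
# 2.5x (250% of): needs ~1.33x the per-clock performance
# 3.5x (250% faster): needs ~1.87x the per-clock performance
```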
 
I see. But how hard would it be to make a hardware module inside the GPU chip that is hard-wired to translate Switch instructions to the Switch 2 instruction set? Isn't that pretty much what every modern x86 processor does internally, and it's said to take negligible area and power?

There'd still be tons of duplicated or redundant logic built into the chip if you want to run at native speed. x86 processors still have quite a bit of redundant logic inside them just to remain compatible with legacy instruction encodings. Intel recently started crippling (page 463) MMX instructions, and some older programs saw significant slowdowns because of that design decision ...

You're probably better off just including the entire GPU unchanged, as is, on the die (like AMD/Nintendo did for their last system) rather than creating a monstrosity that is binary compatible with two or more architectures, which would still carry tons of bloat depending on your performance targets for legacy applications. Sharing logic may seem like a noble idea at first, but it makes validation much harder, and the area savings aren't all that convincing if the implementation has to be bit-for-bit exact with the previous one. At that point, extending the native Maxwell architecture logic would entail the same amount of design work ...
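
As a purely software-side analogy of that problem (the mnemonics below are made up and bear no relation to real Maxwell or later ISAs): a 1:1 opcode-translation table only covers the instructions whose new implementations are already bit-exact, and anything else needs either duplicated legacy logic or a slower emulation path.

```python
# Toy software analogy of the translation problem discussed above.
# The "opcodes" and behaviours here are invented; real GPU ISAs are far more
# involved. The point: a lookup-table translation only works when the new
# hardware produces bit-identical results for the old instruction.

OLD_TO_NEW = {
    "FADD_OLD": "FADD_NEW",   # fine: same semantics, just a new encoding
    "FMUL_OLD": "FMUL_NEW",
    # "RSQ_OLD" (fast reciprocal sqrt) deliberately has no entry: if the new
    # unit rounds differently, a table translation would silently change
    # results unless extra fix-up logic is bolted on.
}

def translate(old_instruction: str) -> str:
    """Translate one old-ISA mnemonic to the new ISA, or flag it."""
    try:
        return OLD_TO_NEW[old_instruction]
    except KeyError:
        # In hardware this is where you'd need either duplicated legacy
        # logic (area cost) or a multi-op emulation sequence (speed cost).
        return f"EMULATE({old_instruction})"

if __name__ == "__main__":
    for op in ["FADD_OLD", "RSQ_OLD"]:
        print(op, "->", translate(op))
```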

I mean, I use an emulator for Switch.


They all seem to emulate it really well, at higher resolutions and better performance in some games.

The author of that video mostly agrees with Michael Scire's assessment regarding the feasibility of backwards compatibility on a potential successor. He too doesn't think it's possible to retain backwards compatibility without the Maxwell architecture ...
 
The author of that video mostly agrees with Michael Scire's assessment regarding the feasibility of backwards compatibility on a potential successor. He too doesn't think it's possible to retain backwards compatibility without the Maxwell architecture ...

They can just stick the GPU part of the X1 onto the new SoC, to be used for a Switch compatibility mode. They've done it with the Wii U before.
 
They can just stick the GPU part of the X1 onto the new SoC, to be used for a Switch compatibility mode. They've done it with the Wii U before.

How much area would they have left to make any significant performance improvements? With Latte, integrating the Hollywood GPU on die didn't come at a significant cost and worked out to roughly ~1/8 of the total die area. By including the Tegra X1 GPU in the new design, we could be talking about as much as 1B+ transistors going down the drain to maintain compatibility, transistors which could've been used purely for performance/efficiency enhancements or for cost savings. A more advanced process technology would be needed to circumvent the die area consumption problem, and this would be an even bigger deal for mobile SoCs since they need every last transistor they can get to meet higher power efficiency targets ...

If the Tegra X1 GPU were ported to the 7nm process, it could take as much as ~1/4 of a 100-120mm^2 total die ... (and this is not counting how much more area would be needed to improve the CPU as well)
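
Putting the numbers from this post together (the 100-120mm^2 die size range, the ~1/4 fraction and the 1B+ transistor figure are all taken from the post itself, not independent estimates):

```python
# Rough area/transistor bookkeeping using the figures quoted above.
# All inputs come from the posts themselves and are speculative.

total_die_mm2 = (100, 120)     # assumed total die size range at 7nm
legacy_fraction = 1 / 4        # claimed share for a ported Tegra X1 GPU
legacy_transistors = 1e9       # the "1B+ transistors" figure from the post

for die in total_die_mm2:
    legacy_area = die * legacy_fraction
    print(f"{die} mm^2 die -> ~{legacy_area:.0f} mm^2 reserved just for the X1 GPU block")

print(f"~{legacy_transistors / 1e9:.0f}B+ transistors spent purely on the compatibility block")

# Output:
# 100 mm^2 die -> ~25 mm^2 reserved just for the X1 GPU block
# 120 mm^2 die -> ~30 mm^2 reserved just for the X1 GPU block
# ~1B+ transistors spent purely on the compatibility block
```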
 
How much area would they have left to make any significant performance improvements? With Latte, integrating the Hollywood GPU on die didn't come at a significant cost and worked out to roughly ~1/8 of the total die area. By including the Tegra X1 GPU in the new design, we could be talking about as much as 1B+ transistors going down the drain to maintain compatibility, transistors which could've been used purely for performance/efficiency enhancements or for cost savings. A more advanced process technology would be needed to circumvent the die area consumption problem, and this would be an even bigger deal for mobile SoCs since they need every last transistor they can get to meet higher power efficiency targets ...

If the Tegra X1 GPU were ported to the 7nm process, it could take as much as ~1/4 of a 100-120mm^2 total die ... (and this is not counting how much more area would be needed to improve the CPU as well)

If that is what needs to be done, sacrificing a bit of performance/efficiency for maintaining full backward compatibility would be a worthy tradeoff. Though I expect they can come up with a far more elegant solution.
I honestly think this potential GPU incompatibility issue is way overblown tbh. :no:
 
If the Tegra X1 GPU were ported to the 7nm process, it could take as much as ~1/4 of a 100-120mm^2 total die ... (and this is not counting how much more area would be needed to improve the CPU as well)

If Nintendo needed to dedicate die space to the Maxwell architecture, I'm not so sure they wouldn't just stick with that architecture. Yes, I know that would be extremely unpopular here, but if a more elegant solution can't be found or is far more expensive to implement, I'm not convinced they wouldn't stick with the same architecture. The reduced performance from the older architecture would be offset by not having any wasted space for BC. The Tegra X1 also had quite a bit of useless die space for non-gaming-related hardware. On the 7nm process, something like 1024 CUDA cores would fit within the die space limitations: four Tegra X1s duct-taped together. I can't shake the feeling that this is not only possible but, given Nintendo's history, actually plausible.
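
A quick sanity check of the "four X1s duct-taped together" idea. The 256 CUDA cores are the stock Tegra X1 GPU configuration; the 768 MHz clock is only an assumption borrowed from the docked Switch, not a prediction for any successor:

```python
# Quick check of the "1024 CUDA cores = four Tegra X1s" figure above.
# Core count for the stock X1 GPU is 256; the 768 MHz clock is an assumption
# taken from the docked Switch.

x1_cuda_cores = 256
proposed_cores = 1024
gpu_clock_ghz = 0.768          # assumed docked-Switch-style clock

def fp32_gflops(cores: int, clock_ghz: float) -> float:
    # 2 FP32 ops per core per clock (fused multiply-add)
    return cores * 2 * clock_ghz

print(proposed_cores / x1_cuda_cores, "x the X1's core count")                   # 4.0
print(f"{fp32_gflops(x1_cuda_cores, gpu_clock_ghz):.0f} GFLOPS (X1-like)")       # ~393
print(f"{fp32_gflops(proposed_cores, gpu_clock_ghz):.0f} GFLOPS (1024 cores)")   # ~1573
```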
 
If Nintendo needed to dedicate die space to the Maxwell architecture, I'm not so sure they wouldn't just stick with that architecture. Yes, I know that would be extremely unpopular here, but if a more elegant solution can't be found or is far more expensive to implement, I'm not convinced they wouldn't stick with the same architecture. The reduced performance from the older architecture would be offset by not having any wasted space for BC. The Tegra X1 also had quite a bit of useless die space for non-gaming-related hardware. On the 7nm process, something like 1024 CUDA cores would fit within the die space limitations: four Tegra X1s duct-taped together. I can't shake the feeling that this is not only possible but, given Nintendo's history, actually plausible.

It's not just the reduced performance/efficiency that would be contentious; you potentially might not even get new hardware features like tensor cores or RT, depending on how much or how little design work went into souping up Maxwell. I also doubt Nvidia would let their partner stick with the older architecture, because they eventually plan on dropping driver/software support for Maxwell altogether. It becomes less appealing for Nvidia too, since Nintendo would effectively stop subsidizing the development of their future products, so there's even less incentive for Nvidia to reassign their hardware design team to work on Maxwell again. How plausible a scenario do you think it is for Nvidia to design a chip only to end up never giving driver support for it?
 
It's not just the reduced performance/efficiency that would be contentious; you potentially might not even get new hardware features like tensor cores or RT, depending on how much or how little design work went into souping up Maxwell. I also doubt Nvidia would let their partner stick with the older architecture, because they eventually plan on dropping driver/software support for Maxwell altogether. It becomes less appealing for Nvidia too, since Nintendo would effectively stop subsidizing the development of their future products, so there's even less incentive for Nvidia to reassign their hardware design team to work on Maxwell again. How plausible a scenario do you think it is for Nvidia to design a chip only to end up never giving driver support for it?

Driver support is a PC thing, so it's not really relevant in the closed-box scenario of a console. Nvidia isn't exactly pushing to get back into mobile, so whatever they decide to do with Nintendo is going to be a one-off, and they will be setting up the development environment. Using a very mature architecture means there is very little work for Nvidia that hasn't already been done. Even the software development tools they developed for the Switch would still largely be relevant; for developers, it would be almost identical to developing on Switch except with a 4x performance boost. Tensor cores are not bound to any specific GPU architecture, but even if they are out of the question, Nintendo has a patent filed for some sort of proprietary DLSS-like software, and perhaps it would be more similar to AMD's FidelityFX (a rough sketch of that kind of spatial pass follows at the end of this post). I'm not convinced that Nintendo will have super high demands for performance compared to retaining backwards compatibility.

BC is not a real feature if it only works with titles which the devs took the time to patch.

And this could be the alternative solution. Nintendo could choose to make sure their most popular Switch titles are playable on the Switch 2, basically going down the route that Microsoft did with Xbox 360 BC on the Xbox One. They could even make this part of the online subscription service.
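
For the sketch mentioned above: this is only a generic spatial "upscale and sharpen" pass to show the flavour of a FidelityFX-style approach. It is not the FSR algorithm and has nothing to do with whatever Nintendo's patent actually describes; it just illustrates how little machinery a spatial upscaler needs compared to a learned, DLSS-style one.

```python
# Crude illustration of a purely spatial "upscale + sharpen" pass.
# Not FSR, not Nintendo's patent; a toy example for scale only.
import numpy as np

def upscale_and_sharpen(img: np.ndarray, scale: int = 2, amount: float = 0.5) -> np.ndarray:
    """Nearest-neighbour upscale followed by an unsharp-mask style sharpen."""
    # 1) spatial upscale (nearest-neighbour for brevity; real code would use
    #    an edge-aware filter here)
    up = img.repeat(scale, axis=0).repeat(scale, axis=1)

    # 2) cheap blur via a 3x3 box filter (edges handled by padding)
    padded = np.pad(up, 1, mode="edge")
    blur = sum(padded[y:y + up.shape[0], x:x + up.shape[1]]
               for y in range(3) for x in range(3)) / 9.0

    # 3) unsharp mask: add back some of the high-frequency detail
    return np.clip(up + amount * (up - blur), 0.0, 1.0)

if __name__ == "__main__":
    low_res = np.random.rand(180, 320)          # stand-in for a 320x180 frame
    high_res = upscale_and_sharpen(low_res, 2)  # 640x360 output
    print(low_res.shape, "->", high_res.shape)
```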
 
Driver support is a PC thing, so it's not really relevant in the closed-box scenario of a console. Nvidia isn't exactly pushing to get back into mobile, so whatever they decide to do with Nintendo is going to be a one-off, and they will be setting up the development environment. Using a very mature architecture means there is very little work for Nvidia that hasn't already been done. Even the software development tools they developed for the Switch would still largely be relevant; for developers, it would be almost identical to developing on Switch except with a 4x performance boost. Tensor cores are not bound to any specific GPU architecture, but even if they are out of the question, Nintendo has a patent filed for some sort of proprietary DLSS-like software, and perhaps it would be more similar to AMD's FidelityFX. I'm not convinced that Nintendo will have super high demands for performance compared to retaining backwards compatibility.

Sure, driver support doesn't apply to consoles, but if Nvidia were contemplating reusing the new design for, say, embedded applications, then it wouldn't be an attractive proposition for them if it meant extending driver support to more architectures. All hardware vendors would prefer to drop driver code for old hardware as soon as they can rather than maintain it. I imagine a similar thought process was going on at AMD when they were designing Latte, since extending the existing Hollywood GPU design wasn't worth pulling their design team off more important projects that could be used for the future, so they just included it on the die as another separate GPU ...

I would think that being able to reuse some of the work would factor into the hardware vendor's design decisions ...
 
The Cortex A78 is 250% faster than Jaguar?
It's an unqualified comparison. Stronger in what way? Peak possible throughput? Flops per watt? Real-world attained workload (per watt)? The latter seems highly plausible and attainable given the limitations of the PS4's Jaguars.
 



The difference being that AMD has now turned over a new leaf. Ever since GCN, AMD has maintained binary compatibility for their GPUs just like they do for their x86 CPUs. Nvidia has yet to show that they're willing to do the same. Unless Nvidia drops their future plans for introducing incompatible architectures or assigns a dedicated hardware design team to specifically extend the Maxwell architecture, their partner's options for retaining binary compatibility remain limited ...

And why wouldn't Nvidia assign a team and make a custom uarch for Nintendo? Leather Jacket went on record at an investor call saying he expected this partnership to last two decades. It seems unlikely Nintendo would sign on for 20 years without thinking about how to future-proof the platform.

Nvidia burned their bridges with Sony and MS on the first attempt, so this would be the first time they're responsible for the successor.
 
Nvidia burned their bridges with Sony and MS on the first attempt, so this would be the first time they're responsible for the successor.
I've not read anything to indicate that Nvidia burned any bridges with Sony over the PS3's RSX. Providing a chip - at what was relatively the last minute - was pretty good. I think it was simply the case that AMD was the better prospect for the PS4, and now that AMD are in, they're probably in for the long haul, given the way nobody is keen on jumping CPU and/or GPU architectures every generation.

They certainly burnt bridges with Microsoft, and outside of the console space, with Apple. Nvidia seem a bit prone to pissing off partners.
 
It's an unqualified comparison. Stronger in what way? Peak possible throughput? Flops per watt? Real-world attained workload (per watt)? The latter seems highly plausible and attainable given the limitations of the PS4's Jaguars.
If we look at gaming performance comparisons in PC benchmarks, I'd say "Stronger in Real Absolute Gaming Performance", which historically has been "single and multi-threaded floating point performance".
It's the only comparison that makes sense in that context IMO, because the PS4/XBOne's Jaguar CPU performance and power consumption are set in stone.

Here's the tweet I commented on, just for context:
 