Nintendo Switch Technical discussion [SOC = Tegra X1]

Is the API actually known? It's an Nvidia API developed for the Switch, right? I'd guess it's pretty close to Vulkan but tweaked to exploit the Nvidia hardware as well as possible.

BTW, where does CUDA fit in right now? Can it help do some nice things very efficiently on a gaming platform?
Judging by the conformance tests, Switch supports Vulkan, OpenGL and NVN.

Context switches are expensive, so CUDA is unlikely to be feasible or needed. Compute shaders could be used instead.
 
Nvidia said the Tegra X1 is custom, so please don't shoot the messenger.

People should not judge a system by its early ports: look at the PlayStation3.

Switch graphics can and will only go up from here

Not trying to shoot the messenger, but the only thing custom about it is the downclocks. We have die shots that confirm it's the same chip as in the 2015 Shield TV.

The PlayStation 3 is a silly comparison. Games have to be made for handheld mode, which only has ~190 GFLOPS, so what graphics improvement are you really expecting when developers have already maxed out similar specs on 360/PS3? I see little room for improvement, other than being 1080p when docked and slightly better AA.
 
You undersell what new GPU architectures can achieve via programmability and efficiencies. 190 Gflops of 2015 Maxwell GPU can make prettier pictures than 190 Gflops of 2005 G7x GPU. Switch graphics can and will improve over launch titles, especially over games that clearly aren't using it to its best like Splatoon 2.
 
I get that people wanted better specs, but I'm not too pessimistic about how well a good port can turn out. I was very impressed seeing Rise of the Tomb Raider running on the Xbox 360, and if that's an indication of how Switch versions can turn out, it'll be fine for me.

Even for a game like GTA V, the PC version can still run on an 8800GT.
 
You undersell what new GPU architectures can achieve via programmability and efficiencies. 190 Gflops of 2015 Maxwell GPU can make prettier pictures than 190 Gflops of 2005 G7x GPU. Switch graphics can and will improve over launch titles, especially over games that clearly aren't using it to its best like Splatoon 2.

I have no doubt it can do better graphics, but I'm not expecting a big difference. It's going to take games with big budgets and a talented studio to noticeably beat Uncharted 3, God of War, Halo 4 and Killzone 3 graphics in handheld mode.
 
Depends what you mean by 'look better'. In terms of art style, execution, image quality, etc., titles may not be as big and luxurious as the AAAs you mention, but they could still be very pretty, and prettier than what the same budget could have achieved on PS3; by a long chalk, given PS3's complexities.
 
The tricky bit is that the games Naughty Dog, SSM and Guerrilla make are aesthetically very different from Nintendo's. Maybe the next Metroid will make for a more apples-to-apples comparison :)

Comparing games with a more cartoon aesthetic, the Wii U had already shown better results than the PS3, usually at double the framerate. Not in terms of poly counts and the like, where they're pretty comparable, but in terms of shading, lighting, textures and such. When Nintendo said they maxed out the Wii U with Mario Kart 8, I can believe it. There's always more you can squeeze from a console, but they probably couldn't have got much more out of it. Also consider that it's still locked at 60fps in two-player split screen.

I'm sure they'll make some very impressive games on Switch as time goes on. As if Wii U ports are the best it can do ;)

The gap between 7th gen and Switch should be very palpable as time goes on, especially for 60fps gaming, since bandwidth won't be as much of an issue. I'd be shocked if Splatoon and Mario were 720p while docked by launch.
 
I don't get your "bandwidth won't be as much of an issue". The bandwidth is pretty low on the Switch, and Maxwell's compression techniques are not magic... You can even make the case that the Wii U, with its 32 MB of eDRAM, is in a better situation bandwidth-wise...
 
You undersell what new GPU architectures can achieve via programmability and efficiencies. 190 Gflops of 2015 Maxwell GPU can make prettier pictures than 190 Gflops of 2005 G7x GPU. Switch graphics can and will improve over launch titles, especially over games that clearly aren't using it to its best like Splatoon 2.

The Switch launched at the very last minute of the "Fiscal 2016" that was promised to investors back in 2015. The reason Nintendo gave for the Switch coming so late, even though the TX1 has been in mass production for two years, is that they wanted time to get a proper launch lineup.
At the same time, the reason some here have given for Nintendo going with the TX1 is that Nintendo wanted to use "proven hardware". Well there's never been a console using hardware more proven than a 2 year-old chip AFAIK, at least not for the past 20 years.

So Nintendo is using 2+ year-old hardware; on top of that, the console was released as late as Nintendo could possibly manage, the launch lineup is tiny, and the games coming much later this year from Nintendo don't look fantastic either.
Using 2 year-old hardware on a very late launch didn't result in graphics that are substantially better than what we've seen on the Shield TV. So how long until these improvements come to light?

What if BotW already is very close to the very best the handheld Switch can do?


I'm just saying you can't have the cake (Nintendo releasing late and using old hardware to take full advantage of said hardware on day one) and eat it too (graphics on Switch games improving dramatically over time).
 
I don't get your "bandwidth won't be as much of an issue". The bandwidth is pretty low on the Switch, and Maxwell's compression techniques are not magic... You can even make the case that the Wii U, with its 32 MB of eDRAM, is in a better situation bandwidth-wise...
It's an issue; it should have had a 128-bit bus. What I said was: with Nintendo prioritizing 60fps, bandwidth will be less of an issue for them. Relative to a 60fps console game, you need more bandwidth for the extra detail you can achieve at 30fps. At least that's my understanding; anyone chime in if I'm wrong.

I don't think I'm mistaken; there's evidence of this already.

On one hand we have Zelda which, looking at the GPU clocks when docked, should definitely be capable of going from 720p to 1080p, but it only hits 900p because of all that extra detail like the grass, and with more frame dips. Meanwhile, Mario Kart 8 is a 60fps game and does go from 720p to 1080p while docked, and it irons out the frame pacing of the Wii U version too.
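
As a rough illustration (my own numbers; the ~25.6 GB/s LPDDR4 figure is an assumption on my part, not something from this thread), the point is simply that a 60fps target halves the memory traffic you can afford per frame:

```cpp
// Back-of-envelope only: per-frame bandwidth budget at 30 vs 60 fps.
// The 25.6 GB/s total is an assumed LPDDR4 figure (shared between CPU and
// GPU), used just to show that a 60fps target halves the bytes you can
// move per frame, leaving less headroom for bandwidth-heavy detail.
#include <cstdio>

int main() {
    const double total_gb_per_s = 25.6;          // assumption, not from the thread
    const int frame_rates[] = {30, 60};
    for (int fps : frame_rates) {
        double mb_per_frame = total_gb_per_s * 1024.0 / fps;
        std::printf("%2d fps -> ~%.0f MB of memory traffic available per frame\n",
                    fps, mb_per_frame);
    }
    return 0;
}
```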
 
197 FP32 GFlops at 384 MHz to be precise.

Right I forgot about the GPU clock boost!

It's an issue; it should have had a 128-bit bus. What I said was: with Nintendo prioritizing 60fps, bandwidth will be less of an issue for them. Relative to a 60fps console game, you need more bandwidth for the extra detail you can achieve at 30fps. At least that's my understanding; anyone chime in if I'm wrong.

So what you mean is that by setting 60fps as the target, developers will be less ambitious with graphics, and thus there will be less demand for memory bandwidth?
 
197 FP32 GFLOPS at 384 MHz to be precise, and 394 FP16 GFLOPS in handheld mode. A recent Frostbite presentation mentions ~30% perf gains with FP16 on a hand-tuned checkerboard resolve shader, so at least some non-memory-bound shaders might benefit a lot from the 2x FP16 math rate. Nobody stops devs from using a much lower resolution than 720p in handheld mode; with such a small screen, checkerboard rendering should be perfectly doable, and even MSAA coarse shading should work fine.
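
For reference, a quick sketch of where those numbers come from, assuming the commonly reported 256-CUDA-core TX1 configuration and the widely reported 768 MHz docked clock (neither is stated in this post):

```cpp
// Sketch of the FLOPS arithmetic behind the 197 / 394 / 786 GFLOPS figures.
// Assumes the commonly reported TX1 config: 256 CUDA cores, FMA counted as
// 2 FLOPs per core per clock, and 2x throughput for packed FP16.
#include <cstdio>

int main() {
    const int    cores       = 256;
    const double clock_ghz[] = {0.384, 0.768};   // handheld / docked GPU clocks
    const char*  label[]     = {"handheld @ 384 MHz", "docked @ 768 MHz"};

    for (int i = 0; i < 2; ++i) {
        double fp32 = cores * 2 * clock_ghz[i];  // GFLOPS, FP32 FMA
        double fp16 = fp32 * 2.0;                // packed FP16 double rate
        std::printf("%s: ~%.0f FP32 / ~%.0f FP16 GFLOPS\n", label[i], fp32, fp16);
    }
    return 0;
}
```

(The 394 figure quoted in the thread is just 197 x 2 after rounding.)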

Exactly, and it's baffling to still see people referencing 150 GFLOPS for Switch. As pointed out above, it's actually 197 GFLOPS at full precision; we're talking about a ~25% short-change here. Not only that, but the complete dismissal of half-precision shaders is getting old as well. We know from posts sebbbi made in the past that many shaders work great with half precision, so the 394 GFLOPS portable and 786 GFLOPS docked half-precision figures are worth acknowledging. Not only does Maxwell accomplish more per flop than the PS3/360/Wii U, but it also has the ability to use half-precision shaders. This isn't a silver bullet, but it is a significant advantage the processor has over the previous-generation consoles.

During the Nintendo presentation at GDC, they mentioned that the port of Zelda BotW to Switch was done with no optimization. Again, no optimization, just a straightforward port, and they were still able to up the resolution to 900p docked. Just like early PS4/X1 cross-generation games, these early Switch games are not indicative of the Switch's maximum potential.

With that said, Switch is still a good deal weaker than the PS4/X1, so expectations of what a console roughly twice as powerful as a Wii U/360 can do need to be grounded. It took quite a while for PS4/X1 games to really look next-gen compared to 360/PS3, and those consoles are significantly more powerful than Switch. In terms of performance it is more like a half-generation leap over the 360/PS3/Wii U, and that just isn't going to bring the same level of improvement that the PS4/X1 have delivered. People often criticized those consoles for not offering a true next-gen experience, and they are several times more powerful than Switch. So yeah, I think most games will ultimately look like buffed-up 360/Wii U games. That's great for the small screen, and less impressive on the big screen. Nintendo's first-party games already looked rather nice on Wii U, so I am confident they will be even more refined on Switch going forward.
 
We know from posts sebbbi made in the past that many shaders work great with half precision, so the 394 GFLOPS portable and 786 GFLOPS docked half-precision figures are worth acknowledging. Not only does Maxwell accomplish more per flop than the PS3/360/Wii U, but it also has the ability to use half-precision shaders. This isn't a silver bullet, but it is a significant advantage the processor has over the previous-generation consoles.

Tegra X1 can use half precision, but as far as we know Maxwell can't use async compute in any productive way. sebbbi said around 70% of the pixel shaders in his games could be done in FP16. How much real-world performance does that translate into? How does it compare to Mark Cerny's claim of a 50% GPU performance boost from using async compute on GCN GPUs?
Besides, there are limits to using FP16 on Maxwell 2.5; you can't assume it'll simply perform 2x faster on every FP16 operation. It probably won't, because e.g. both FP16 calculations in each ALU need to be doing the exact same operation, and you don't know how effective the scheduler will be at pairing up this work.
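
A minimal sketch of what that constraint looks like in practice, using standard CUDA FP16 intrinsics for an sm_53-class part like the TX1 (nothing Switch-specific, just an illustration of packed half2 math):

```cuda
#include <cuda_fp16.h>

// Illustrative only: packed FP16 ("double rate") math on sm_53-class GPUs
// such as Tegra X1. Each 32-bit register holds two FP16 values, and one
// instruction applies the same operation to both halves -- which is why
// both FP16 calculations per lane have to be the exact same op.
__global__ void fma_half2(const __half2* a, const __half2* b,
                          const __half2* c, __half2* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // out[i] = a[i] * b[i] + c[i], evaluated for both packed halves at once
        out[i] = __hfma2(a[i], b[i], c[i]);
    }
}
```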

Do you know how one feature compares to another regarding practical performance? I don't. Probably, few people do.
And async has been in use for over 3 years by all major developers for the PS4Bone. Who's been using FP16?

As great as half precision is, you can't just pretend there's a boost to 400 GFLOPS in handheld mode that will make them equivalent to PS4Bone GFLOPS.
 
Besides, there are limits to using FP16 on Maxwell 2.5; you can't assume it'll simply perform 2x faster on every FP16 operation. It probably won't, because e.g. both FP16 calculations in each ALU need to be doing the exact same operation, and you don't know how effective the scheduler will be at pairing up this work.

Isn't that basically how GPUs work? GPUs are optimized for taking huge batches of data and performing the same operation over and over very quickly. So yes, I do believe that using an FP16 shader instead of an FP32 shader would result in a significant speedup where applicable. If more than half of your shaders can be done in half precision, your shader code will see a significant speed-up. This is just one area of the rendering pipeline, so it's not a 2x speed-up overall, but again, it's a big advantage the Switch has compared to previous-gen consoles.
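
As a rough back-of-envelope (my own illustrative fractions, not measurements): if a fraction f of a frame's GPU time is FP16-friendly ALU work running at double rate, the whole-frame gain follows an Amdahl-style curve, which is how you land at "significant but nowhere near 2x":

```cpp
// Amdahl-style estimate: whole-frame speedup when a fraction f of GPU time
// runs 2x faster via FP16, while the rest of the frame is unchanged.
// Purely illustrative fractions, not measured data.
#include <cstdio>

int main() {
    const double fractions[] = {0.3, 0.5, 0.7};  // share of frame that is FP16-friendly
    for (double f : fractions) {
        double speedup = 1.0 / ((1.0 - f) + f / 2.0);
        std::printf("f = %.0f%% -> ~%.2fx frame speedup (~%.0f%% faster)\n",
                    f * 100.0, speedup, (speedup - 1.0) * 100.0);
    }
    return 0;
}
```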

Who's been using FP16?
Anyone using Unreal 4 on mobile. It's not supported on the base PS4 and X1, but I believe Cerny did talk up FP16 with the PS4 Pro.

Well there's never been a console using hardware more proven than a 2 year-old chip AFAIK, at least not for the past 20 years.
What are you talking about? Most consoles are built on tech that's at least two years old. GCN graphics cards were launching in 2011, and the HD 7850 that the PS4 is based on was released in March 2012, 19 months prior to the PS4's release. The Wii U was using the 2008 R700 architecture. The 360 and PS3 were really the only ones using cutting-edge tech.

Also, if you can show me a mobile SoC that thumps the TX1 in a product costing less than $299, please link it.
 
I'm just saying you can't have the cake (Nintendo releasing late and using old hardware to take full advantage of said hardware on day one) and eat it too (graphics on Switch games improving dramatically over time).
So you're saying the world is awash with TX1 optimised engines? That UE and Frostbite and Unity and numerous in-house engines had considerable investment in maxing out TX1 for the many millions of Shield consoles out there? Not to mention all Nintendo's Wii U ports.

There has never been a console or piece of hardware in history that hasn't seen notable gains within a few years of release. Even ancient hardware like the C64 and Spectrum has shown new tricks as devs have spent more time with it. Even the PS Vita launched in 2011 using an SGX543MP processor, when the SGX543 had been released nearly three years earlier.
 
Tegra X1 can use half precision, but as far as we know Maxwell can't use async compute in any productive way. sebbbi said around 70% of the pixel shaders in his games could be done in FP16. How much real-world performance does that translate into? How does it compare to Mark Cerny's claim of a 50% GPU performance boost from using async compute on GCN GPUs?
Besides, there are limits to using FP16 on Maxwell 2.5; you can't assume it'll simply perform 2x faster on every FP16 operation. It probably won't, because e.g. both FP16 calculations in each ALU need to be doing the exact same operation, and you don't know how effective the scheduler will be at pairing up this work.

Do you know how one feature compares to another regarding practical performance? I don't. Probably, few people do.
And async has been in use for over 3 years by all major developers for the PS4Bone. Who's been using FP16?

As great as half precision is, you can't just pretend there's a boost to 400 GFLOPS in handheld mode that will make them equivalent to PS4Bone GFLOPS.

Yeah, I'm no expert by any means, but based on what I've been reading we can expect a 30% boost at best if a developer really took advantage of FP16 on Switch.
 
How does it compare to Mark Cerny's claim of a 50% GPU performance boost from using async compute on GCN GPUs?
That's kind of a nonsensical comparison to make. Sure, it's quite easy to make a shadow pass run for 16 ms on Xbox with its 16 rasterized pixels per clock, and we all know GCN also struggles with geometry-rich scenes, so there are plenty of holes to fill with async. But async has disadvantages as well: you cannot run memory-bound kernels in parallel with memory-bound graphics without performance losses, and there are tons of constraints. All of this ends up with the chip being power-constrained, because graphics hardware + CUs running in parallel consume the same amount of energy as if the work were done serially, thanks to clock and power gating. One could simply clock Maxwell higher to keep it on par without any async shaders.
 