Nintendo Switch Technical discussion [SOC = Tegra X1]

Further proof that TX1 is bandwidth-limited is how the iGPU in Tegra X2 simply got moderate clock increases while memory bandwidth increased 2.3x.
It doesn't prove anything.
TX2 is an SoC targeted at the automotive market, so it needs that bandwidth to handle many cameras and other automotive-specific use cases. That has nothing to do with the GPU being bandwidth-limited; mobile benchmarks are a much better indicator.

A trusted source here wrote that a Cortex-A57 core can consume as much as 5 GB/s, but let's assume it's only ~3 GB/s because of their 1 GHz clocks, so with four cores we're left with 21.3 - 12 = 9.3 GB/s for the GPU
Exynos 7420 shows 5 GB/s in the multi-core Copy test and 10 GB/s in the multi-core Bandwidth test of Geekbench 4 (the latter is essentially a hand-vectorized version of the STREAM test from Geekbench 3). Cut those results in half and that's most likely what one can get out of four 1 GHz A57 cores; actually the number should be even smaller considering that only 3 cores are available to games
 
Eh... When you compare Pascal vs Maxwell shouldn't you take this into account?
Judging by these images, the improvements happen mostly on polygon edges, and unsurprisingly, almost every benefiting game in the list has MSAA support, which works only on polygon edges. I doubt such improvements are crucial for a mobile GPU
 
erm... so was the X1. e.g. fp16 was for both deep learning & power consumption.
X1 was targeted for tablets and phones.
X2 with 2 Denver and 4 A57 cores is clearly not targeted at phones, and probably not at tablets either; at least there is no evidence in either marketing materials or announced devices
 
I'm perfectly aware of Drive PX; the first one was done mostly for research. Are there any autonomous cars with Drive PX 1?
 
Why does it have to be in a product to validate whether the chip was designed (partly) with automotive intentions?

But ok...
https://blogs.nvidia.com/blog/2015/01/06/audi-tegra-x1/


1 - Your Pascal desktop cards don't have to share memory bandwidth with CPU cores.
What would be curious is to see how bandwidth contention affects Tegra (as there's no LLC shared between the CPU/GPU as far as I can tell?)

e.g. let's say docked mode places a higher burden on bandwidth, how badly does the CPU get choked, and then vice versa. etc.
 
Where is this coming from?
I'm a "Playstation Gamer" but I certainly don't dismiss my PC with two R9 290X cards, a 10-core Ivy Bridge Xeon and 64 GB of quad-channel RAM, on a curved 3440×1440 screen.
And yet I do recognize the best-looking game I've ever seen is Uncharted 4 running in my 4K HDR TV through a PS4 Pro.

It's coming from the primary argument often made against the Switch: that it's not as powerful as the PS4 and Xbox One, and the sentiment that playing anything with less fidelity than those consoles is simply out of the question. Basically, people create their own standards and then impose them onto other people. You may find the Switch to be weak-ass hardware, and by the same token a PC gamer might find the PS4 to be weak-ass hardware. The Switch's trump card is being able to play portably. By narrowing the comparison down to home consoles only, the Switch stacks up rather poorly, but how well do the Xbox One/PS4 and PC do as portables? It's not fair to criticize them on that criterion, right? Well, I say it's also unfair to grade the Switch only on its least favorable metrics.
 
TX2 is an SoC targeted at the automotive market, so it needs that bandwidth to handle many cameras and other automotive-specific use cases. That has nothing to do with the GPU being bandwidth-limited; mobile benchmarks are a much better indicator.
For all we know, the TX2 may be just as automobile-focused as TX1.
TX2 hasn't reached the end of its life, so you still don't know exactly what devices it's going into. Smartphones are obviously out of the question, but I don't see why it couldn't fit a tablet or a high-end set-top box. In Max-P mode it would most probably surpass the Shield TV or any other ARM SoC in every possible measure (gaming, web browsing, app performance, etc.).

If the TX2's sole purpose was to get into Drive PX2, then there wouldn't be an upgraded video decoder with 12-bit HDR capabilities and hardware for 3x display outputs. There would hardly be any need for a fully-featured iGPU with geometry modules and ROPs, either.

In the end, marketing materials tend to be focused on a particular audience. You wouldn't expect the Drive PX2 slides to talk about how great Parker is for mobile gaming, would you?
 
There is nothing wrong with being a value-minded consumer, and I think the PS4 and Xbox One offer tremendous value for those looking for a good system to play games on their TV. If you're going to use graphics as your primary argument for criticizing the Switch, and you don't game on a PC, it's hypocritical, because the PC offers superior graphics to the PS4/Xbox One. We simply do not know what western third-party support will look like, and should wait until this fall to see what, if any, major AAA games are being ported. As for 360/PS3 ports struggling, what ports are you referring to?

Resident Evil 5 and Metal Gear Rising run like crap on Shield TV, and Dragon Quest, a PS3 port, also runs like crap both in handheld mode and docked. Also, we're not talking about graphics; nobody expected it to match the PS4/XB1, but the specs are just too low for a home console in 2017. And sorry, it won't get many ports even if it's a success, for 2 main reasons: the specs, and the fact that third-party games have a history of not doing well on Nintendo consoles.
 
Resident Evil 5 and Metal Gear Rising run like crap on Shield TV, and Dragon Quest, a PS3 port, also runs like crap both in handheld mode and docked. Also, we're not talking about graphics; nobody expected it to match the PS4/XB1, but the specs are just too low for a home console in 2017. And sorry, it won't get many ports even if it's a success, for 2 main reasons: the specs, and the fact that third-party games have a history of not doing well on Nintendo consoles.
But that's with the large overhead that comes with the Android OS, and let's be real, those ports were probably done as fast as possible. On the flip side, Doom 3 runs much better on the Shield than it does on the 360, at a full 1080p. Not that Doom 3 on 360 and PS3 was a good port either.

I agree Nintendo could've done better, but let's not judge the console's performance based on its launch games. Games should be looking and running a hell of a lot better in a year. It won't be getting games like The Witcher 3, but indies and smaller-scale games will flock to it. If sales really pick up, I'd say it'll get a few bigger games here and there.
 
Those games are running on Android with Shield TV and not a low level API, Sebbbi went over this in the speculation thread in detail.

(From Sebbbi)This is the first time we see modern Nvidia GPU with their own custom low level API. This is a huge difference compared to Tegra K1/X1 seen in Android tablets with OpenGL ES. I expect significantly better GPU and CPU utilization. It's also worth noting that low level APIs on shared TDP devices (integrated GPU) are a much bigger deal compared to desktops, since reduced CPU usage leaves more TDP to the GPU. Intel showed nice iGPU performance improvements at DX12 launch, just by reducing the CPU power usage.

Nintendo Switch on the other hand is a closed platform. Fixed hardware setup. Low level API that exposes all features. Technology will be designed and optimized directly to it. These are huge advantages over common Android devices. Mobile games are also often power optimized. Frame rate is limited in order to prevent the device getting too hot and running out of battery too quickly. Nintendo Switch on the other hand is a pure game device, designed to run at full clocks during long game sessions. You can't really compare it to common Android devices.

My educated guess is that Switch is much easier to develop for than last-gen consoles (Xbox 360, PS3, Wii U).

Reasons:
1. Switch has a modern OoO CPU. Last-gen CPU code was horrible. Lots of loop unrolling to avoid in-order bottlenecks (no register renaming). Lots of inlined functions, because with no store-forwarding hardware there was a 40+ cycle stall for reading function arguments from the stack. No direct path between int/float/vector register files (going through memory = LHS stall). Variable shift was microcoded (very slow). Integer multiply was very slow. There was no data-prefetching hardware: the developer had to manually insert prefetch instructions (even for linear array iteration), and an L2 cache miss without a prefetch meant a ~600 cycle stall (with no OoO to hide any of it). Code was filled with hacks to avoid all these CPU bottlenecks.
2. Switch has a modern GPU and a unified memory architecture. Last gen had either EDRAM or split memory (PS3). You always had to fight to fit data into fast GPU memory (256 MB on PS3 was the largest pool, but that was all of it). Switch's 4 GB of unified memory is going to be a life saver. In addition, Maxwell has delta color compression and a tiled rasterizer to automatically reduce memory bandwidth usage (last-gen consoles had no such fancy hardware). Maxwell also has compute shaders, allowing more flexibility and efficiency in rendering techniques.
 
But ok...
"Audi confirmed today that it will use the new mobile superchip in developing its future automotive self-piloting capabilities." - exactly what I was talking about: TX1 is for research

e.g. fp16 was for both deep learning & power consumption.
That FP16 support is clearly for gaming; it would take a huge amount of time to train a decent network on a mobile GPU. Parker, on the other hand, has the dp4a instruction, useful for inference, something which really makes sense for the auto market

For all we know, the TX2 may be just as automobile-focused
TX2 is automobile-focused, TX1 is not.

TX2 hasn't reached the end of its life, so you still don't know exactly what devices it's going into
I know that it has not been advertised for tablets. Sure, they can sell TX2 for tablets or produce a new tablet themselves, but this doesn't change the fact that the chip was designed for the auto market and its requirements, and thus it would likely be suboptimal for other use cases/niches

If the TX2's sole purpose was to get into Drive PX2, then there wouldn't be an upgraded video decoder with 12bit HDR capabilities and hardware for 3x display outputs.
NVIDIA sells chips for media systems in cars as well, which can drive several displays

There would hardly be any need for a fully-featured iGPU with geometry modules and ROPs, either.
A UI with lots of semitransparent surfaces on several displays can consume a lot of fillrate and bandwidth, so ROPs are mandatory here

You wouldn't expect the Drive PX2 slides to talk about how great Parker is for mobile gaming, would you?
I would expect a much more cost-efficient chip for mobile gaming; TX2 is not about mobile-gaming bang for the buck
 
So is Nvidia making another gaming-centric mobile chip in the future?

Even if it's not aimed at gaming, the upcoming Tegra chip with Volta (Xavier) will apparently have twice the CUDA cores of TX1/TX2, so I guess we should expect a nice increase in gaming performance over those two. They are still marketing it as an SoC aimed at autonomous vehicles, though.
 
Those games are running on Android with Shield TV and not a low level API, Sebbbi went over this in the speculation thread in detail.

Yeah, I know this, but the clock speeds are much faster on the Shield TV, especially the CPU. My point is, some last-gen ports struggling on hardware that's 10x weaker than the PS4 doesn't bode well for Switch ports, and developers really have their work cut out for them.
 
Yeah, I know this, but the clock speeds are much faster on the Shield TV, especially the CPU. My point is, some last-gen ports struggling on hardware that's 10x weaker than the PS4 doesn't bode well for Switch ports, and developers really have their work cut out for them.

We know from tests over at GAF that the Shield TV does throttle, so the clock speeds aren't as much higher on Shield TV as once thought. Also, if hardware utilization sucks, clock speeds aren't going to make up for that. On top of that, how much time do you think these developers spent on ports going to an Android TV box that has sold around 35k units in its lifetime? Very little, and I think you know that.

I don't disagree that porting to Switch won't be a cut-and-paste scenario. This is where the business side of things takes over: it's investment for return. If the Switch is selling well, publishers are likely to fork out a little cash for a port to see what the return on investment looks like, and they will try to decide what makes sense. Battlefield might be incredibly hard to port to Switch and isn't likely to sell big numbers, so it doesn't make sense to do it. However, what about Garden Warfare? Much easier to port and far more likely to strike a chord with the traditional Nintendo crowd.
 
If they were promised a custom version (https://blogs.nvidia.com/blog/2016/10/20/nintendo-switch/) , would you say Nintendo "got Nvidia'd?"
Nintendo probably goes for proven, reliable designs, though. Like how the New Horizons probe uses a radiation-hardened version of the PlayStation 1's MIPS processor rather than something newer.

This is good news, though; at some point they might allow the smaller cores to be used, for a Tetris-type game maybe, so that you get 30 minutes of extra playtime or something
The problem is that customising a chip requires personnel, time and money, and Nintendo simply decided not to do it.


Making a custom chip requires a higher initial investment, since it involves hiring staff to handle it alongside Nvidia's people. It also means having a production line for you and only for you.

Nintendo looked at their accounts and decided it was not worth it. Using the Nvidia Tegra X1 as-is would be cheaper in the short and medium term.

And that's fine. It doesn't detract from the fact that it could have been heavily customised; other consoles, like the PS4 models and Xbox One, have a crappy CPU too, but it was a proven design back then.
 