Nintendo Switch Tech Speculation discussion

I only described the specs presented in that SSH-ish console screenshot which seems believable so far (if it's a fake it's a very well done fake)

It wouldn't be a very difficult fake at all, and that raises some questions.

I couldn't find anything about a program/package called getinfo for a familiar *nix. I could however find some example getinfo scripts for powershell. And on what sort of OS other than Windows would paths use backslashes? Anything reporting "516 cores" under CPU architecture would be pretty bizarre. nVidia or AMD might like numbers that combine CPU cores with GPU SIMD lanes for their marketing but it makes no sense in this context.

If this is legitimate it would almost certainly have to mean one of the following:

- The recently leaked disassembly pictures are fake. Which would be much, much harder than faking this terminal dump.
- The recently leaked disassembly pictures represent obsolete hardware, which seems off given that they don't look rough or incomplete at all and they're only being leaked this close to Switch's release. It'd be a really big deal if Nintendo made such a change to the SoC after being this far along. I've heard claims that the unit is just the one "used for FCC" last August. Such certification isn't going to carry over to cover a radical design change, and that's very little time for Nintendo to support such a change.
- The Shield TV 2017 revision is actually using the same SoC with 2x TX1 performance, which would be extremely odd since nVidia said that they didn't improve the internals at all.
- The shot represents something legitimately run on Switch hardware but is misleading, like containing an incorrect calculation for GFLOPS.
 
I only described the specs presented in that SSH-ish console screenshot which seems believable so far (if it's a fake it's a very well done fake).

What I think is that the TX1 is definitely the chip in the devkits up to July, and so is the 4GB of LPDDR4 at 1600MHz on a 2*32bit memory controller. That leaked HTML is the real thing.
It's the only thing I'm >90% sure of.

What comes in the final production units is what I'm not sure at all.
On one hand we have Eurogamer's specs and clock values (which, according to them, are all from July's devkits), and on the other hand we have the Foxconn leaker from November, who has shown a pristine track record in every single non-subjective claim he's made so far. Every number he saw and everything he measured has been proven right to a ridiculous degree. Stuff like "OMG this is so powerful it's gonna be more powerful than the iPhone 8" should be taken with a huge grain of salt because he's obviously not an engineer nor an executive with access to what's actually inside the chip, and they're definitely not letting the assembly line workers load up Android and 3DMark.
But the actual numbers he's provided so far are ridiculously legit.

This is false; the Eurogamer article says developers were recently briefed. As for the Foxconn leaker, he saw the unit for sure, but he was testing a dev kit, and he also confirms a 4G unit, which is false. He also says he can see the two memory chips but not the model or type, which should be visible to someone who can see the actual components on the motherboard. Why would Eurogamer lie about docs from Nintendo stating these will be the final specs at launch? And why would somebody go through all the damn trouble of tearing apart a Switch with 32GB of storage (dev kits had 64) and a custom X1?

"originally posted by eurogamer
As things stand, a docked Switch features a GPU with 2.5x the power of the same unit running from battery. And while some questions surround the leaked specs above, any element of doubt surrounding these CPU and GPU clocks can be seemingly be discounted. Documentation supplied to developers along with the table above ends with this stark message: "The information in this table is the final specification for the combinations of performance configurations and performance modes that applications will be able to use at launch."
I only described the specs presented in that SSH-ish console screenshot which seems believable so far (if it's a fake it's a very well done fake).
 
There's a "Julia" benchmark as part of the GPUTest suite, but the windows version has a GUI and doesn't really mention compute capabilities.
Anyone here with a Linux machine willing to run that version of GPUTest and see what is returned after running Julia32?
Apparently he ran the test at 480*360.

I couldn't find anything about a program/package called getinfo for a familiar *nix. I could however find some example getinfo scripts for powershell. And on what sort of OS other than Windows would paths use backslashes?
Is it possible the backslashes are there to access the file system from the client machine, in this case the PC sending the julia.h file?
Or maybe it's just the custom OS from Nintendo/nvidia that uses it.

- The recently leaked disassembly pictures represent obsolete hardware, which seems off given that they don't look rough or incomplete at all and they're only being leaked so close to Switch's release. It'd be a really big deal if Nintendo made such a change to SoC after being this far along.
Nintendo console PCBs usually have the Nintendo logo very visible. The 3DS even has it on both sides. The leaked teardown shows no Nintendo markings whatsoever.
As for the rest, there have been devkits using off-the-shelf components for a long time. Even the mobile consoles. Didn't the 3DS devkit use a Tegra 2 at some point?
 
Is it possible the backslashes are there to access the file system from the client machine, in this case the PC sending the julia.h file?
Or maybe it's just the custom OS from Nintendo/nvidia that uses it.

It doesn't make sense that it would work this way, or that the directory would be accessible as "\bin\" on the host Windows machine. It's really hard to imagine connecting to a server on a device that listens on a typical SSH port, is named "localhost", has a pretty typical shell prompt and runs .sh scripts, but uses backslashes.

There've been some other questionable things raised in the NeoGAF thread, like "core" being misspelled in that getinfo output (reasonable for some random test script someone just bashed together for a very specific purpose, but probably not something you'd expect in official tooling at this stage), as well as the huge difference between the reported CPU and GPU temperatures despite being on the same die. A near-production unit with a failing CPU core temperature readout (which would come from an on-die sensor) is also a cause for alarm, although it could be a software issue.

Nintendo console PCBs usually have the Nintendo logo very visible. The 3DS even has it on both sides. The leaked teardown shows no Nintendo markings whatsoever.

Do we know we've seen every PCB in the unit? It's very compact and multi-segmented compared to 3DS. I've long suspected that nVidia is taking more of an active role in overall design than is typical for Nintendo products, which if true could have some impact on deviations from what they usually do.

I can't really see why they'd leave the logo only off of older PCBs except maybe to throw people off in the case of leaks, but it's pretty obvious that this thing is a Switch without needing the logo.

As for the rest, there have been devkits using off-the-shelf components for a long time. Even the mobile consoles. Didn't the 3DS devkit use a Tegra 2 at some point?

Yes, they've used all sorts of off-the-shelf hardware for early development platforms, but this looks nothing like an early platform. Have you seen a 3DS devkit using a Tegra 2 that actually looks like a retail 3DS? I doubt anything like that seriously existed; what they used probably looked more like standard Tegra 2 eval boards. A Jetson TX1 board would better suit early devkits than pushing out such a complete platform design using an old SoC.

Besides that, I've now seen that the FCC filing says straight up that it's using a mass-production equivalent prototype. This is a really important distinction because like I said before, if it's not mass-production quality you're just going to have to do it again and since the FCC testing didn't actually finish until December they flat out wouldn't have had time to do another on newer hardware. On the other hand, the unit was received by the test facility in early August. If the date codes on the leaked unit are correct and the SoC was manufactured in July that'd mean there'd be no time for it to be anything less than the unit that was FCC tested.

So it's really very probably the final unit.
 
Or the scripts are run on the SDK Emulator that's really run on the host PC as a VM to provide fast prototyping similar to Android development?
 
Or the scripts are run on the SDK Emulator that's really run on the host PC as a VM to provide fast prototyping similar to Android development?

It's possible, but I don't know why Foxconn would have devices that are so far removed from final production or testing.
 
So in regards to that VI terminal screenshot rumor, my thoughts.

1. As pointed out by several folks, Unix doesn't backslash. "\bin\julia.sh" will get you a curt "you're a moron" response from the OS. Who's to say this guy didn't just VT220 or SSH back into his own machine that's spewing the BS he wants to show?

2. It's certainly not the bog-standard GPUTest, but that's unsurprising. You can render a Julia fractal from a command line there, but it gives you a "score" not a result in GFLOPS. (For what it's worth, a 6 EU Sandy Bridge i3 gives you ~128FPS at 480x360 using the Mesa 10.5.9 drivers.)

3. If we're assuming the Julia fractal is roughly as difficult as a Mandelbrot to render, the CPU isn't going to help for dick. A Pixel C turns in just over 4 GFLOPS in the Mandelbrot MT test on Geekbench 3 (source)

4. The GFLOPS relative to potential core counts doesn't line up. Maxwell and Pascal are both available in multiples of 128-core SMM/SMPs - so you have 256, 384, and 512 for options. Both Maxwell and Pascal do 2 FP32 ops per cycle.

1005 MHz x 256 cores x 2 = 514.56 GFLOPS
1005 MHz x 384 cores x 2 = 771.84 GFLOPS
1005 MHz x 512 cores x 2 = 1029.12 GFLOPS

The 875.77 GFLOPS number falls between the 384 and the 512 core values.

If we assume that the 4x Cortex-A57 cores generate about 31.57 GFLOPS, that leaves 844.2 GFLOPS to reverse out of the GPU, which yields 844.2 * 1000 / 2 / 1005 = 420 cores, which isn't a multiple of 128 either.
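To make the arithmetic explicit, here's the same math as a quick Python sanity check. It only uses the assumptions already stated: 2 FP32 ops per core per cycle and the screenshot's 1005MHz clock, with the 31.57 GFLOPS CPU share being just the rough estimate above.

Code:
# Same arithmetic as above. Assumes 2 FP32 ops per CUDA core per cycle
# (Maxwell/Pascal FMA) and the 1005 MHz clock from the alleged screenshot.
GPU_CLOCK_MHZ = 1005
OPS_PER_CORE_PER_CYCLE = 2

def gpu_gflops(cores, clock_mhz=GPU_CLOCK_MHZ):
    """Theoretical FP32 throughput in GFLOPS."""
    return cores * clock_mhz * OPS_PER_CORE_PER_CYCLE / 1000.0

for cores in (256, 384, 512):                # 2, 3 or 4 SMs worth of cores
    print(cores, "cores:", gpu_gflops(cores), "GFLOPS")
# 256 cores: 514.56, 384 cores: 771.84, 512 cores: 1029.12

# Reverse the claimed 875.77 GFLOPS: subtract the assumed CPU share and see
# what GPU core count the remainder implies at 1005 MHz.
implied_cores = (875.77 - 31.57) * 1000 / OPS_PER_CORE_PER_CYCLE / GPU_CLOCK_MHZ
print(round(implied_cores))                  # 420, not a multiple of 128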

So in conclusion if you believe this "leak" is accurate you're probably high.
 
The 875.77 GFLOPS number falls between the 384 and the 512 core values.

If we assume that the 4x Cortex-A57 cores generate about 31.57 GFLOPS, that leaves 844.2 GFLOPS to reverse out of the GPU, which yields 844.2 * 1000 / 2 / 1005 = 420 cores, which isn't a multiple of 128 either.

I wouldn't consider this by itself to be a massive problem, as it's really hard to come up with a benchmark that both completes a standard task and saturates a GPU's FLOP throughput. You'll fall short of the full amount thanks to thread divergence, excessive loads/stores (particularly in the form of register fills/spills), not having enough threads to hide latency, operations that the GPU can't run at peak rate or must run as multiple more elementary FLOPs but still get counted as single FLOPs in the analysis, memory bandwidth bottlenecks, rasterization/rendering/texturing bottlenecks, and so on and so forth.

The CPU figure also seems too conservative. AFAIK Cortex-A57 has a peak of 8 FLOP/cycle (4xFP32 NEON fmadd/fma). At the alleged 2.143GHz that would mean 68.58 GFLOPs in aggregate. But the research I've seen shows that it's hard to load balance a GPU with a much slower CPU in Julia set rendering while achieving good utilization, so I doubt the results from a test like this would be incorporating both.

That all said, while looking at these numbers I found something else that looks way off. For the benchmark to be meaningful it has to have a stable operation set, and therefore will contain the same total number of FLOPs regardless of how long it takes to run. The FLOP/s number will then be reported via a measurement from performance counters, or more likely derived from an algorithmic analysis that determines the number of FLOPs needed to produce the test output. What this means is that the FLOP/s to FPS ratio should be close to constant, modulo some very small roundoff and maybe measurement error, and that's true even if the clock speed changes during the test. But the ratios we see reported here are far from equal: 0.71 undocked vs 0.92 docked. That makes no sense and looks like numbers that were either doctored or made up entirely.
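To spell out the check: FLOP/s divided by FPS is just the FLOPs spent per frame, which has to come out the same in both modes if the dump is self-consistent. Here's a tiny sketch of that, with placeholder FPS values (I'm not retyping the screenshot; they're picked only so the ratios land near the 0.71 and 0.92 figures above, and 875.77 is assumed to be the docked number):

Code:
# For a fixed-workload benchmark, GFLOPS / FPS = GFLOPs per frame, which should
# be constant regardless of clocks or docked/undocked mode.
# The FPS inputs below are PLACEHOLDERS, not the screenshot's real values.

def gflops_per_frame(reported_gflops, reported_fps):
    return reported_gflops / reported_fps

undocked = gflops_per_frame(400.0, 563.0)    # placeholder inputs -> ~0.71
docked   = gflops_per_frame(875.77, 952.0)   # placeholder FPS    -> ~0.92
print(round(undocked, 2), round(docked, 2))  # a self-consistent dump would print equal values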

The claim that it was performed at 480x320 also seems very weird, why would anyone use that resolution with this device? Except it makes a lot of sense if they only realized after releasing the screenshot that the FPS was way too high for an expected resolution like 1920x1080 or 1280x720.

Another thing that I don't get is what vist in the vim32 directory is supposed to be. I can't find any reference to it. I'd expect vim32 to be one of the win32 ports of the popular text editor. But vist I have no idea about. What I do think is that an ssh client or similar is one of the last things anyone would bother to reinvent, making me wonder if this isn't just what some faker named their fakey fake fake script. The fun thing about said fakery is you don't even need to do a single thing with user input. This could literally be a program that just prints a bunch of crap to the screen.
 
The fun thing about said fakery is you don't even need to do a single thing with user input. This could literally be a program that just prints a bunch of crap to the screen.

Exactly. This could literally be a PowerShell script with a bunch of formatted text.
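Something like this would do the job, shown in Python here rather than PowerShell just to keep the thread's examples in one language; the "output" below is a mock-up of my own, not the actual screenshot's text:

Code:
# A script that just prints something that *looks* like an SSH session.
# No device, no benchmark, no user input required.
import time

fake_dump = """\
dev@localhost:~$ \\bin\\getinfo.sh
CPU  : 516 cores
GPU  : 875.77 GFLOPS
dev@localhost:~$"""

for line in fake_dump.splitlines():
    print(line)
    time.sleep(0.3)   # a small delay even makes a screen capture look "live"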

I understand the appeal, you throw together some pure USDA choice bullshit, post it on /v/ and collect a bunch of (You)s and then see how far your story spun from whole cloth can go. It's highly amusing and I'd be lying if I said I didn't enjoy posting bait to rile people up on forums where frothing fanboys have been known to gather.
 
I've been thinking over the scenario more, and I'm probably belaboring the point, but please bear with me.

The proposed "faster Switch" narrative right now is that Eurogamer's specs reflect an obsolete development unit that will be superseded by a much faster and more power efficient Switch on release day.

This means that, for the benefit of developers, Nintendo used a tremendous amount of engineering resources to put together a device that has the right form factor at the cost of having much worse performance.

I just have to ask.. why? Why would Nintendo think this is a good idea? Or to elaborate, in what way would a devkit that trades in realistic performance for realistic battery capacity and form factor benefit developers? We know that Nintendo can put together a Tegra X1 development platform that clocks a lot higher than 1.02GHz on the CPU side or 302MHz (or even 768MHz) on the GPU side. We know that Shield TV can do a lot better than that. It just uses a lot more power. But why would developers care? They wouldn't want to use a battery powered dev kit to begin with, nor will they care if it's bulkier or heavier than the real thing. What they will care about is making the games perform as well as they possibly can on the final hardware, and that means dev kits that are clocked as closely to the final thing as possible.

If the devkit can clock a lot higher than the final thing can, so be it - so long as you can underclock the thing to run in a production equivalent mode to do your tuning against that's fine. An optional faster mode will help make some debugging and testing tasks go more smoothly. But if the devkit is much weaker than the final hardware, that is a big problem.

And let's not surmise that Nintendo decided only in the last few months to increase the clock speeds by massive margins and move to a totally new SoC, and therefore just had no idea that it was in the pipeline. That's not how any of this works.
 
I've been thinking over the scenario more, and I'm probably belaboring the point, but please bear with me.

The proposed "faster Switch" narrative right now is that Eurogamer's specs reflect an obsolete development unit that will be superseded by a much faster and more power efficient Switch on release day.

This means that, for the benefit of developers, Nintendo used a tremendous amount of engineering resources to put together a device that has the right form factor at the cost of having much worse performance.

I just have to ask.. why? Why would Nintendo think this is a good idea? Or to elaborate, in what way would a devkit that trades in realistic performance for realistic battery capacity and form factor benefit developers? We know that Nintendo can put together a Tegra X1 development platform that clocks a lot higher than 1.02GHz on the CPU side or 302MHz (or even 768MHz) on the GPU side. We know that Shield TV can do a lot better than that. It just uses a lot more power. But why would developers care? They wouldn't want to use a battery powered dev kit to begin with, nor will they care if it's bulkier or heavier than the real thing. What they will care about is making the games perform as well as they possibly can on the final hardware, and that means dev kits that are clocked as closely to the final thing as possible.

If the devkit can clock a lot higher than the final thing can, so be it - so long as you can underclock the thing to run in a production equivalent mode to do your tuning against that's fine. An optional faster mode will help make some debugging and testing tasks go more smoothly. But if the devkit is much weaker than the final hardware, that is a big problem.

And let's not surmise that Nintendo decided only in the last few months to increase the clock speeds by massive margins and move to a totally new SoC, and therefore just had no idea that it was in the pipeline. That's not how any of this works.

Aside from all that, Eurogamer has seen docs where it says these specs will be used in launch hardware. The reality is these people just want to believe; it's like trying to explain their odds to a lotto addict, they don't care, they just want something to hope for.
 
I only described the specs presented in that SSH-ish console screenshot which seems believable so far (if it's a fake it's a very well done fake).
It's a text print out - easiest thing to fake!

What comes in the final production units is what I'm not sure at all.
But presented with the evidence, surely you then have to think about it to decide authenticity and what it means for the machine?

Okay, so you didn't think about it at the time. Now I've raised the point, what do you think is happening here? A fake because it's easy to troll? A hardware that's all GPU and no bandwidth? That the whole expected memory subsystem is wrong and there's way more BW than that?
 
I just have to ask.. why? Why would Nintendo think this is a good idea? Or to elaborate, in what way would a devkit that trades in realistic performance for realistic battery capacity and form factor benefit developers? We know that Nintendo can put together a Tegra X1 development platform that clocks a lot higher than 1.02GHz on the CPU side or 302MHz (or even 768MHz) on the GPU side. We know that Shield TV can do a lot better than that. It just uses a lot more power. But why would developers care? They wouldn't want to use a battery powered dev kit to begin with, nor will they care if it's bulkier or heavier than the real thing. What they will care about is making the games perform as well as they possibly can on the final hardware, and that means dev kits that are clocked as closely to the final thing as possible.
Now I've raised the point, what do you think is happening here? A fake because it's easy to troll? A hardware that's all GPU and no bandwidth? That the whole expected memory subsystem is wrong and there's way more BW than that?

Nintendo has done stupider things in the past (e.g. everything about Wii U).
Possible explanation for having a severely underpowered devkit? Nintendo wanted devs to be able to experiment with the console in its various modes (docked, undocked with the controllers attached and in hand, undocked on a table with the controllers detached) while getting a sense of battery life, the size of UI elements when 2 players are looking at the 6" screen from a distance, ideas for gameplay, etc. And this would come before sheer power, at least for the first generation of games.
So yes, this would be a devkit showing the console's final physical form, for developers to think about gameplay ideas and functionality before thinking about squeezing the best graphics they can out of the system.

Does this sound believable coming from Nintendo?


The claim that it was performed at 480x320 also seems very weird, why would anyone use that resolution with this device?
To avoid fillrate + bandwidth constraints while trying to measure compute performance?



It's a text print out - easiest thing to fake!
Yes, but many things make sense, like clock speeds if it's a 16FF chip compared to the 20nm chip in the Shield TV.

I'll answer your question below so please answer me this one:
- Given TX1's specs and clocks at 20nm, you don't think nvidia could develop a gaming-specific 16FF SoC with a GPU carrying 4 SMs that clock at 400MHz undocked and 1GHz docked with active cooling? Same thing with some Cortex A72 (or even Denver?) cores that clock between 1.8GHz and 2.1GHz in undocked/docked modes respectively?
In "undocked mode" with low power and no fan, the Pixel C gets by with a 20nm SoC at 4*A57 @ 1.9GHz + 2 SM @ 850MHz for a while before throttling, but a hypothetical 16FF SoC with 4*A72 @ 1.8GHz + 4 SM @ 400MHz sounds unrealistic?
In "docked mode" with power coming directly from the wall and active cooling, the Shield TV does 2GHz CPU + 1GHz GPU for a while before throttling, but the same hypothetical 16FF SoC with 4*A72 @ 2.1GHz + 4 SM @ 1GHz sounds unrealistic?
Regarding bandwidth, I don't think Wide I/O 2 or even quad-channel LPDDR4 should be left aside for a custom gaming SoC.
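Just to put rough numbers on that 4-SM scenario, assuming 128 CUDA cores per Maxwell/Pascal SM and 2 FP32 ops per core per cycle (the same assumptions used earlier in the thread):

Code:
# FP32 throughput for the hypothetical 16FF part vs a stock TX1, assuming
# 128 CUDA cores per Maxwell/Pascal SM and 2 FP32 ops per core per cycle.
CORES_PER_SM = 128

def gflops(sms, clock_mhz):
    return sms * CORES_PER_SM * clock_mhz * 2 / 1000.0

print("TX1, 2 SM @ 1000 MHz:", gflops(2, 1000))                    # 512.0
print("Hypothetical undocked, 4 SM @ 400 MHz:", gflops(4, 400))    # 409.6
print("Hypothetical docked, 4 SM @ 1000 MHz:", gflops(4, 1000))    # 1024.0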

Sure, the odds right now are tipping in favor of the production units bringing just a boring old TX1, but there's conflicting info about it.

For example, the specs described in this rumor/leak would be a perfect match for Unreal Engine 4's performance profiles.


And another thing that doesn't make sense is how nvidia is about to launch a Shield Console 2 (the handheld) with a Tegra X1 and 4GB LPDDR4. They filed for FCC approval in July 2016, so well after the Switch contract was made.
Nintendo is just going to let them do that and risk a direct handheld competitor from nvidia with the exact same processing hardware, possibly at a lower price point and a better portfolio at launch?
 
It's a text print out - easiest thing to fake!

But presented with the evidence, surely you then have to think about it to decide authenticity and what it means for the machine?

Okay, so you didn't think about it at the time. Now I've raised the point, what do you think is happening here? A fake because it's easy to troll? A hardware that's all GPU and no bandwidth? That the whole expected memory subsystem is wrong and there's way more BW than that?

If you compare the bandwidth with the PC GPU equivalent of the Xbox One GPU, the HD 7770, it's not 1/10th but rather 1/3rd, since that GPU has 72GB/s of bandwidth. Maxwell also employs bandwidth-saving techniques that the HD 7770 does not, so the difference is even smaller.

Plus, where are you getting that 1/10th from? Are you adding the embedded RAM and the system memory together? As can be seen from the lower quality of Xbox One versions of games versus PS4, that's hardly a fair way to count it. The system memory bandwidth alone stands at just 68GB/s; Switch's 25GB/s is about 40% of that.

I think you're blowing the memory bandwidth issue out of proportion. This is a console targeted mostly at 720p. 25GB/s of bandwidth is not ideal, but it's far from terribly imbalanced for an 800 GFLOPS GPU. We're not in 2007 anymore; modern GPUs have much better memory management techniques that reduce the need for bandwidth (including anti-aliasing techniques that rely less on memory).
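For reference, the ratios in question, assuming the 25.6GB/s that a 2*32bit LPDDR4 interface at 1600MHz (3200 MT/s) works out to, i.e. the configuration from the leaked devkit page:

Code:
# Bandwidth ratios being argued about. 25.6 GB/s assumes 2x32-bit LPDDR4 at
# 1600 MHz (3200 MT/s) from the leaked devkit page; 72 GB/s is an HD 7770,
# 68 GB/s is Xbox One's DDR3 system memory (ESRAM left out).
switch_bw = (2 * 32 / 8) * 1600e6 * 2 / 1e9  # bus width in bytes * clock * DDR
print(round(switch_bw, 1))          # 25.6 GB/s
print(round(switch_bw / 72, 2))     # 0.36 -> roughly a third of the HD 7770
print(round(switch_bw / 68, 2))     # 0.38 -> close to 40% of XB1's DDR3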
 
Yes, but many things make sense, like clock speeds if it's a 16FF chip compared to the 20nm chip in the Shield TV.
The mark of a good troll is picking believable numbers.

I'll answer your question below so please answer me this one:
Not seeing your answer. :???:
- Given TX1's specs and clocks at 20nm, you don't think nvidia could develop a gaming-specific 16FF SoC with a GPU carrying 4 SMs that clock at 400MHz undocked and 1GHz docked with active cooling?
That's possible.

And another thing that doesn't make sense is how nvidia is about to launch a Shield Console 2 (the handheld) with a Tegra X1 and 4GB LPDDR4. They filed for FCC approval in July 2016, so well after the Switch contract was made.
Nintendo is just going to let them do that and risk a direct handheld competitor from nvidia with the exact same processing hardware, possibly at a lower price point and a better portfolio at launch?
I'd argue that goes the other way. Why would nVidia give Nintendo their flagship part, the one they want to use to sell their own hardware? Switch will be in direct competition with Shield. nVidia would be better off giving Nintendo the TX1 and using the TX2 themselves. Yet nVidia aren't using the TX2 in their own handheld? Why?
 
Guys who are considering the possibility that these higher specs are true, despite the TDP/cost/die size/memory bandwidth issues... just look at the visuals for 1st party and 3rd party games. These higher specs are not true.
 
Yet nVidia aren't using TX2 in their own handheld? Why?

This is the ultimate answer to the "Why TX1 in Switch?" question. TX2 is not a mobile part; it's a part designed for embedded applications such as in-car entertainment and automation at the moment. I believe there will be a part down the road that's suitable for mobility applications, but right now the TX1 is the best mobile part NV has.
 
Guys who are considering the possibility that these higher specs are true, despite the TDP/cost/die size/memory bandwidth issues... just look at the visuals for 1st party and 3rd party games. These higher specs are not true.

Following the same logic then games like Watchdogs shouldn't have been downgraded. Or compare FFXV images mere months before release. It looked terrible! Everyone was then surprised the release was so polished. Pre release imagery means nothing. I'm not saying the specs are true or not, just that they are plausible until we know otherwise.
 
While I believe the DF-reported specs are correct, I can see Nintendo giving developers lower specs to work with, so that the final hardware can run poorly optimised titles at a locked 60fps without frame drops or slowdown.
 
Following the same logic then games like Watchdogs shouldn't have been downgraded. Or compare FFXV images mere months before release. It looked terrible! Everyone was then surprised the release was so polished. Pre release imagery means nothing. I'm not saying the specs are true or not, just that they are plausible until we know otherwise.
Fair enough.
 