Can iPad Pro out-game an XB360? *spawn

Overall though, it still seems to me that the A9x falls somewhere in the middle between the XB360 and the XB1.
Agreed. I would say it is roughly in the middle (or maybe a little closer to the X1). However, it is hard to make a valid comparison (using actual shipped games), since console games are specifically designed for a single hardware target, while mobile games tend to target lower specs to reach bigger audiences.

The big form factor of the iPad Pro gives it quite a bit of thermal headroom, so it delivers solid performance over longer periods of gaming. Phones with high-performance SoCs start to throttle after a few minutes if you fully stress all the CPU cores (especially 8-core Androids) and/or push the GPU heavily (i.e. not capping the frame rate to 60 and letting the GPU idle).

The iPhone throttles quite heavily in CPU-heavy tasks (even when the GPU is fully idle). You lose roughly 20% of the performance in 10 minutes of heavy use: http://arstechnica.com/apple/2014/09/iphone-6-and-6-plus-in-deep-with-apples-thinnest-phones/3/. Games stressing both the CPU and the GPU concurrently throttle even more. Phones will likely not catch up with current consoles before the next two process shrinks; tablets certainly will in a few years.
 
Wake me up when a Digital Foundry comparison shows a 360 game running stably on an iPad Pro with equal or better settings, resolution and frame rate. Bandwidth, heat, latency, missing GPU shading and texturing power, and OS overhead have so far kept a lot of theoretical power very... theoretical.
 
Wake me up when a Digital Foundry comparison shows a 360 game running stably on an iPad Pro with equal or better settings, resolution and frame rate. Bandwidth, heat, latency, missing GPU shading and texturing power, and OS overhead have so far kept a lot of theoretical power very... theoretical.
The iPad Pro has around 3x the GPU flops of the Xbox 360 (or double that if you count FP16), 2x the memory bandwidth, and a much higher texturing rate as well. OS overhead is of course higher, but Metal significantly improves things compared to OpenGL. A console based on iPad Pro hardware would certainly beat the Xbox 360 by a wide margin.
 
All that power is probably being used for sketching apps or mobile games.

If not console-quality games, what would be the killer apps for it? They demoed loading a large AutoCAD model, but that is strictly niche.
 
The iPad Pro has around 3x the GPU flops of the Xbox 360 (or double that if you count FP16), 2x the memory bandwidth, and a much higher texturing rate as well. OS overhead is of course higher, but Metal significantly improves things compared to OpenGL. A console based on iPad Pro hardware would certainly beat the Xbox 360 by a wide margin.

Again, show me the money - I mean - games.
 
Again, show me the money - I mean - games.
The sole reason for my reply was to fix the technical misinformation (about iPad vs Xbox 360 shading and texturing performance). If Sony or Nintendo made their next handheld with this exact hardware, we would surely see better looking games compared to the Xbox 360.
 
I think the ridiculous resolution of mobile devices is the primary reason the games don't appear to look as good as 360 games.

Then again, I picked up a Galaxy S7 Edge earlier this week and I'm astonished by how well some games run on it. Mobile technology is really amazing these days.
 
I think the ridiculous resolution of mobile devices is the primary reason the games don't appear to look as good as 360 games.
Yes. Xbox 360 = 720p = 921k pixels. iPad Pro = 2732x2048 = 5595k pixels. That's a 6x difference. You would need a 6x faster GPU to reach parity in pixel quality if both rendered at "native" resolution. Thankfully Sony equipped the PS Vita with a lower resolution display instead of the 2048x1536 display found in the first retina iPad (which had similar hardware).
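
As a quick sanity check of that ratio (nothing more than the arithmetic quoted above):

[code]
# Pixel counts quoted above: Xbox 360 at 720p vs. iPad Pro at native resolution.
xbox360 = 1280 * 720       #   921,600 pixels
ipad_pro = 2732 * 2048     # 5,595,136 pixels
print(ipad_pro / xbox360)  # ~6.07x more pixels to shade at native res
[/code]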
 
I think the ridiculous resolution of mobile devices is the primary reason the games don't appear to look as good as 360 games.

No one is stopping developers from rendering at 720p or lower on the iPad, though.
 
True, but you don't need native res for anything save maybe UI and text. Render at 900p or 1080p (at the equivalent aspect ratio) and upscale. Don't a lot of games do this anyway?
 
True, but you don't need native res for anything save maybe UI and text. Render at 900p or 1080p (at the equivalent aspect ratio) and upscale. Don't a lot of games do this anyway?

AFAIK, most 3d games on tablets and smartphones already do that, yes.
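
For illustration, a minimal sketch (hypothetical numbers, assuming the iPad Pro's 2732x2048 panel) of picking a lower internal render resolution and letting a final upscale handle the rest:

[code]
# Pick an internal render resolution that keeps the display's aspect ratio;
# the GPU shades at this size and a final blit upscales to native.
DISPLAY_W, DISPLAY_H = 2732, 2048  # iPad Pro native resolution

def internal_resolution(target_height):
    scale = target_height / DISPLAY_H
    return round(DISPLAY_W * scale), target_height

print(internal_resolution(1080))   # (1441, 1080) -> "1080p-equivalent"
print(internal_resolution(720))    # (960, 720)   -> "720p-equivalent"

w, h = internal_resolution(1080)
print(f"shading cost vs native: {w * h / (DISPLAY_W * DISPLAY_H):.0%}")  # ~28%
[/code]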


720p on an iPad (Pro or not) would look shockingly bad, considering the distance at which a user typically holds the device. No thanks.
720p for some games looks great on my Surface Pro 4...
Not long ago, the PS3 and X360 were doing 600p on >40" screens...
 
The sole reason for my reply was to fix the technical misinformation (about iPad vs Xbox 360 shading and texturing performance). If Sony or Nintendo made their next handheld with this exact hardware, we would surely see better looking games compared to the Xbox 360.

OK, fair enough! Though the point still holds that such a handheld wouldn't be hampered as much by temperature or battery constraints, and that's part of my skepticism about these devices. Never mind that the 360 isn't really cutting-edge technology anymore either, of course. So I can't get too excited about this kind of news until I see something I actually want it for.
 
Unfortunately, they're not really comparable. GFXBench for Android makes use of FP16, for which the Rogue 7 series has a large amount of dedicated hardware, whereas the Windows version only uses FP32 shader effects.

Do you have a quote from Kishonti or a statement that verifies the FP16 story, or is it just your gut feeling? Because the first counterexample that murders your theory is comparing, for example, G6200 scores to G6230 scores. The former has no dedicated FP16 SPs; the latter does.

No Rogue is cutting any corners, and those cores that contain additional FP16 SPs can use a SIMD for either FP16 or FP32.

There's a substantial difference in shader output (Tegra X1's Maxwell 2.5 does twice the FP16 throughput that it does with FP32, using the same ALU resources) and in required memory bandwidth.

No it doesn't; you can get 2*FP16 from a single FP32 lane on the X1 GPU only under conditions such as the instructions being identical. The X1 GPU is roughly twice as fast vs. the GK20A in K1 because:

1. Maxwell ALUs are more efficient (see desktop for that)
2. X1 has 2 clusters vs. 1 cluster in K1
3. The X1 GPU clocks at 1GHz vs. ~850MHz for the K1 GPU

The exception to the last point is the Pixel C, where the X1 GPU is clocked at 850MHz but, lo and behold, is no longer twice as fast as the K1 GPU. Where am I missing the "difference" FP16 is supposed to make in that one?

Considering how ALU-bound Manhattan 3.0, and even more so Manhattan 3.1, is, the X1 GPU should actually waltz all over the A9X GPU because:

X1 GPU:
256 SPs @ 1GHz = 512 GFLOPs FP32 or 1024 GFLOPs FP16
A9X GPU:
[strike]768[/strike] 384 SPs @ 0.47GHz = 360 GFLOPs FP32 or 720 GFLOPs FP16
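
For reference, the peak-FLOP arithmetic behind those figures (just FP32 lanes x 2 ops per FMA x clock; the FP16 numbers assume perfect 2x packing):

[code]
# Peak ALU throughput: FP32 lanes x 2 ops/FMA x clock (GHz) -> GFLOPs.
def peak_gflops(lanes, clock_ghz):
    return lanes * 2 * clock_ghz

print(peak_gflops(256, 1.0))   # X1 GPU:  512 GFLOPs FP32 (1024 with 2x FP16)
print(peak_gflops(384, 0.47))  # A9X GPU: ~361 GFLOPs FP32 (~722 with 2x FP16)
[/code]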

Instead:

https://gfxbench.com/result.jsp?benchmark=gfx40&test=631&order=score&base=gpu&ff-check-desktop=0

https://gfxbench.com/result.jsp?benchmark=gfx40&test=545&order=score&base=gpu&ff-check-desktop=0

As impressively as the A9X performs for a tablet chip, its actual performance should be way behind the 7770 - and the XBone, for that matter.

Not that I really care, because a ULP mobile chip for the time being is severely lacking in bandwidth; but apart from that, "way behind" is a wild exaggeration for either of them. I don't even have a clue how many z/stencil units the GT7600 has, but since Apple mirrored that one for the A9X, it might have enough of them to get MSAA at the smallest possible memory and bandwidth cost - and not just 4x samples, if you want to get picky about wasting resources.

Here's an 11-year-old reminder from Wavey regarding the former on Xenos, and of course the geometry processing differences versus a DR: https://www.beyond3d.com/content/articles/4/5
 
True, but you don't need native res for anything save maybe UI and text. Render at 900p or 1080p (at the equivalent aspect ratio) and upscale. Don't a lot of games do this anyway?
Only high-contrast areas (such as edges) need native (300+ DPI) rendering. It's wasteful to brute-force render all pixels at equal quality at retina resolutions. In the future we are going to see smarter techniques.

A good example of variable rate shading (using a custom ordered grid MSAA pattern): http://www.pmavridis.com/research/coarse_shading/

Techniques that reconstruct the native resolution image from multiple (temporally jittered) lower resolution images are also going to be used a lot. If I understood correctly, Quantum Break also reconstructs its 1080p output image from four 720p frames.
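
A toy sketch of that reconstruction idea (my own simplification: a fixed 2x2 sub-pixel jitter and a static scene; real games add motion vectors and history rejection on top):

[code]
import numpy as np

full = np.random.rand(1080, 1920)            # stand-in "ground truth" frame

# Render four quarter-resolution frames, each at a different sub-pixel offset.
offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
low_res = [full[dy::2, dx::2] for dy, dx in offsets]

# Reconstruction: scatter each jittered low-res frame back into its slot.
recon = np.zeros_like(full)
for (dy, dx), frame in zip(offsets, low_res):
    recon[dy::2, dx::2] = frame

print(np.allclose(recon, full))              # True for a static scene
[/code]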
 
Do you have a quote from Kishonti or a statement that verifies the FP16 story, or is it just your gut feeling?

From Anandtech:

In this benchmark the iPad Pro quite handily beats the Surface Pro 4, but it's important to keep in mind that the Surface Pro 4 is running a higher level of precision and that the iPad Pro is running OpenGL ES rather than OpenGL in this test, so it isn't strictly apples-to-apples (nor is such a thing truly possible at this time).

There's also this post from OlegSH. Though if we can summon @AlexV, he'll probably clear up all the doubts.

No it doesn't; you can get 2*FP16 from a single FP32 lane on the X1 GPU only under conditions such as the instructions being identical. The X1 GPU is roughly twice as fast vs. the GK20A in K1 because:
I'm not sure what you're disagreeing with. Perhaps you misread my post?
I wasn't comparing the X1 to K1's GPU. I simply stated that a single ALU in X1 can either do one FP32 operation or 2*FP16 operations, as long as it's the same operation:
for X1, FP16 operations can in certain cases be packed together as a single Vec2 and issued over a single FP32 CUDA core.
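
Purely as an illustration of the packing idea (a numpy sketch of the data layout only, not the actual ISA or issue logic): two FP16 values occupy a single 32-bit lane, which is what lets a paired FP16 operation go through one FP32 core.

[code]
import numpy as np

# Two half-precision values share one 32-bit word (on a little-endian machine).
pair = np.array([1.5, -2.25], dtype=np.float16)
packed = pair.view(np.uint32)

print(hex(packed[0]))  # 0xc0803e00 -> low half = 1.5, high half = -2.25
[/code]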



Not that I really care, because a ULP mobile chip for the time being is severely lacking in bandwidth; but apart from that, "way behind" is a wild exaggeration for either of them.
"Way behind" as in the A9X is definitely closer to 50% of the sustained performance of a PC with an HD7770 or the Xbone than it is to 100%.
According to this picture, IMG themselves would consider the iPad Pro's 12-cluster Series 7XT to have performance equivalent to a GeForce GT 730M:

[image: IMG GPU performance comparison chart]


The GT 730M uses a GK208 with 2*SMX: 32 TMUs, 8 ROPs at 700MHz, for ~550 GFLOP/s.
Both the 1GHz HD7770 and the Xbone top 1.3 TFLOP/s, with over twice the fillrate of a GT 730M.
I'd call that way behind.
 
From Anandtech:

There's also this post from OlegSH. Though if we can summon @AlexV, he'll probably clear up all the doubts.

We've had that FP16/PSNR "throw everything into one pot" rubbish in similar discussions here in the forum before. For the record, IMHO the benefits of having FP16 are bigger in terms of power consumption and smaller in terms of performance in these cases.

Allwinner A80, G6230 @ 533MHz (64 FP32 SPs, 96 FP16 SPs)
T-Rex offscreen: 20.6 fps
Manhattan 3.0: 8.6 fps
Manhattan 3.1: 3.9 fps

Mediatek Helio X10T, G6200 @ 700MHz (64 FP32 SPs)
T-Rex offscreen: 27.1 fps
Manhattan 3.0: 10.2 fps
Manhattan 3.1: 4.9 fps

Care to show me what I am missing, and how the G6230 isn't at least on par with the G6200 clock-for-clock, despite the latter's >30% higher frequency?

I'm not sure what you're disagreeing with. Perhaps you misread my post?
I wasn't comparing the X1 to K1's GPU. I simply stated that a single ALU in X1 can either do one FP32 operation or 2*FP16 operations, as long as it's the same operation:

Here's your exact quote:

Tegra X1's Maxwell 2.5 does twice the FP16 throughput that it does with FP32

I thought you said "2.5x".

"Way behind" as in the A9X is definitely closer to 50% the sustained performance of a PC with a HD7770 or Xbone than it is from 100%.
According to this picture, IMG themselves would consider the iPad Pro's 12-cluster Series 7XT to have performance equivalent to a GeForce GT 730M:

[image: IMG GPU performance comparison chart]


The GT 730M uses a GK208 with 2*SMX: 32 TMUs, 8 ROPs at 700MHz, for ~550 GFLOP/s.
Both the 1GHz HD7770 and the Xbone top 1.3 TFLOP/s, with over twice the fillrate of a GT 730M.
I'd call that way behind.

Hold it; here's the problem: I have the Xbox 360 in mind, while by "XBone" you mean the Xbox One. If you'd followed my crap through the years, you'd know that I avoid comparisons like that unless I insert a footnote that it's a DX10 vs. a DX11 unit. Look above: I mentioned Xenos/C1, which in terms of compliance/functionality is the closest console GPU you can get to the A9X GPU.

Ignore the marketing rubbish of that diagram; I think, but am not sure, that the GT7900 comes with DX11.2. Even with its 16 clusters, and let's say clocked at 600MHz, it could see cases where it embarrasses the Xbox One GPU, but even there I wouldn't believe in a true all-around winner even if I saw one. For that you'd rather need, IMHO, an unannounced 8XT 16-cluster config at DX11.2, and again a frequency not lower than 600MHz. DX11.x needs significantly more die area compared to vanilla DX10.0, and it seems impossible that this would not impact temperatures, power consumption, throttling, etc.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Beyond the above: who said that the A9X doesn't throttle at all? I find it hard to believe that even Apple's GPUs don't throttle by up to 20% in the worst case under constant heavy usage. If anyone is looking at the long-term performance results in GFXBench, please look again, since that test runs T-Rex onscreen, meaning vsynced at 60Hz. When offscreen T-Rex scores are as high as on the A9X, it's fairly impossible to detect or define any possible throttling, or lack thereof. Kishonti now seems to have a Manhattan 3.1 long-term performance test, which should solve that headache for some time.
 
The iPad Pro has a 128-bit interface to LPDDR4 running at 3200MHz, or 51.2GB/s, compared to the XB360's 22.4GB/s. The Xbox One is at 68.2GB/s.
Additionally, in terms of bandwidth-saving technologies, the PowerVR GPU is in a good place.
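
As a quick check of where those headline figures come from (bus width x data rate; the Xbox data rates below are the commonly cited ones, so treat them as assumptions):

[code]
# Peak DRAM bandwidth: (bus width in bytes) x (data rate in MT/s) -> GB/s.
def peak_gbs(bus_bits, mtps):
    return bus_bits / 8 * mtps / 1000

print(peak_gbs(128, 3200))  # iPad Pro LPDDR4:        51.2 GB/s
print(peak_gbs(128, 1400))  # Xbox 360 GDDR3:         22.4 GB/s
print(peak_gbs(256, 2133))  # Xbox One DDR3 (approx): ~68.3 GB/s
[/code]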

Reviews (example) fail to demonstrate throttling on the iPad Pro, which is amazing.
You are leaving essentially half of the Xbox One's memory system out of the equation by not including the ESRAM that holds the G-buffer. In modern console games, G-buffers are generally small in size but use a cosmic amount of memory bandwidth. So really, the Xbox One's effective memory bandwidth is 150+GB/s, and even as high as 200+GB/s in some alpha blending operations.
 
You are leaving essentially half of the Xbox One's memory system out of the equation by not including the ESRAM that holds the G-buffer. In modern console games, G-buffers are generally small in size but use a cosmic amount of memory bandwidth. So really, the Xbox One's effective memory bandwidth is 150+GB/s, and even as high as 200+GB/s in some alpha blending operations.
True, but because the A9X has an internal tile buffer (with much higher bandwidth), you can't really do a straight comparison.
 