SUBSTANCE ENGINE

It is quite strange to have a 1.75GHz CPU run slower than an "equivalent" 1.6GHz one for the same code.
Has further information been disclosed about the engine that supports the assumption that the same code is running?
The same overall task is being accomplished, but there are constant concerns over compiler usage, optimization settings, and different host systems when comparing benchmark scores in other situations.
 
Has further information been disclosed about the engine that supports the assumption that the same code is running?
The same overall task is being accomplished, but there are constant concerns over compiler usage, optimization settings, and different host systems when comparing benchmark scores in other situations.

In any case, we can safely say the Xbox One's CPU does not have a clear advantage over the PS4's. The 1.75GHz Xbox One CPU is indeed faster than a 1.6GHz Xbox One CPU :)
 
Has further information been disclosed about the engine that supports the assumption that the same code is running?
The same overall task is being accomplished, but there are constant concerns over compiler usage, optimization settings, and different host systems when comparing benchmark scores in other situations.

I assumed he made a reasonable effort on both platforms. It shouldn't be that big a performance difference.
 
I assumed he made a reasonable effort on both platforms. It shouldn't be that big a performance difference.

That means we have source and system functions of unknown equivalence run through tool sets of unknown equivalence, or it's all in assembly and the platforms don't vary in their requirements.

This skips over some variables that are large enough to make a difference of tens of percent.
It would be highly debatable whether a desktop CPU benchmark could be considered a measure of the actual CPU if one run is on Windows compiled with ICC while the other is on Linux using GCC.
 
We haven't been able to account for whether the code that CPU is running is fully equivalent, due to unknown code or algorithmic differences, or the known OS and tool set differences. The requirements for that code or its system functions related to the platform and OS are probably not equivalent, even if the code between those functions is for some reason identical.

Worse disputes over results have come from smaller discrepancies.
 
How does Visual C++ codegen compare to Clang/LLVM, assuming default optimization settings?

System calls will touch the OS services, which may give an edge to PS4 if it's leaner and simpler.
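Not a direct answer, but one crude way to eyeball the codegen difference is to build a small hot loop with each toolchain at a typical release optimization level and diff the generated assembly. The function and commands below are only an illustrative sketch of that approach, not what either console toolchain actually ships with.

/* saxpy.c - a trivial hot loop to compare code generation on.
 *
 * MSVC:   cl /O2 /FA saxpy.c     (writes an assembly listing to saxpy.asm)
 * Clang:  clang -O2 -S saxpy.c   (writes saxpy.s)
 *
 * Most of the visible difference between the two code generators shows up
 * in how the loop gets vectorized and unrolled.
 */
void saxpy(float a, const float *x, float *y, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}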
 
If this were a generic comparison, some rather oldish numbers would give an edge to VS.
The custom work for the PS4 and rumored rawness for the Xbox One in the early stages give me pause.

Additional custom features may or may not be involved for the performance numbers in this thread, I haven't seen it stated either way. For an architectural comparison, equivalent optimization or feature usage would be desired, but not so much if the goal is to sell software that can leverage platforms to their fullest.
 
How does Visual C++ codegen compare to Clang/LLVM, assuming default optimization settings?

oh come on.
Are you comparing a compiler with over 20 years of history, the one used to compile the Windows kernel since x86 was THE ONLY platform, to anything else?

LLVM isn't bad at all, yet VS generates better code.

...for the rest, remember that the Xbone runs over a hypervisor, so you lose 2-4% on average (only M$ knows the exact figure, given they know how they vmcall/leave).
Also, if the memory access pattern isn't random, so the memory doesn't have to precharge on nearly every access, GDDR5 will kick DDR3 in the a$$.
CPU prefetchers can take care of the rest.
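For a sense of scale on those two points: 2-4% of hypervisor overhead costs roughly 35-70 MHz of the 150 MHz clock advantage, while the access-pattern effect can be far larger than either. A minimal sketch of the latter (my own illustration, not from the benchmark): pointer-chase a buffer much bigger than L2 once along a sequential cycle and once along a random one, and compare the times.

/* Pointer-chase a 128 MiB index array with a sequential cycle vs. a random
 * single cycle, to show how much access-pattern predictability matters once
 * you are going out to DRAM. Results vary by machine; illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N ((size_t)16 * 1024 * 1024)   /* 16M entries = 128 MiB, well past L2 */

static unsigned long long rng_state = 0x9e3779b97f4a7c15ULL;

/* Tiny xorshift PRNG; avoids rand()'s small RAND_MAX on some platforms. */
static size_t rnd(size_t bound)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return (size_t)(rng_state % bound);
}

/* Walk the permutation once (N dependent loads) and return elapsed seconds. */
static double chase(const size_t *idx)
{
    volatile size_t sink = 0;
    clock_t t0 = clock();
    for (size_t i = 0, p = 0; i < N; i++) {
        p = idx[p];
        sink += p;
    }
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void)
{
    size_t *idx = malloc(N * sizeof *idx);
    if (!idx) return 1;

    /* Sequential cycle: the next address is always the adjacent one, so the
     * hardware prefetcher keeps the loads fed. */
    for (size_t i = 0; i < N; i++)
        idx[i] = (i + 1) % N;
    double seq = chase(idx);

    /* Random single cycle (Sattolo's algorithm): every load depends on the
     * previous one and hits an unpredictable line, so DRAM latency is exposed. */
    for (size_t i = 0; i < N; i++)
        idx[i] = i;
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = rnd(i);              /* j in [0, i-1] keeps it one big cycle */
        size_t tmp = idx[i]; idx[i] = idx[j]; idx[j] = tmp;
    }
    double ran = chase(idx);

    printf("sequential: %.2fs  random: %.2fs  (%.1fx slower)\n",
           seq, ran, ran / seq);
    free(idx);
    return 0;
}

On a typical desktop part the shuffled walk usually comes out an order of magnitude slower than the sequential one, which is the kind of gap that swamps a 150 MHz clock difference.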
 
I don't know what MS calls their latest compiler for XB1. ^_^
I would think it's at least comparable to LLVM. It should not slow the XB1 down.

The more complex OS runtime would have bigger overhead, although the PS4 OS is newly put together too.
 
Also, if the memory access pattern isn't random, so the memory doesn't have to precharge on nearly every access, GDDR5 will kick DDR3 in the a$$.
The memory interfaces for both consoles far exceed the CPU section's ability to use them, or is there something besides bandwidth you have in mind?
Most of the other parameters don't appear to be significantly different.
 
or is there something besides bandwidth you have in mind

...mostly the fact that, if the data pattern is well predictable (and maybe it also lets the CPU prefetcher kick in...), it *might* allow much more data to be processed out of L2, which might yield noticeable speed differences.

In the end, as long as your data pattern is predictable and doesn't require you to issue a precharge command often, GDDR5 is able to pump much more data than DDR3.

...about the compiler: the OS architecture doesn't matter - the frontend/backend of the compiler is 99.99% the standard VS one, since it compiles x86 code. File format and other amenities are a matter for the linker, at best.

So, for sure, MS must have the best optimizing compiler.
 
Slides 5, 6 and 10 are pretty interesting.

It is quite strange to have a 1.75GHz CPU run slower than an "equivalent" 1.6GHz one for the same code. We will need to know how the measurements are taken (e.g., is it measuring performance when another app is snapped and Kinect is running? If not, what are the numbers in those cases?).


Wait.. Jaguar is only 10% slower than an i7 at the same clock?

So, and my math here is obviously flawed, the PS4 CPU has roughly equivalent performance to an i7 quad @ 3.2GHz (subtract 10%)* when all is said and done? Provided, of course, they multithread the crap out of whatever they're doing on it.

Isn't that pretty impressive?


*8 cores running at 1.6GHz
 
...mostly the fact that, if the data pattern is well predictable (and maybe it also lets the CPU prefetcher kick in...), it *might* allow much more data to be processed out of L2, which might yield noticeable speed differences.

In the end, as long as your data pattern is predictable and doesn't require you to issue a precharge command often, GDDR5 is able to pump much more data than DDR3.
This doesn't expand the bandwidth of the northbridge connection to the CPU clusters. Both the GDDR5 and DDR3 interfaces have bandwidth far in excess of what the link to the CPU modules can transfer. Only the GPU is able to manage it.
If that bandwidth were the deciding factor in this case, the Xbox One should have an edge.

...about the compiler: the OS architecture doesn't matter
I didn't say it mattered to the compiler.
 
Wait.. Jaguar is only 10% slower than an i7 at the same clock?

So, and my math here is obviously flawed, the PS4 CPU has roughly equivalent performance to an i7 quad @ 3.2GHz (subtract 10%)* when all is said and done? Provided, of course, they multithread the crap out of whatever they're doing on it.

Isn't that pretty impressive?


*8 cores running at 1.6GHz

No. At that particular task/code (audio?), maybe. Current estimates would place the console Jaguars around the performance of an i3 (dual core) at 3.2GHz, which would make an Ivy Bridge core approximately 2x faster on a clock-for-clock basis. This of course depends on what you're doing... the gap could be significantly smaller or larger than 2x depending on the task, as you can see from the Substance Engine chart, though we don't know the stats of the i7 they tested with.
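As a rough back-of-envelope under that ~2x assumption: 8 Jaguar cores x 1.6GHz is 12.8 GHz-cores of Jaguar throughput; at half of Ivy Bridge's per-clock throughput that is about 6.4 GHz-cores of IVB, i.e. roughly a dual core at 3.2GHz, ignoring Hyper-Threading and assuming the workload actually scales across all eight threads.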
 
Wait.. Jaguar is only 10% slower than an i7 at the same clock?

So, and my math here is obviously flawed, the PS4 CPU has roughly equivalent performance to an i7 quad @ 3.2GHz (subtract 10%)* when all is said and done? Provided, of course, they multithread the crap out of whatever they're doing on it.

Isn't that pretty impressive?


*8 cores running at 1.6GHz

If you want some initial hard evidence of how i7s compare with Jaguars, then I'd suggest you go check out some A4-5000 benchmarks. That's a 4-core Jaguar at 1.5GHz, and it's generally competitive with dual-core ULV Ivy Bridges at similar clock speeds.
 
This doesn't expand the bandwidth of the northbridge connection to the CPU clusters. Both the GDDR5 and DDR3 interfaces have bandwidth far in excess of what the link to the CPU modules can transfer.
...hmmm officially? I don't recall anything officially said by Sony about the link between "NB" and the CPU.

My point about the compiler was to rule the compiler out of the equation.
MS's compiler optimizations are surely better than anything else, just because... well, they've been doing it for 20 years and they did learn how to optimize C code for x86 pretty well by now.

Imho the only differences you may have are:

1) PS4 CPU clock >> XBone clock: very unlikely, due to the TDP cost of going beyond 1.75GHz.
2) XBone hypervisor/OS remapping cores/threads like on Intel CPUs (not respecting the 4+4 clustering of L2).
3) XBone hypervisor kicking in many times due to some strange memory allocation scheme; unlikely... hopefully for them.
4) PS4 CPU can take advantage of GDDR5's superior speed.
 
...hmmm officially? I don't recall anything officially said by Sony about the link between "NB" and the CPU.
This depends on whether you consider the VGLeaks diagrams showing a link to the CPUs at around 20 GB/s to be legitimate. Some off-hand comments by Sony indicate they consider it accurate.

Other indicators are the bandwidth for the Onion/Onion+ bus, which is 20 GB/s.
Durango's coherent link to the GPU is also equivalent to the memory bandwidth they've given for the northbridge that the CPU accesses go through.
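For rough scale, using the commonly cited figures: the PS4's GDDR5 interface is around 176 GB/s and the XB1's DDR3 around 68 GB/s, so even the slower memory pool offers more than three times what a ~20 GB/s CPU-side link can actually move; the extra raw GDDR5 bandwidth is out of the CPU clusters' reach either way.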


My point about the compiler was to rule the compiler out of the equation.

MS's compiler optimizations are surely better than anything else, just because... well, they've been doing it for 20 years and they did learn how to optimize C code for x86 pretty well by now.
The compiler optimizes as well as it can with the flags that were selected, not that we know what they are or if they are equivalent for both platforms.
Code generation does improve after the introduction of a new microarchitecture, and an early period of performance growth when the architecture is profiled in earnest has been routine for new x86 chips on the desktop. The console APUs haven't physically existed in their final form for all that long.

The rumors were that there were some bugs being worked out and some performance regressions a little later in the process than expected, so I wouldn't expect either platform's code generation is going to be where it will be when the tools mature.
 
There is a question of how old this data is. Apparently, the Substance Engine has been available for the Xbox One since May 22, which could mean early dev kits were involved. And the engine has been available on the PS4 for a longer time than on the XB1.

http://www.awn.com/news/allegorithmic-s-substance-engine-now-available-xbox-one
The data is either recent or still relevant today. The multiplatform dev at neoGAF corroborates the article, saying "you can get more out of the PS4 CPU".
 