PS4 & XBone emulated?

But the CPU and GPU components can work together very efficiently. This is a new paradigm to some extent, but similar to what happened with the more advanced titles last gen. The total system efficiency is what counts here, and the true bottlenecks have traditionally been manipulating large data streams.

It's a bit blunt, but couldn't you say that as soon as your calculations, whatever they are, are bottlenecked on the CPU, they are likely to benefit from massive parallelisation? I'm sure that you'll find exceptions, but maybe not as many?

And especially in these times, where the CPU has been an unreliable, inefficiently used quantity on PC compared to the GPU (whether through GPU driver overhead, comparatively high-latency two-way communication, separate memory pools, etc.), it seems to me that the resources are more likely to be used in the GPU context than the CPU context in the first place.

I'd love to hear about a real-life bottleneck.
 
I think this statement lacks some key details; I seriously doubt single-threaded code could run faster on a 1.75GHz core than on a 3.2GHz core. Obviously the code is limited elsewhere here, possibly it is not as hand-tuned as it could be.

The point then was that their existing production-optimized X360 code was easily running faster on Jaguar per thread, so straight ports of components could be made without additional threading.
However, it could have something to do with this: http://beyond3d.com/showpost.php?p=1601066&postcount=35 i.e. that the Xenon is essentially behaving like a 1.6GHz in-order 6-core - which should also be quite a bit slower than the "6-core" XB1.

Can't find the numbers now, but IIRC the 2GHz Kabini is running pretty much like a 2GHz Conroe quad.
 
Some people here are making the mistake of thinking the X360 CPU was fast. It was trash - absolute utter garbage. The new console CPUs will run rings around it, just as much as typical desktop CPUs will run rings around them.
 
Bearing in mind that any PC games that may be ported to the new consoles would likely be current-gen console games at heart, I can't see them giving the new console CPUs much bother. The code may need to be re-written to take better advantage of multiple threads, but if it was possible to run it on Xenon/Cell it would certainly be possible to run it on the Jaguars. Enhancements of PC versions over current-gen console versions usually focus on the GPU side rather than the CPU.
 
Can't find the numbers now, but IIRC the 2GHz Kabini is running pretty much like a 2GHz Conroe quad.

Well, the consoles run at 1.6 and 1.75GHz, but it should be close to a Conroe at a similar clock, I guess. That's hardly good enough for a PC CPU anymore, especially because Conroe was running at a higher clock most of the time.

Just for fun I compared my crippled (E2140) Conroe at the same clock as Kabini on single-threaded Cinebench 11.5 (perhaps not the best tool, but it's what I have) and...
1.5GHz Kabini = 0.39 points
1.5GHz E2140 = 0.40 points

Keep in mind the E2140 was downclocked a little and had less than half the memory speed.

But yes, Jaguar single-core performance is way too low for a gaming PC, but... it's being used in a closed platform, with 8 of these cores... so it's a different story.
 
The general trend in game development is to keep integer, memory-intensive workloads and complex serial code on the CPU side and shift all/most of the parallel SIMD/FP parts to the GPU, including pure compute tasks. Address-space unification will further reinforce this model.
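As a rough sketch of what that split looks like in practice (a hypothetical example in plain C++, not from any shipping engine): the branchy, serial logic below is the kind of thing that stays on the CPU, while the wide, branch-free FP loop is the kind of kernel you would move to GPU compute, especially once CPU and GPU share one address space and no copy is needed.

```cpp
#include <vector>

struct Particle { float x, y, z, vx, vy, vz; };

// Stays on the CPU: branchy, serial, integer-heavy game logic.
int update_ai_state(int state, int stimulus)
{
    switch (state)
    {
        case 0:  return stimulus > 3 ? 1 : 0;      // idle -> alert
        case 1:  return stimulus > 7 ? 2 : 1;      // alert -> attack
        default: return stimulus < 2 ? 0 : state;  // calm down eventually
    }
}

// GPU candidate: wide, branch-free FP work over a big buffer. Written as a
// plain CPU loop here purely for illustration; on PS4/XB1 this is the sort
// of kernel that moves to a compute shader.
void integrate_particles(std::vector<Particle>& particles, float dt)
{
    for (Particle& p : particles)
    {
        p.vy -= 9.81f * dt;   // gravity
        p.x  += p.vx * dt;
        p.y  += p.vy * dt;
        p.z  += p.vz * dt;
    }
}
```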
 
X360 has 3 cores with 6 threads... its OS was much less of a performance hog than the XO's. All of that, and we didn't touch on the PS3's Cell of course.

Besides, merely matching an 8-year-old CPU is not that big of a deal at all; once again developers will focus more on the rendering side than the simulation side, and tradeoffs will be made. We expect next-gen titles to have complex mechanics, bigger physics simulations, more stuff happening on the screen at once... well, we won't be having any of that, not more than what was achieved on the PC anyway.

In fact this CPU point will be the downfall of this so-called next-gen... it turns out it is not so next-gen at all, it's just playing catch-up with 2-year-old PCs, burdened by the resolution and complex OS requirements... some don't expect it to last that long because of that.

I'm not an expert, but your statements are completely at odds with every single testimony from game developers I've seen so far.
Everyone seems to agree that the OoO functionality and much higher IPC of the Jaguar core can more than compensate for the difference in clock speeds.

Just as an example, each Jaguar core has been estimated to take almost 100M transistors. The entire Xenon (3 dual-threaded PowerPC cores) has 163M, so the eight Jaguars alone (~800M) take almost as many transistors as 5 entire Xenons. Even without taking into account that ~9 years of technological progress must have improved performance-per-transistor across all architectures, the fact that a single Jaguar core is much larger than the PowerPC cores in Xenon is a good indicator that IPC won't be anywhere near the same.

It seems that all you're doing is making wild assumptions based on clock speeds and theoretical limits alone, which isn't realistic by any means.
 
The general trend in game development is to keep integer, memory-intensive workloads and complex serial code on the CPU side and shift all/most of the parallel SIMD/FP parts to the GPU, including pure compute tasks. Address-space unification will further reinforce this model.

Wouldn't that leave the SIMD units underutilized though? Especially on PCs, whose CPUs are quite heavy in that regard these days? Or would you tend to keep more of the SIMD code on the CPU for PC development due to the latency of going over PCI-E?
 
The point then was that their existing production-optimized X360 code was easily running faster on Jaguar per thread, so straight ports of components could be made without additional threading.
However, it could have something to do with this: http://beyond3d.com/showpost.php?p=1601066&postcount=35 i.e. that the Xenon is essentially behaving like a 1.6GHz in-order 6-core - which should also be quite a bit slower than the "6-core" XB1.

Can't find the numbers now, but IIRC the 2GHz Kabini is running pretty much like a 2GHz Conroe quad.

The old claim that each of its cores is like 2×1.6GHz cores is wrong. If you look at the Cell PPE datasheet (XBox 360's CPU should be similar) you'll see that it has a few different thread interleaving modes. IIRC they only affect the front end.

But that doesn't change that the PPE's typical average IPC was very poor. It's due to a lot of factors beyond just lacking OoOE: pretty narrow/limited multi-issue capabilities, two-cycle ALU latency, poor branch prediction/high mispredict penalty, a big bubble even on predicted taken branches, high L1/L2 cache and memory latencies, lack of hardware prefetching, a large cache line size, and no store-to-load forwarding plus a big penalty on load-hit-store.
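To make that last point concrete, here's a minimal sketch (my own hypothetical illustration, not from the post above) of the classic load-hit-store case on the PPE: there's no direct move between the FP and integer register files, so getting the bits of a float into an integer register goes through memory as a store immediately followed by a load of the same address, and with no store-to-load forwarding that load stalls for dozens of cycles.

```cpp
#include <cstring>

// Hypothetical illustration: on the PPE this memcpy compiles to a store
// from the FP register file followed by a load into the integer register
// file at the same address. The load "hits" the in-flight store and, with
// no store-to-load forwarding, waits until the store is actually visible.
static inline unsigned float_bits(float f)
{
    unsigned bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Put something like this in an inner loop (e.g. building integer sort keys
// from float depth values for draw calls) and the stall dominates; the usual
// fix was to batch the conversions or keep the data in one register file.
unsigned depth_key(float depth)
{
    return float_bits(depth) >> 8;  // hypothetical key derivation
}
```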
 
I'm not an expert, but your statements are completely at odds with every single testimony from game developers I've seen so far.
Everyone seems to agree that the OoO functionality and much higher IPC of the Jaguar core can more than compensate for the difference in clock speeds.
..

It seems that all you're doing is taking wild assumptions based on clock speeds and theoretical limits alone, which isn't realistic by any means.
I see we've moved from the "better than last-gen" phase to the "can compensate" phase; that's not what we should be talking about when we mention next-gen. To truly have a tangible next-gen gaming experience, both the CPU and GPU fronts should be more advanced than before... instead what we have here is an underpowered CPU that is barely any better than last-gen, coupled with a powerful GPU. Which means we will have better graphics but not necessarily better gameplay or experience; innovation will be quite limited.

And I am not echoing my own voice here, just the words of other developers who expressed the same to Eurogamer's secret developer program, and I quote:

Both the consoles have Jaguar-based CPUs with some being reserved for the OS and others available for the game developers to use. These cores, on paper, are slower than previous console generations but they have some major advantages. The biggest is that they now support Out of Order Execution (OOE), which means that the compiler can reschedule work to happen while the CPU is waiting on an operation, like a fetch from memory...
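(As a rough, hypothetical illustration of what that buys you - my sketch, not something from the article: in a loop like the one below, an out-of-order core can keep chewing through the independent arithmetic while the pointer-chasing load is still waiting on memory, whereas an in-order core stalls at the first instruction that needs the loaded value.)

```cpp
struct Node { int value; Node* next; };

// Hypothetical example: 'node = node->next' frequently misses the cache.
// An out-of-order core can execute the independent 'weight' updates (and
// start the next loads) underneath that miss; an in-order core stops dead
// as soon as an instruction needs the not-yet-arrived value.
int weighted_sum(const Node* node, int weight)
{
    int total = 0;
    while (node)
    {
        weight = weight * 31 + 7;       // independent work, no memory access
        total += node->value * weight;  // needs the loaded node
        node = node->next;              // dependent, likely cache-missing load
    }
    return total;
}
```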


In fact, I recommend everyone read this excellent article to get a better grasp on how things will develop in this console cycle.

I'd love to hear about a real-life bottleneck.
http://www.eurogamer.net/articles/d...ware-balance-actually-means-for-game-creators
 
And I am not echoing my own voice here, just the words of other developers who expressed the same to Eurogamer's secret developer program, and I quote:

When the author says that the cores are "slower on paper", that refers to nothing more than the lower clock speed. In some sense that really would make it slower, even if it completes the same or more work in the same time frame.

The limited amount of code that actually delivers decent throughput on XBox 360 or PS3's PPE cores is generally well suited for GPGPU execution. Even more so for the PS3's SPE cores. For more general code a 1.6GHz Jaguar will easily match if not exceed the 3.2GHz PPE; the perf/MHz is really that bad. Couple that with the fact that there are over twice as many cores on XB1/PS4 than there were on XBox 360 and you get something far beyond just delivering similar CPU performance.

Consider the Wii U. It has only 3 CPU cores without SMT. These cores have inferior perf/MHz on scalar code (and much worse on vector code, especially vector integer) and a lower clock speed than PS4/XB1. They also have little leftover ALU resources to utilize for GPGPU. And yet it's usually able to achieve something close to parity with PS3 and XBox 360 games. PS4/XB1 should therefore be able to greatly exceed them in CPU-driven capability.
 
I see we've moved from the "better than last-gen" phase to the "can compensate" phase; that's not what we should be talking about when we mention next-gen. To truly have a tangible next-gen gaming experience, both the CPU and GPU fronts should be more advanced than before...

No one moved any phase.
My words were that each Jaguar core can more than compensate for the lower clock speed versus each PPE core. From all I know, 1 Jaguar core can do more than 1 PPE core.
And then the other difference is that for games there are 6 Jaguar cores versus 2 PPE cores.



(...)
instead what we have here is an underpowered CPU that is barely any better than last-gen, coupled with a powerful GPU. Which means we will have better graphics but not necessarily better gameplay or experience; innovation will be quite limited.
(...)
And I am not echoing my own voice here, just the words of other developers who expressed the same to Eurogamer's secret developer program, and I quote:

Yes, you are echoing your own voice.
To my knowledge, no developer ever claimed that the CPU is barely any better than last gen, because it's simply not true. That article says no such thing.

It's like you're stuck in the Pentium 4 era, where most people thought more MHz = more performance.
 
I wonder what would have happened if the CPU was as beefed up as the GPU on next-gen. Would the consoles then cost $1000?
 
When the author says that the cores are "slower on paper", that refers to nothing more than the lower clock speed.
I think it is more than that; the only things the author listed as advantageous for the new consoles are OoOE and some other minor architectural enhancements.
For more general code a 1.6GHz Jaguar will easily match if not exceed the 3.2GHz PPE; the perf/MHz is really that bad. Couple that with the fact that there are over twice as many cores on XB1/PS4 than there were on XBox 360 and you get something far beyond just delivering similar CPU performance.
Well, I guess actual games will be the judge of that. Right now it does seem like the majority of the next-gen flair in cross-platform games is stemming from the better GPU (i.e. the rendering side); we will see down the road whether the simulation side is catered for.

I also noticed that most points in this discussion focused only on the X360 CPU, ignoring the PS3's Cell completely, which doesn't bode well for the new consoles, unless there are other bits I am not aware of.

Consider the Wii U. It has only 3 CPU cores without SMT. These cores have inferior perf/MHz on scalar code (and much worse on vector code, especially vector integer) and a lower clock speed than PS4/XB1. They also have little leftover ALU resources to utilize for GPGPU. And yet it's usually able to achieve something close to parity with PS3 and XBox 360 games. PS4/XB1 should therefore be able to greatly exceed them in CPU-driven capability.
I really think the Wii U is a bad example; games there look and run pathetically. While some of them come close to PS3/X360, they usually lack AA/AF, scale back on textures, shadows and other features, run at a horrendous 20-ish fps, etc.

Many developers actually gave up making any games on the Wii U at all, due to its bad performance. If it was so close to the PS3/X360, that wouldn't have happened.

To my knowledge, no developer ever claimed that the CPU is barely any better than last gen, because it's simply not true.

For argument's sake, let's consider they are not really that bad compared to last-gen. However, even then, when you have a GPU that is 10 times better than last-gen while the CPU is only 2 times better, you have a CPU that is barely faster than last-gen.

That article says no such thing.

But let's not kid ourselves here - both of the new consoles are effectively matching low-power CPUs with desktop-class graphics cores.
"In this console generation it appears that the CPUs haven't kept pace... which means that we might have to make compromises again in the game design to maintain frame-rate."
first round of games will likely be trying to be graphically impressive (it is "next-gen" after all) but in some cases, this might be at the expense of game complexity. The initial difficulty is going to be using the CPU power effectively to prevent simulation frame drops and until studios actually work out how best to use these new machines, the games won't excel
 
From all I know, 1 Jaguar core can do more than 1 PPE core.
It's like you're stuck in the Pentium 4 era, where most people thought more MHz = more performance.

argh... you've beaten me to the answer... :oops:

Ah, I'd take a Jaguar over the PPE any time. And so would any PS3 developer, I guess...

The PPE was thought to be no more than a scheduler for the SPEs, in the end.
...and some game companies didn't even use it for that, as it was too slow even for that :p

Just a question - are CPUs today 10x faster, IPC clock per clock, than CPUs of 2006? As long as CPUs can keep up doing their job and offload to GPGPU, there's no need for anything more powerful.
 
I think it is more than that; the only things the author listed as advantageous for the new consoles are OoOE and some other minor architectural enhancements.

He's dumbing it down for the audience. Nowhere did he actually make an estimation of what the difference in perf/MHz is like; he just said it's different. OoOE is not at all a dominating factor compared to all those other things I listed. You can find in-order CPUs that have much better perf/MHz than the PPE (Intel's Saltwell; ARM's Cortex-A53 will be an even better example), and they're even still dual-issue. You can't ignore all the serious performance problems these CPUs have when not dealing with very data-regular code.

Well, I guess actual games will be the judge of that. Right now it does seem like the majority of the next-gen flair in cross-platform games is stemming from the better GPU (i.e. the rendering side); we will see down the road whether the simulation side is catered for.

Yes, it takes time to develop these things. I remember people saying similar things when this last gen started.

I also noticed that most points in this discussion focused only on the X360 CPU, ignoring the PS3's Cell completely, which doesn't bode well for the new consoles, unless there are other bits I am not aware of.

Because a) XBox 360 usually did as well as or better than PS3 - there aren't really showcase examples where PS3 games had this huge advantage in logic, largely because the SPEs were quite domain-specific/limited in the kind of work they could do - and b) like I said, the stuff SPEs were good at is mostly stuff GPGPU can do now. In fact, much of what the SPEs were being used for wasn't just GPGPU-friendly code but actual graphics, a lot of stuff that was even being done by the GPU on XBox 360.

I really think the Wii U is a bad example; games there look and run pathetically. While some of them come close to PS3/X360, they usually lack AA/AF, scale back on textures, shadows and other features, run at a horrendous 20-ish fps, etc.

Many developers actually gave up making any games on the Wii U at all, due to its bad performance. If it was so close to the PS3/X360, that wouldn't have happened.

Look and run pathetically? I don't know what you're reading; most games that DF analyzed look about the same (some with better textures or more post-processing) and run somewhere between the PS3 and XBox 360 versions. A few have particularly notable issues, but a couple of others run consistently better.

For argument's sake, let's consider they are not really that bad compared to last-gen. However, even then, when you have a GPU that is 10 times better than last-gen while the CPU is only 2 times better, you have a CPU that is barely faster than last-gen.

GPUs have always gotten faster at a more dramatic rate than CPUs. They also hit diminishing returns with those improvements faster; at this point you need a lot more GPU power to really translate into a much nicer-looking game. They're not really uniformly 10x faster in every way anyway.

pMax said:
Just a question - are CPUs today 10x faster, IPC clock per clock, than CPUs of 2006?

No, of course not, especially if you're starting with Conroe (which was a 2006 release) and not Prescott. And it doesn't change much if you look at peak single-threaded performance and not just IPC. The consoles really did have exceptionally bad perf/MHz, but even then I doubt you'd hit 10x that.

People groan about the consoles not being fast enough, but they couldn't have really done that much better. Intel was never an option. Maybe if they pushed a Piledriver core to the limit they could have gotten a little over 2x the single-threaded performance, with somewhat of a hit in threaded scaling. They couldn't have some monster APU with 4 PD modules and keep 12-18 GCN CUs; that would have been > 500mm^2 for sure and probably a huge hit to yield. So they'd probably need to do something with 2 modules, but at the clock speeds needed to get similar multithreaded performance they would have used a ton more power, and they're likely also pushing the limits of that right now. This is also a crappy situation if you still want to give up an entire core or even two for the OS.

Face reality, there's a reason both companies went with what they did.
 
In terms of overall ST performance, an i5 Haswell ($200) is probably something like 4x the PS4's CPU performance for current PC software, right?
 
In terms of overall ST performance, an i5 Haswell ($200) is probably something like 4x the PS4's CPU performance for current PC software, right?

Depends...

Here's a post from sebbi for reference.

http://forum.beyond3d.com/showthread.php?p=1759251#post1759251

You could say that a dual core 1.6 GHz Ivy Bridge (turbo disabled) equals a quad core Jaguar (Kabini) at the same clocks, and a quad core 1.6 GHz Ivy would likely equal an eight core (double Jaguar CU) Kabini model at the same clocks. Dual core 3.2 GHz Ivy would roughly match a quad core 1.6 GHz Ivy when theoretical throughput is considered. However, doubled core clocks increase the memory latency from the CPU's point of view. The CPU must wait twice the clock cycles to get the data from memory, assuming memory speed stays intact. Because of this, the 3.2 GHz model wouldn't provide linear gains over the 1.6 GHz model. But neither does a quad core provide linear gains over a dual core (multithreaded scaling is never 100% efficient).

Clock for clock you could say a Haswell core is just over 2x faster than a Jaguar core, theoretically speaking. So a quad-core Haswell at double the frequency would be about 2.5x faster than the 8-core Jaguars at 1.6GHz...?
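Back-of-envelope, under those assumptions (hypothetical round numbers): 4 Haswell cores × 2x the clock × roughly 2x per clock gives about 16 "Jaguar-core-equivalents" of theoretical throughput, versus 8 for the console, i.e. roughly 2x. The memory-latency and scaling caveats from the quote above nudge that figure around, so something in the 2-2.5x region looks like a ballpark rather than anything exact.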

In terms of emulation, raw performance is there when considering each part separately, but like others have said, HSA features and tight CPU/GPU integration could be tough to emulate.
 
Clearly the GPU part was a priority in both designs, and a single-chip solution too. That left a rather shallow option list for the CPU choice, where Jaguar was virtually the only suitable candidate: small, tightly integrated, with modern/popular ISA support and an OoO pipeline (mostly because of the boost it gives to load/store ops).
 