XBox 360 Emulators for the PC

Itanium is an odd one to bring up in this context as it was a design driven as much by business decisions as anything else. Intel was never keen on extending x86 to 64 bits; they wanted to move commodity servers to Itanium and restore their hegemony there. AMD extended x86, and commodity server users refused to pay the massive costs of refactoring their software (or completely redesigning it, in most cases) for VLIW and Itanium. Intel caved, adopted AMD64 (yay cross-licensing agreements) and competed the right way by making a better x86-64 architecture.

At that point Itanium was an expensive, bespoke architecture that Intel wasn't willing to spend billions on anymore, as most customers for expensive, bespoke server architectures already had a huge investment in their rivals' tech and little appetite for the s/w engineering required to move. Intel's obvious disappointment with the Itanium project led to rumours of its early demise, which either became self-fulfilling when customers refused to invest in a 'risky' platform or were simply true, depending on how you look at it.

I brought up Itanium because there were well-known stories of demonstrations showing how much raw parallelism would be available. They'd bring out tiny snippets of hand-written code and say that this was the potential compilers could achieve, while those who knew better had serious reservations.

64-bit was a minor component of Itanium compared to everything else that was different about the uarch. Really, they were banking on being able to go really wide efficiently by having in-order, exposed-state VLIW with a lot of software-based speculation. It all relied on a lot of sophistication in the compiler that didn't materialize, and underestimated how useful some things were.
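To make the compiler expectation concrete, here's a rough C sketch (my own illustration, not actual Itanium code) of the kind of if-conversion those compilers were supposed to perform automatically:

```c
/* Branchy version: the data-dependent branch stalls a wide in-order
 * machine on every misprediction. */
int sum_positive(const int *a, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (a[i] > 0)
            sum += a[i];
    }
    return sum;
}

/* Predicated version: the branch becomes a select computed as data,
 * so every iteration is straight-line work the compiler can schedule
 * into wide instruction bundles. */
int sum_positive_predicated(const int *a, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += (a[i] > 0) ? a[i] : 0;   /* select, not branch */
    return sum;
}
```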

It's a CPU where the designers had expectations for the software that weren't met. Not sure what else you should read into the comparison.

OT I thought the Power cores in both designs were fairly similar bar the differing approaches to vector units (1core + SPUs versus 3 cores w/AltiVec)?

They were, hence why I said Microsoft poached the Cell PPE.

They both had AltiVec anyway, Xenon just had more registers and a little more in the way of execution resources and special instructions.
 
IPC is not a measure of bandwidth. If a system is primarily bandwidth constrained, that restriction would be considered separately, even if it is technically true that the chip will stall if it has a lot of memory traffic pending.


The common case is that it does--until it doesn't. There have to be additional non-dependent instructions available for the chip to execute out of order. This isn't always the case.
Retranslation, recompilation, or refactoring has to happen in cases where the CPU's finite reordering capabilities do not help, or where the problem itself doesn't provide enough extra work to reorder.
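A minimal C illustration of both cases (my own sketch, not from any particular codebase):

```c
#include <stddef.h>

struct node { struct node *next; int value; };

/* Serial dependency: each load needs the previous one's result, so
 * no reordering window, however large, can overlap them. IPC falls
 * toward one instruction per memory latency. */
int chase(const struct node *p) {
    int sum = 0;
    while (p) {
        sum += p->value;
        p = p->next;          /* the chain the hardware can't break */
    }
    return sum;
}

/* Independent work: the loads don't depend on each other, so an OoO
 * core can keep many of them in flight at once. */
int sum_array(const int *a, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];          /* next load can start immediately */
    return sum;
}
```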


IPC in this case is a measured value; you don't get a true reading until the chip exists to be measured.
It was not, however, that much of a surprise.
IBM, at least, would have seen it coming to a reasonable degree of accuracy.
Thanks for the explanations, your posts are always very interesting 3dilettante. :smile2: Why isn't it a surprise? IBM created an inferior CPU in comparison to what was expected, or do CPUs have a low IPC in general when they are created?

I mean, my PC has an Intel Core i5-2450M CPU, and maybe it has a lower IPC than people expect when they are solely focusing on the potential numbers. Is that a common occurrence?

Additionally, don't you believe that the low IPC of the X360's CPU, and it being an in-order versus an out-of-order CPU, would ease the creation of a running emulator and smooth the development process?
 
No CPU "has" any IPC. It has some particular average IPC while performing some particular task. All CPUs are better at some tasks than others. PS3 and XBox360 had CPUs that were, for their time, very good at code that could be expressed a certain way. Sony expected that to fit a lot of game code, and IBM made what Sony wanted. That's really all there is to it.
 
Thanks for the explanations, your posts are always very interesting 3dilettante. :smile2: Why isn't it a surprise? IBM created an inferior CPU in comparison to what was expected, or do CPUs have a low IPC in general when they are created?
The Xenon and PPE cores were focused on being small, clocking high, having low complexity, and providing high vector throughput.
All the things designers use to raise IPC (wider issue, enhancements to load/store, branch prediction, OoOE, an optimized memory hierarchy, and other features that increase complexity and work per cycle) were either removed or severely compromised.

There's a massive list of things that can be done to raise IPC, and due to design specifications and/or a lack of money/time, pretty much none of them were done.

IBM has been designing high-performance cores for decades; it would have been trivial for them to understand how low IPC would get.
The design deliberately sacrificed IPC, and it was cheap. The intent was to count on clocks, SMT, and multicore scaling to make up for that.
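As a rough example of what that pushes onto software (a generic C sketch, not actual Xenon code): on a long-pipeline in-order core, a naive reduction stalls on every dependent add, so the compiler or programmer has to unroll with multiple accumulators by hand.

```c
#include <stddef.h>

/* Naive version: every add waits on the previous one, so a long
 * in-order pipeline mostly sits idle. */
float dot_naive(const float *a, const float *b, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* Hand-scheduled version: four independent accumulator chains let
 * an in-order core start a new multiply-add while earlier ones are
 * still in the pipeline -- the work OoO hardware would otherwise do. */
float dot_unrolled(const float *a, const float *b, size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i + 0] * b[i + 0];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    for (; i < n; i++)
        s0 += a[i] * b[i];
    return (s0 + s1) + (s2 + s3);
}
```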

Additionally, don't you believe that the low IPC of the X360's CPU and an in-order vs and out-of-order CPU would ease the creation of a running emulator and smooth the development process?
In some instances, no.
Low IPC cores don't have major drops in performance on code that does not have many parallel instructions. Raw clock speed can be dominant there.
High IPC cores do drop in performance, and can be inferior if they are clocked slower.
Jaguar is not a particularly high IPC architecture, so it's not enough to just not suck as much as Xenon.

In the vast majority of cases, a more rounded architecture will do better, but there are a few places where the fragile architecture excelled, where straight-line speed or peak vector performance matters.
Optimizations that work around weaknesses in Xenon can bloat the code, and there are per-instruction costs to various forms of emulation. There are optimizations in modern OoO cores like loop buffers that can be broken by code optimized for long pipelines and terrible branch prediction.
Because emulation has overhead, it can also be a matter of just not having enough of a performance lead, depending on the threshold of "good enough" for the emulation.
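To see where the per-instruction cost comes from, here's a deliberately simplified interpreter loop in C (hypothetical encoding and state, not any real emulator's code): every guest instruction costs a fetch, decode, and dispatch on the host before any real work happens.

```c
#include <stdint.h>

typedef struct {
    uint64_t gpr[32];   /* guest general-purpose registers */
    uint64_t pc;        /* guest program counter (word index here) */
} GuestCpu;

enum { OP_ADD = 0, OP_HALT = 1 };   /* made-up opcodes */

void interpret(GuestCpu *cpu, const uint32_t *mem) {
    for (;;) {
        uint32_t insn = mem[cpu->pc++];     /* fetch */
        uint32_t op = insn >> 26;           /* decode */
        uint32_t rd = (insn >> 21) & 31;
        uint32_t ra = (insn >> 16) & 31;
        uint32_t rb = (insn >> 11) & 31;
        switch (op) {                       /* dispatch */
        case OP_ADD:
            cpu->gpr[rd] = cpu->gpr[ra] + cpu->gpr[rb];
            break;
        case OP_HALT:
            return;
        }
    }
}
```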
 
I brought up Itanium because there were well-known stories of demonstrations showing how much raw parallelism would be available. They'd bring out tiny snippets of hand-written code and say that this was the potential compilers could achieve, while those who knew better had serious reservations.

64-bit was a minor component of Itanium compared to everything else that was different about the uarch. Really, they were banking on being able to go really wide efficiently by having in-order, exposed-state VLIW with a lot of software-based speculation. It all relied on a lot of sophistication in the compiler that didn't materialize, and underestimated how useful some things were.

It's a CPU where the designers had expectations for the software that weren't met. Not sure what else you should read into the comparison.



They were, hence why I said Microsoft poached the Cell PPE.

They both had AltiVec anyway, Xenon just had more registers and a little more in the way of execution resources and special instructions.


Ahhh I see, I didn't get the drift of your point the first time around; thanks for the clarification. I guess I always figured the business factors were more important in the eventual death of the Itanium line and hadn't considered the technical challenges to have been as important. I always admired the elegance of the VLIW concept from what I read on it when Itanium launched. Did we just underestimate how many workloads were inherently serial (and thus a poor match for VLIW), or were the hopes for compilers to 'auto-magically' create highly parallel code just unrealistic?

Back OT, I agree that IPC is not a very useful way to understand the complexity of modelling the PowerPC cores of the Xbox 360 on the x86 cores of the XB1, especially as PowerPC is an example of a RISC CPU architecture versus the very CISC x86 architecture (even though almost every modern x86 decodes instructions into RISC-like micro-ops internally). Is it even possible to compare IPC across different architectures in a useful manner?
 
I always admired the elegance of the VLIW concept from what I read on it when Itanium launched. Did we just underestimate how many workloads were inherently serial (and thus a poor match for VLIW), or were the hopes for compilers to 'auto-magically' create highly parallel code just unrealistic?
It's not just inherently serial code, there's also dynamic and run-time behaviors that can be intractable for compile-time analysis, or become uneconomical because extra bookkeeping and code bloat are their own performance costs.

There was likely a general overestimation of how much optimization would or could be done, particularly after multiple implementation misfires and delays shrank its market to being effectively an HP-specific architecture.
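A concrete example of the intractable-at-compile-time problem (my own sketch): possible pointer aliasing forces a static scheduler to keep a loop serial unless the programmer supplies a guarantee the compiler can't derive itself.

```c
/* Without knowing whether dst and src overlap, a static (VLIW-style)
 * scheduler has to assume each store may feed a later load, so it
 * can't reorder or widen the loop; the answer only exists at run time. */
void scale(float *dst, const float *src, int n, float k) {
    for (int i = 0; i < n; i++)
        dst[i] = src[i] * k;    /* possible aliasing: keep in order */
}

/* `restrict` hands the compiler the no-overlap promise it cannot
 * prove by itself, making the iterations provably independent. */
void scale_restrict(float *restrict dst, const float *restrict src,
                    int n, float k) {
    for (int i = 0; i < n; i++)
        dst[i] = src[i] * k;    /* now free to unroll and go wide */
}
```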

Is it even possible to compare the IPC across different architectures in a useful manner?
Comparing IPC across architectures is almost invariably a matter of extreme debate when the "I" portion is non-equivalent.
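For instance, the same source statement is a different number of "instructions" on different ISAs, so the numerator itself isn't comparable (the codegen noted below is approximate and varies by compiler and flags):

```c
/* Roughly:
 *   x86:      add eax, [rdi]     -> 1 instruction (load + add fused)
 *   PowerPC:  lwz r4,0(r3)       -> 2 instructions
 *             add r5,r5,r4
 * Same work either way, so "instructions per clock" measures
 * different things on each architecture. */
int accumulate(const int *p, int acc) {
    return acc + *p;
}
```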
 
I prefer to talk about perf/MHz over IPC, although ultimately that means nothing without comparing MHz too. I also think that the ISA advantage in x86 can be very exaggerated, when balanced against its disadvantages as well as actual execution resources.

It's true the CPUs in XB360 and PS3 were never going to get very high IPC even on the most ideal code, but could still get very high overall performance on some code with a combination of powerful instructions (real four-wide FMADD was a lot vs PCs at the time, at least) and high clock speeds.
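For reference, a sketch of that four-wide FMADD via the AltiVec/VMX intrinsics (this is my illustration, not console code; it needs a PowerPC compiler with -maltivec):

```c
#include <altivec.h>

/* One instruction computes a*b+c across four floats. SSE on PCs of
 * that era needed a separate multiply and add, with no fusion. */
vector float fmadd4(vector float a, vector float b, vector float c) {
    return vec_madd(a, b, c);   /* (a0*b0+c0, ..., a3*b3+c3) */
}
```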
 
I prefer to talk about perf/MHz over IPC, although ultimately that means nothing without comparing MHz too. I also think that the ISA advantage in x86 can be very exaggerated, when balanced against its disadvantages as well as actual execution resources.

It's true the CPUs in XB360 and PS3 were never going to get very high IPC even on the most ideal code, but could still get very high overall performance on some code with a combination of powerful instructions (real four-wide FMADD was a lot vs PCs at the time, at least) and high clock speeds.

Question is, in retrospect, were the Xenon and Cell successful CPUs? Looking back upon what the PS3 and 360 pulled off, I would say they were sufficiently capable, but the adoption of readily available tech for the PS4 and XBone obviously means that Sony and MS made a mistake in focusing on capability instead of programmability/ease of use in their previous consoles.

Kind of ironic that both new platforms are still vector-focused in the form of GPGPU, especially the PS4. Perhaps the ballgame really hasn't changed all that much, but getting in the door (making code run) has gotten easier....

I'm pretty impressed that games like BF3 and Far Cry 3 run on the 360 and PS3. I was playing FC3 today on my PC, and even with mostly minimum details (DX11, ultra geometry, low everything else) the game looks quite nice, and I would assume both consoles were rendering similar if not better overall visuals.
 
Question is, in retrospect, were the Xenon and Cell successful CPUs? Looking back upon what the PS3 and 360 pulled off, I would say they were sufficiently capable, but the adoption of readily available tech for the PS4 and XBone obviously means that Sony and MS made a mistake in focusing on capability instead of programmability/ease of use in their previous consoles.

Kind of ironic that both new platforms are still vector-focused in the form of GPGPU, especially the PS4. Perhaps the ballgame really hasn't changed all that much, but getting in the door (making code run) has gotten easier....

I'm pretty impressed that games like BF3 and Far Cry 3 run on the 360 and PS3. I was playing FC3 today on my PC, and even with mostly minimum details (DX11, ultra geometry, low everything else) the game looks quite nice, and I would assume both consoles were rendering similar if not better overall visuals.
I've played/seen the best of the best games in the current generation, and my eyes don't bleed when I go back and play an X360 game.

Of course the difference is staggering at times, but those old games still look fine to my eyes.

On another note, do you people think that this confirms that they are working hard on achieving Xbox 360 emulation on the Xbox One? :smile2:

http://www.linkedin.com/pub/bharath-mysore/5/132/563

Software Development Engineer in Test
Microsoft
Public Company; 10,001+ employees; MSFT

March 2012 – March 2013 (1 year 1 month) Redmond WA
Make 360 games run on Xbox One console
 
The emulator has improved a lot, and now it can run the game in the video at almost 100% speed. The name of the game in the video is A-Train HX. The emulator could run the game at full speed, but it can't yet due to API overhead. Things will improve a lot when Vulkan or DirectX 12 come out.

 
Is it really that the emulator has improved, or that the PCs running it have improved?
 
Is it really that the emulator has improved, or that the PCs running it have improved?
I think it's the first option, because, slowly but surely, it has improved. It could be the second option, but taking into account that the emulator is a recompiler and that, in order for the framerate to be playable, it needs something like ~1-10x the power of an Xbox 360, I think the progress is due to optimisation rather than more powerful hardware running it.
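Roughly speaking (my own toy C sketch, not this emulator's actual design), a recompiler translates guest code into host code a block at a time and caches the result, so hot code pays the translation cost once:

```c
#include <stdint.h>
#include <stddef.h>

typedef void (*HostBlock)(void *cpu_state);

static void dummy_block(void *cpu_state) { (void)cpu_state; }

/* Stub standing in for the expensive translation pass. */
static HostBlock translate_block(uint32_t guest_pc) {
    (void)guest_pc;
    return dummy_block;
}

#define CACHE_SLOTS 4096

static struct { uint32_t guest_pc; HostBlock code; } cache[CACHE_SLOTS];

/* The remaining overhead is lookups, guest-state bookkeeping, and
 * the occasional translation on a miss. */
HostBlock get_block(uint32_t guest_pc) {
    size_t slot = (guest_pc >> 2) % CACHE_SLOTS;   /* direct-mapped */
    if (cache[slot].code == NULL || cache[slot].guest_pc != guest_pc) {
        cache[slot].guest_pc = guest_pc;
        cache[slot].code = translate_block(guest_pc);
    }
    return cache[slot].code;
}
```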

What I wonder though is why those two games are the only ones being emulated... Shouldn't the entire Xbox 360 library run on the emulator once you run a single game?
 
In the Xbox and PS2/PS1 eras, lots of devs used weird stuff in their engines. Thus, sometimes the emulator developers need to do specific patches/work for specific games.

Maybe the same thing is happening here.
 
Is it really that the emulator has improved, or that the PCs running it have improved?

First post in this thread is exactly one year old.
Not much has improved dramatically in the PC platform during the last year, except for Maxwell's perf-per-watt.
 
First post in this thread is exactly one year old.
Not much has improved dramatically in the PC platform during the last year, except for Maxwell's perf-per-watt.

Ah, I must have been thinking this was the 4 to 6 year old emulator by a similar name. Yeah, there hasn't been enough advancement in PC components for a giant leap.
 
What I wonder though is why those two games are the only ones being emulated... Shouldn't the entire Xbox 360 library run on the emulator once you run a single game?

Different games exercise different parts of the hardware (that may not yet be emulated completely or at all), are subject to different bugs in the emulator, and will have different performance. And if the emulator uses HLE it'll depend on what particular libraries the game uses.

For example, look at DS emulation. A small percentage of games don't use the 3D hardware at all, so you could make a very incomplete emulator that still runs these games perfectly but is missing a ton of graphics for everything else.
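To illustrate the HLE point (purely hypothetical addresses and handlers, not any real emulator's tables): known library entry points get intercepted and replaced with native implementations, so coverage depends on which library functions each game actually calls.

```c
#include <stdint.h>
#include <stddef.h>

typedef void (*HleHandler)(void *guest_state);

static void hle_memcpy(void *guest_state)  { (void)guest_state; /* native fast path */ }
static void hle_present(void *guest_state) { (void)guest_state; /* hand frame to host GPU */ }

/* Hypothetical guest addresses for known library entry points. */
static const struct { uint32_t guest_addr; HleHandler fn; } hle_table[] = {
    { 0x82001000u, hle_memcpy  },
    { 0x82004200u, hle_present },
};

/* A game calling a library version (or function) not in the table
 * falls back to slow instruction-level emulation, or doesn't run. */
HleHandler lookup_hle(uint32_t guest_addr) {
    for (size_t i = 0; i < sizeof hle_table / sizeof hle_table[0]; i++)
        if (hle_table[i].guest_addr == guest_addr)
            return hle_table[i].fn;
    return NULL;
}
```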
 
Different games exercise different parts of the hardware (that may not yet be emulated completely or at all), are subject to different bugs in the emulator, and will have different performance. And if the emulator uses HLE it'll depend on what particular libraries the game uses.

For example, look at DS emulation. A small percentage of games don't use the 3D hardware at all, so you could make a very incomplete emulator that still runs these games perfectly but is missing a ton of graphics for everything else.

Don't forget that many games also take advantage of and at times rely on bugs (undocumented "features") that exist in the console's implementation of the hardware.

Those can cause significant issues in getting a game to be emulated correctly (since you have to emulate the hardware bug/feature) or efficiently.

Regards,
SB
 
Don't forget that many games also take advantage of and at times rely on bugs (undocumented "features") that exist in the console's implementation of the hardware.

Maybe, but that probably wouldn't be the first order cause for all but a tiny number of games failing to work.
 