Backwards Compatibility: poll

Which BC method?

Total voters: 28

TheChefO

Banned
If we assume 28nm production and a die size budget for nextgen at 500mm2 (roughly the same as ps3/xb360), then incorporating the original ps3/xb360 dies in the budget would consume ~25% of the die (~120mm2). If we assume this translates to an equivalent ratio of "power", then the question is: would you, as a consumer, accept a worst-case scenario of 25% less "power" for full BC?

If Sony/MS can manage to emulate the GPU* of their existing consoles while including the existing redundant CPU die, the performance hit drops to 12.5%.

Or, if they can scale the existing architecture retrofitted for ps4/xb720, (cell x2, xenon x2) and emulate the existing GPUs, then the performance hit is 0%. Assuming there isn't a magical new CPU power out there that has a substantial lead on cell or xenon performance per die size.



option 1) Full hardware BC via Full redundant hardware (25% performance hit)

option 2) Hardware BC via Partial redundant hardware and GPU emulation (12.5% performance hit)

option 3) Hardware BC via Scaled/derivative hardware and GPU emulation (0% performance hit)

option 4) What's BC good for? Emulate what you can and resell the rest! ARM FTW!


*Of course, this is assuming the GPU can be reasonably emulated on both consoles.

As a consumer, which would you prefer?
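For reference, the percentages in the options are just ratios of die area. A quick sketch of that arithmetic (all areas are the rough assumptions from this post, not measured figures):

```python
# Sketch of the die-budget arithmetic above. All areas are the rough
# assumptions from this post, not measured figures.
BUDGET_MM2 = 500.0        # assumed next-gen die budget at 28nm
LEGACY_BOTH_MM2 = 120.0   # old CPU + GPU dies carried over (~25% of budget)
LEGACY_CPU_MM2 = 60.0     # CPU die only, if the GPU is emulated instead

def perf_hit(redundant_mm2: float, budget: float = BUDGET_MM2) -> float:
    """Fraction of the die budget spent on redundant BC hardware."""
    return redundant_mm2 / budget

print(f"Option 1, full redundant hardware:  {perf_hit(LEGACY_BOTH_MM2):.1%}")  # 24.0%
print(f"Option 2, CPU only + GPU emulation: {perf_hit(LEGACY_CPU_MM2):.1%}")   # 12.0%
print(f"Option 3, scaled architecture:      {perf_hit(0.0):.1%}")              # 0.0%
```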
 
Or, if they can scale the existing architecture retrofitted for ps4/xb720, (cell x2, xenon x2) and emulate the existing GPUs, then the performance hit is 0%.

Or that could mean a 50% hit. Or more. You can no doubt scale old hardware, but how far, and what are you comparing it to? And what do you really mean by 'scale' - would Sandybridge count as a 'scaled' 8086?

Assuming there isn't a magical new CPU power out there that has a substantial lead on cell or xenon performance per die size.

If newer designs couldn't make better use of the silicon, then presumably there wouldn't be new designs, surely?

I'm not really sure what your poll is asking either - I can't see an actual question. Are you asking which is most likely?
 
Or that could mean a 50% hit. Or more.

I suppose, but I have yet to see such a jump in performance per die size. Especially over the already very efficient designs in Cell and Xenon.

Add OoOE? Sure, assuming the performance matches the cost in die size.

Switch to ARM? Sure, bring up some links which show a performance/die size advantage on a scale relevant to Cell/Xenon...

You can no doubt scale old hardware, but how far, and what are you comparing it to? And what do you really mean by 'scale' - would Sandybridge count as a 'scaled' 8086?

Scale meaning the architecture is derivative, so as not to break compatibility with the existing designs. If the code worked for Cell/Xenon, it should also work for the next boxes.

If newer designs couldn't make better use of the silicon, then presumably there wouldn't be new designs, surely?

I'm sure there are a few new instructions which could be added to extend the functionality of the existing Cell/Xenon CPUs, and perhaps even OoOE, but surely a complete architecture change wouldn't be needed to add such functionality to the existing designs.

I'm not really sure what your poll is asking either - I can't see an actual question. Are you asking which is most likely?

As a consumer, which would you prefer? Assuming that die size (and transistor count) is directly related to performance (which this gen has proven remarkably accurate).
 
I'd like full backwards compatibility.

It doesn't make sense since I rarely play my back catalog of games, but knowing it's there is strangely comforting.

I still own a modded Saturn (for Jap/US use), 2 DCs (Jap and US), a GameCube, and an XBOX, in addition to a hardware-BC PS3 and an X360 Slim. I haven't turned on any of them besides my PS3 and X360 in the last year.
 
How about forget BC (or partial BC through emulation) and forget ARM as well. Just build the best box you can going forward, if you can get some BC out of that great, if not it's no big loss.
 
How about forget BC (or partial BC through emulation) and forget ARM as well. Just build the best box you can going forward, if you can get some BC out of that great, if not it's no big loss.

Then you would be voting for option 4. I just threw ARM in there as it seems to be the only logical alternative to IBM's Power architecture which would be compatible with Cell or Xenon. Intel isn't an option, as they won't sell their IP, which is necessary for Sony or MS to recoup costs late in life, and AMD honestly doesn't have a competitive architecture design to warrant switching.

Who knows, maybe the ol' Alpha chip could make a comeback! :smile:
 
Your options aren't realistic. There's no such thing as zero cost for BC. Scalable hardware means an architecture that differs from what could be done with a clean-slate design, which is always better to some degree, whether 1%, 5%, or 50% less efficiency. So of course people are going to vote for that hypothetical option, but it's a valueless piece of information. May as well add (which'll get 100% of the votes despite the fact that it'll never happen):

5) Full hardware BC via Full redundant hardware in addition to the rest of the system, with the console company taking a hit on profits for zero loss or extra cost for the consumer

You've also excluded the option of a BC 'adaptor', which is an important consideration if sincerely evaluating what the market desires.
 
I suppose, but I have yet to see such a jump in performance per die size. Especially over the already very efficient designs in Cell and Xenon.

CPU designs have evolved over the years as engineers learned lessons, as process technology advanced, and as software changed. Why would Cell or Xenon (which wasn't even what MS really wanted) exist in a space where you couldn't do anything significantly better now?

I'm sure there are a few new instructions which could be added to extend the functionality of the existing Cell/Xenon CPUs, and perhaps even OoOE, but surely a complete architecture change wouldn't be needed to add such functionality to the existing designs.

I don't know much about CPU architecture but I think you can make pretty radical changes to it while still maintaining a high degree of machine code level compatibility. x86 processors have changed quite drastically over time, and any new instructions you add need to be executed fast enough to be useful.

As a consumer, which would you prefer? Assuming that die size (and transistor count) is directly related to performance (which this gen has proven remarkably accurate).

Which of those options represents working out the cost (and impact) of software emulation supported by hardware only when absolutely necessary, and making a final decision once I know how much that will cost? I really want BC but not if PS360 end up doing a Wii.

Using hardware to do something that you know you will always be able to do fast enough in software just seems like an own goal. I don't know why anyone would want to replicate a whole CPU if you could use a combination of an emulator, a wrapper, and, if necessary, support for a few legacy instructions on the physical CPU. And I think it would be fair for MSony to consider charging a nominal one-off fee for BC - particularly for physical purchases - to cover any IP licensing costs and fund ongoing maintenance of the emulator while new last-gen content is still being developed.
 
I don't know much about CPU architecture but I think you can make pretty radical changes to it while still maintaining a high degree of machine code level compatibility. x86 processors have changed quite drastically over time, and any new instructions you add need to be executed fast enough to be useful.
x86 maintains compatibility by masses of legacy transistors as I understand it. This was raised on this board a few months back, and I remember finding a link where someone was saying BC in x86 processors was costing...something like 30% of the transistors. My memory fails me, but it is substantial and not just a few percent. If Intel could go with a clean processor design, their chips would be even better.

So BC of the architecture, scaled up as in option 3, is actually costing consumers a lot. It's necessary to run their old software, so we won't complain. If a new console doesn't have to run old software because it runs new software, or because it runs on a software layer way more flexible than the hardware limitations legacy x86 is tied to, then it's an unnecessary cost to bear.
 
x86 maintains compatibility by masses of legacy transistors as I understand it. This was raised on this board a few months back, and I remember finding a link where someone was saying BC in x86 processors was costing...something like 30% of the transistors. My memory fails me, but it is substantial and not just a few percent. If Intel could go with a clean processor design, their chips would be even better.

Yeah, I remember reading the link you posted and thinking that his numbers must be somewhat over-egged (he was from a VM company iirc). Looking at how small Atom and Bobcat are (Bobcat is < 5mm2 @40nm) I don't see how legacy support could possibly be demanding up to a third or so of a mainstream x86 processor's die size or transistor count. A quick glance at some PC die shots gives a rough idea of where the silicon is going, and it seems to be mostly interfaces, caches and the memory controller (and now, of course, the GPU too):

http://www.anandtech.com/show/5174/why-ivy-bridge-is-still-quad-core

So BC of the architecture, scaled up as in option 3, is actually costing consumers a lot. It's necessary to run their old software, so we won't complain.

Looking at something like Bobcat or Atom it doesn't feel like it should be a lot. It looks IMO like most of the die space on fast or server PC processors is going towards stuff that helps the CPU with everything that it does (including new stuff). It would be interesting to know how much you could take out of a PC CPU core if you were only running games written in the last, say, 10 years. Not that it'd help you make a sound guesstimate about console BC, which is a very different situation, of course.

If a new console doesn't have to run old software because it runs new software, or because it runs on a software layer way more flexible than the hardware limitations legacy x86 is tied to, then it's an unnecessary cost to bear.

In a closed box where you can guarantee that the legacy software will run on your fine-tuned software layer then absolutely, it's an unnecessary cost and you should cut it out. It's not like a mains-powered system needs to worry about power efficiency, and as long as you can get close to the performance of the original hardware then the possibility that a hardware solution could do it faster would be utterly irrelevant.
 
...BC in x86 processors was costing...something like 30% of the transistors. My memory fails me, but it is substantial and not just a few percent. If Intel could go with a clean processor design, their chips would be even better.

So BC of the architecture, scaled up as in option 3, is actually costing consumers a lot.

That's x86. And you're not even bringing a real number to the table.

We are talking about Power RISC designs here. Not relatively bloated CISC.

Power1: 2M logic transistors
Power3: 15M transistors

Power4: 174M transistors
Power7: 1.2B transistors

So assuming IBM literally had to stuff an old Power1 CPU in the Power7, it is taking up a whopping 0.17%.

That's worst case scenario as I don't believe they literally need a redundant old CPU sitting on die doing nothing other than waiting for legacy code.
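That 0.17% is just the ratio of the two transistor counts quoted above; a one-liner to check it (the counts themselves are the post's approximate figures):

```python
# Worst-case overhead of carrying a whole Power1 core inside a Power7,
# using the approximate transistor counts quoted above.
POWER1_TRANSISTORS = 2_000_000        # Power1 logic transistors (as quoted)
POWER7_TRANSISTORS = 1_200_000_000    # Power7 total transistors (as quoted)

overhead = POWER1_TRANSISTORS / POWER7_TRANSISTORS
print(f"{overhead:.2%}")  # 0.17%
```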

But let's back this up a bit. If these designs were really so slow, bloated, and inefficient, why would Sony have invested hundreds of Millions of dollars in collaborating with IBM to incorporate these designs into Cell?

Why would MS have come along and arrived at nearly the same conclusion for their CPU design choice? Why even go with IBM at all? Was ARM (insert favorite alternate architecture here) not accepting new customers at the time?

Fact is, these designs were and are efficient. Could they use improvement? Sure. Does this improvement require an architecture change which would break compatibility with existing code-bases? I sincerely doubt it. But I'm all ears.

Show us.

Your options aren't realistic. There's no such thing as zero cost for BC. Scalable hardware means an architecture that differs from what could be done with a clean-slate design, which is always better to some degree, whether 1%, 5%, or 50% less efficiency. So of course people are going to vote for that hypothetical option, but it's a valueless piece of information.

Sure, a different architecture could be chosen. But what, and to what advantage? Sony tried this radical departure last gen with Cell, striving for every last bit of performance they could possibly gain.

Result?

Not much better ... In theory it provides x performance, but in practice it is astonishingly close to Xenon. For the most part because switching to an unknown, foreign, exotic, and at times difficult design proved more trouble than it's worth for most.

Good thing for Sony is they have suitable tools now for developing on Cell. They have an architecture which was designed to scale beyond the current configuration. I suppose they could just throw it all away, but for what?

What is the magic sauce architecture which negates my 0% performance loss claim?

5) Full hardware BC via Full redundant hardware in addition to the rest of the system, with the console company taking a hit on profits for zero loss or extra cost for the consumer

What is this supposed to mean? Full redundant hardware was option 1.

You've also excluded the option of a BC 'adaptor', which is an important consideration if sincerely evaluating what the market desires.

That's not a realistic option. At least not if MS/Sony are serious about creating a platform, not just a game device.
 
Used BC on ps3 exactly once, for GT4. No interest in BC. I can always play old games on the old console, and if I get lucky there will be a remastered version with a reasonable price + better graphics + load times.
 
So assuming IBM literally had to stuff an old Power1 CPU in the Power7, it is taking up a whopping 0.17%.

I think it's more the idea of incorporating something like a Power 6 rather than a Power 1.

But let's back this up a bit. If these designs were really so slow, bloated, and inefficient, why would Sony have invested hundreds of Millions of dollars in collaborating with IBM to incorporate these designs into Cell?

Sony didn't incorporate a Power(x) processor into Cell, they had IBM develop a new, lightweight CPU core that was also adapted into Xenon cores (as there wasn't time to make what MS really wanted, which was a more complex CPU core with OoOE iirc).

Fact is, these designs were and are efficient. Could they use improvement? Sure. Does this improvement require an architecture change which would break compatibility with existing code-bases? I sincerely doubt it. But I'm all ears.

Show us.

I'm not so sure you could run SPU code natively on a PowerPC ISA processor. If MS wanted a mostly off-the-shelf processor core from IBM I could imagine that causing problems. Just looking at the Wikipedia entry for AltiVec:

Wikipedia said:
VMX128

IBM enhanced VMX for use in Xenon (Xbox 360) and called this enhancement VMX128. The enhancements comprise new routines targeted at gaming (accelerating 3D graphics and game physics)[2] and a total of 128 registers. VMX128 is not entirely compatible with VMX/Altivec, as a number of integer operations were removed to make space for the larger register file and additional application-specific operations

I guess that's what happens when you have to quickly adapt an existing design. Maybe BC would require register banks (for example) to be added that you wouldn't otherwise need, maybe supporting cache locking would require additional logic and routing that would otherwise not be needed, etc., leading to knock-on effects elsewhere.

What is this supposed to mean? Full redundant hardware was option 1.

You were talking about cutting 25% performance and adding full HW BC. Shifty was suggesting adding the option of cutting nothing, adding full HW BC and either the HW company eating all the costs or passing them on to the customer (the original Sony option). That's quite a different proposition to any of the poll options.

As long as the extra cost wasn't too high I'd be prepared to cover it, or at least some of it. Would be a less than great option for the console vendor, but for me it'd be fine. If they could do the same thing through a mostly or all software based solution and save us both money I'd prefer that though.
 
I think it's more the idea of incorporating something like a Power 6 rather than a Power 1.

And again, does anyone here honestly believe that Power7 isn't compatible with Power6? And on the same train of thought, does anyone here honestly believe that because IBM used the same Power ISA, they are somehow hindered? If only they could just scrap the whole thing and start over...

Let's see ... how about Itanic? *ahem* I mean Itanium? ;)

Intel had a ton of resources to throw at a brand new architecture. Granted, it would have done better in a console environment (as long as the competition wasn't eating its lunch) because devs would be forced to write to it and use whatever advantages the architecture afforded developers.

Much like what happens everyday with ps3 and xb360 and what will continue to happen nextgen.

Sony didn't incorporate a Power(x) processor into Cell, they had IBM develop a new, lightweight CPU core that was also adapted into Xenon cores (as there wasn't time to make what MS really wanted, which was a more complex CPU core with OoOE iirc).

The PPE is Power based. They took a Power4, and slightly modified it for Sony's desired spec. Still compatible with software written for the Power ISA.

I'm not so sure you could run SPU code natively on a PowerPC ISA processor. If MS wanted a mostly off-the-shelf processor core from IBM I could imagine that causing problems. Just looking at the Wikipedia entry for AltiVec:

Indeed. The SPE's would have to be added to ps4. But with the advancements in code-base for Cell, I'm not seeing that as a hindrance.

I guess that's what happens when you have to quickly adapt an existing design. Maybe BC would require register banks (for example) to be added that you wouldn't otherwise need, maybe supporting cache locking would require additional logic and routing that would otherwise not be needed, etc., leading to knock-on effects elsewhere.

I'm not disagreeing that there may be some things which may not be used as much in ps4 Cell as they were in ps3 Cell. But I seriously doubt that Sony has internal numbers which show that a significant enough portion of Cell went un-utilized this gen for them to say: "This architecture is a waste so we need to dump it and run with X".

Now, they might have found that a number of cases could use OoOE, or a larger cache/store, or that future software would benefit from a few new functions being added, but none of these examples require ridding the console of the Cell/Power architecture.

None of them would fundamentally break BC.

You were talking about cutting 25% performance and adding full HW BC. Shifty was suggesting adding the option of cutting nothing, adding full HW BC and either the HW company eating all the costs or passing them on to the customer (the original Sony option). That's quite a different proposition to any of the poll options.

As long as the extra cost wasn't too high I'd be prepared to cover it, or at least some of it. Would be a less than great option for the console vendor, but for me it'd be fine. If they could do the same thing through a mostly or all software based solution and save us both money I'd prefer that though.

That's just it though. These companies DO have a budget. They demonstrated that budget to be roughly 500mm2 last gen. (438mm2 for MS, 470mm2 for Sony)

I don't expect either one of them to eat an additional cost and neither one of them want to be priced above the other nextgen.

Adding full redundant BC hardware WILL have an impact on what hardware goes in the nextgen boxes if Sony or MS choose to go that route.
 
That's just it though. These companies DO have a budget. They demonstrated that budget to be roughly 500mm2 last gen. (438mm2 for MS, 470mm2 for Sony)

I don't think it should be naturally assumed that 500 mm2 is the target die size for either MS or Sony. I for one never believed that Sony or MS planned to have Cell or Xenos/Xenon manufactured at 90 nm. The transitions from 130 to 90 to 65 nm seemed to be rough across the board, with mostly everyone dealing with either delays or leaky silicon during that time period.

Furthermore, given that 90nm based 360s and PS3s ate a giant hole in both MS's and Sony's pocket, I think they would be a little more conservative this time around. We're going to be 7 or 8 years removed from the launch of last gen, so I don't think MS or Sony needs to make as huge of a sacrifice when it comes to hardware manufacturing cost to see a similar jump in visual quality like last gen.
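As a rough sanity check on that, idealized transistor density scales with the inverse square of the feature size (real shrinks fall well short of this, so treat it as an optimistic upper bound, not a prediction):

```python
# Idealized density scaling between process nodes: transistor density
# goes roughly as the inverse square of feature size. Real-world
# shrinks lose a lot to wiring, I/O and analog blocks, so this is an
# optimistic upper bound, not a prediction.
def density_gain(old_nm: float, new_nm: float) -> float:
    return (old_nm / new_nm) ** 2

# The same die area at 28nm ideally holds roughly 10x the transistors
# of launch-era 90nm silicon.
print(f"90nm -> 28nm: ~{density_gain(90, 28):.1f}x")  # ~10.3x
```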

In terms of BC, I don't care.

Whatever kind we see, if at all, it's going to be cheap in terms of cost to the manufacturer. I doubt the BC that existed in the PS3 ever paid for itself, and I don't remember anyone arguing that 360 sales were hindered due to its partial BC with Xbox1 games.

We seem to have a lot of BC discussions when a new gen comes up, but I rarely hear BC brought up outside of BC discussions as an important factor for a console's success, nor when we're in the early middle of a generation where next-gen transitioning is still going on.
 
And again, does anyone here honestly believe that Power7 isn't compatible with Power6?

Who said anything about Power 6 and 7 compatibility? Your point was about the insignificance of including an entire older "Power" core.

And on the same train of thought, does anyone here honestly believe that because IBM used the same Power ISA, they are somehow hindered? If only they could just scrap the whole thing and start over...

Who was talking about scrapping an entire ISA?

The PPE is Power based. They took a Power4, and slightly modified it for Sony's desired spec. Still compatible with software written for the Power ISA.

Looking at Wikipedia, Power 4 looks very different to the PPE.

Indeed. The SPE's would have to be added to ps4. But with the advancements in code-base for Cell, I'm not seeing that as a hindrance.

If Sony want to have a number of identical CPU cores with conventional cache hierarchies and all using the same ISA then having to stick 6 or 7 SPUs on the CPU (or somehow completely integrate one into each "normal" CPU core) seems like it would be bothersome - and significantly above a 0% cost.

I'm not disagreeing that there may be some things which may not be used as much in ps4 Cell as they were in ps3 Cell. But I seriously doubt that Sony has internal numbers which show that a significant enough portion of Cell went un-utilized this gen for them to say: "This architecture is a waste so we need to dump it and run with X".

Hopefully Sony are doing the rounds asking developers what they want going into the future, in which case they may well move away from Cell.

That's just it though. These companies DO have a budget. They demonstrated that budget to be roughly 500mm2 last gen. (438mm2 for MS, 470mm2 for Sony)

I don't expect either one of them to eat an additional cost and neither one of them want to be priced above the other nextgen.

Adding full redundant BC hardware WILL have an impact on what hardware goes in the nextgen boxes if Sony or MS choose to go that route.

Your poll is supposed to be asking people what they, as customers, want though. And a PS3 style option may be something that some people here want. And while it may not be likely, it's got to be at least as likely as including full hardware BC at no cost and with no drawbacks or impositions!
 
...given that 90nm based 360s and PS3s ate a giant hole in both MS's and Sony's pocket, I think they would be a little more conservative this time around...

Did it?

Or was it poor engineering on MS's part which caused RRoD ($1B)?

Bluray, along with a mandatory HDD, multiple memory traces for XDR and GDDR3, and the further increased board complexity of having a ps2 chipset also sharing the motherboard, I think caused Sony more financial trouble than having a 470mm2 budget for Cell and RSX.

Could they be less aggressive this go round? Perhaps, but the decision to go with such a budget wouldn't break the bank. It's what they decide to do outside this silicon that may infringe on the ultimate budget size of the die.
 
Who said anything about Power 6 and 7 compatibility? Your point was about the insignificance of including an entire older "Power" core.

The point was to show the ridiculousness of the worst-case scenario for Power ISA BC. Shifty originally brought up x86 needing 30% of the core for x86 "compatibility", which is actually the decode/translate portion of the CPU and is completely irrelevant for the RISC-based Power ISA.

That's the whole point of an ISA in the first place: to have a standard platform to write code to which will not be worthless the next time the chip company designs a new version of the CPU.

Who was talking about scrapping an entire ISA?

That's the 4th option. "Screw the ISA, develop whatever is necessary without any regard for legacy code."

That opens the door for ARM or ... ARM? I'm honestly not seeing a ton of other options out there but there seems to be a lot of desire to wipe the slate clean and go with ... something ... but it's never been said what the alternate architecture would/could be that would be worth scrapping Power for.

MIPS?
Alpha?
AMD is desperate these days, but that's because they don't have much to offer that would be suitable from a performance standpoint. Intel is king of performance, and for that reason their products are in high demand, so they don't need to sell their IP to Sony or MS for their new consoles ...

So that leaves ARM ... and MIPS or Alpha ... or Sun's Niagara! :smile:



Looking at Wikipedia, Power 4 looks very different to the PPE.

...

If Sony want to have a number of identical CPU cores with conventional cache hierarchies and all using the same ISA then having to stick 6 or 7 SPUs on the CPU (or somehow completely integrate one into each "normal" CPU core) seems like it would be bothersome - and significantly above a 0% cost.

True.

But that's the choice Sony would have to make. Cell was designed from the outset to be scalable. If Sony chooses not to scale it and instead dumps it for X architecture, there would be pros and cons to that decision.

It would be easier to dev for, but then so would a quad-core PPE with 8 SPEs there for those that know how to utilize them. For those that don't know or care to utilize the SPEs, they could invest in some middleware which puts them to use handling things like audio, physics, the OS, and post-processing effects.

All the while providing BC, and at a relatively small die cost of around 40mm2 @ 28nm.
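That 40mm2 figure looks conservative if idealized shrinks held. Using an often-cited estimate of roughly 15mm2 per SPE on the original 90nm Cell (an assumption, not an official number), a perfect 90nm-to-28nm shrink of eight SPEs comes in well under it:

```python
# Bounding the 8-SPE area claim above. The per-SPE area is an
# often-cited ~15mm^2 @ 90nm estimate (an assumption, not an official
# figure); the shrink factor is idealized, so real area would be larger.
SPE_MM2_90NM = 15.0
N_SPES = 8
IDEAL_SHRINK = (28.0 / 90.0) ** 2   # idealized 90nm -> 28nm area factor

ideal_area = N_SPES * SPE_MM2_90NM * IDEAL_SHRINK
print(f"Idealized 8-SPE area at 28nm: ~{ideal_area:.0f} mm^2")  # ~12 mm^2
```

Real shrinks lose density to wiring and I/O, so landing somewhere between this ideal and the 40mm2 quoted above seems plausible.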

Or they could upgrade the SPEs to address their shortcomings and make them more useful for more tasks, with improved middleware which takes advantage of them.

Hopefully Sony are doing the rounds asking developers what they want going into the future, in which case they may well move away from Cell.

They may, or they may upgrade Cell to address developers' issues with the chip.

Your poll is supposed to be asking people what they, as customers, want though. And a PS3 style option may be something that some people here want. And while it may not be likely, it's got to be at least as likely as including full hardware BC at no cost and with no drawbacks or impositions!

Want within reason. Perhaps I was a bit off in my assumption, but I'm pretty sure 500mm2 is about as much budget as we can expect next gen.

If we take that as a hard budget to work with, then there is only so much you can do with that space. Taking 120mm2 of it strictly for redundant BC is an option, just as it was for ps3, but that just seems wasteful IMO.

Especially when the CPU designs are scalable to a certain extent.

Further, the CPU will be playing a more limited role in the life of a future console anyway with GPGPU becoming more viable with products like GCN and Tesla. So I'm not seeing the point in dumping the existing architectures only to gain a few percentage points of "Performance" for the CPU when the CPU will not be the main driver of performance in nextgen consoles.

It'd be like dumping HDMI out for Thunderbolt in the nextgen consoles because Thunderbolt has 0.1ms lower latency and higher potential bandwidth* ... Most use cases won't be able to tell the difference in performance, but they sure will notice when it doesn't work with their previously purchased setup.

*fictitious example to prove the point

Chasing after a foreign architecture which breaks BC, only to gain minimal CPU performance when the CPU will already account for only 20-25% of the die, is just mind-bogglingly short-sighted IMO.
 
That's x86. And you're not even bringing a real number to the table.

We are talking about Power RISC designs here. Not relatively bloated CISC.
As ever, you completely miss the point. Sure, XB720 could use a tripled-up Xenon, which would scale pretty efficiently, but that also assumes that such a CPU would be the best performance and value. What if MS can get an x86 or ARM or Cell CPU that would perform better and cost less? Then the choice of going with Xenon^3 costs the user relative to the option with an x86 processor. What if Sony go with Cell^2, loads more SPUs, but devs are sick of trying to wring performance from it and middleware like UE doesn't work well on it? Then PS4 looks like poop next to XB3. Whereas if Sony pick a more developer-friendly CPU, the console gives the users a better experience.

And that ignores the issues of GPU emulation, even within the same family of processors, which you've never recognised because you seem to think that the hardware is accessed through a software layer and so graphics code is portable across GPUs. Emulating a GPU adds cost somewhere along the line.

Of course we've had this debate before, and of course you never appreciated it then, so there's no point repeating this.
 