Full BC on PS3 (proven via Jailbreak) *spawn

That flop count is based on the GPUs... If you're gonna compare CPUs only, you should use their GFLOP count.

Even then, that's only part of the equation, but it's good enough. Trying to get one CPU to emulate another of the same design with only a small clock-speed difference is the definition of madness in video game programming. It's not gonna work at all.
 
I wrote both :) Games require both to function, and different games have different bottlenecks. But yeah, that was more or less the point I was trying to get across. Edit: nvm, you are correct; I pulled some wrong numbers from Wikipedia.

Goodness, it's hard to find. Anyway, I'm sure you guys can correct me on the numbers, but I'm willing to bet the ratios are good enough for decent software emulation.
 
If the hardware is different (e.g. slap a Polaris 10 based APU, as in PS Neo, into an Xbox One II and run the same API), you need to emulate the ESRAM and the custom hardware at a minimum.
 
@iroboto also:
PS2: 8 MB * 8 (2 multitaps with 4 memory card slots each) = 64 MB
PS3: (up to) 2,000,000 MB (a 2 TB drive)
Storage: 1:31,250

Your sarcasm falls flat.
I'm pointing out obvious flaws in your argument. The fact is software emulation runs on PS3 because it's insanely more powerful than PS2. You're not going to get that difference of power with this next iteration.
 
There is no such 'fact' for the bandwidth delta between the eDRAM in the PS2 and the GDDR3 in the PS3.

Xbox 2 will use the same instruction sets, a compatible DirectX version, the same x86 architecture, and so on.
There is no comparison with PS2 vs PS3; those are literally continents apart, while Xbox One and Xbox 2 are located on the same street.

If Xbox 2 used POWER9 architecture with a next-generation Imagination GPU cluster setup, etc., then it would probably need an order of magnitude more processing power to run Xbox One software. The way it looks now, only the ESRAM looks to be a hindrance for backwards compatibility (notice I didn't write "emulation").
 
Whoever brought CPU and GPU emulation into this only gave a sloppy glance at the first posts and couldn't wait to come spill his wisdom to us dumbasses. The whole thread is about ESRAM. Read it before posting, seriously.
 
The original argument is whether MS needs ESRAM in their next console for backwards compatibility with native Xbox One games that have been compiled for XBO. He states that PS2 had eDRAM and PS3 did software emulation, therefore no issues on MS's side because Sony was able to do it. Heck, Sony was able to fully emulate more than just the eDRAM.

When one talks about emulation, which is mainly CPU driven, how could we remove it from the discussion? Emulation is possible only because of the power difference. Sure, we could have PS4 emulate XBO if you want, but that's like digging a subway tunnel with a spoon. And that's exactly what is being discussed here.

People are complaining about Quantum Break's performance on Windows 10, and that's recompiled code for PC. But the task at hand is more monumental than that. What makes you think any PC could emulate XBO today? And if no PC can emulate XBO, under what logic would this next console be able to do it in software?
 
They are talking about an Xbox 1.5 running the bone's games natively, where the only part that would have to be emulated would be the ESRAM, on GDDR5 or some such.
 
Embedded higher-bandwidth RAM should not be as big of a hindrance as some of you think.

PS2 was merely to show that it has been done before. And it will happen again.

People are mistakenly remembering Albert Penello's PR attempt to brainwash people into believing the ESRAM was special and had performance benefits over GDDR5.
He even had hypothetical examples which, to this day, have never manifested.
 
ESRAM is in Xbox One because Microsoft's engineers did not know 8 GB of GDDR5 would be possible for launch in massive volumes, not because it has lower latency or some other PR speak.
 
PS2 emulation on PS3 could not be done either. And yet here we are.
You're using one situation to try and prove another, entirely different situation. Reality doesn't work that way.

PS2's framebuffer was exceedingly dumb and limited. Xbone's on-chip SRAM is much more flexible (able to store CPU program code IIRC). Emulating it is not going to be as "straightforward" (which it might not actually have been) as with PS2's GS eDRAM.
 
I guess we will just have to see if Xbox 1.5 or 2 will be fully backwards compatible, without relying on a separate, embedded RAM pool :)
 
1.5 won't be 'BC' because it'll be an iteration of the same console. We've discussed at length whether a substitute would require ESRAM and plenty of us think it might be workable without. So even if XB1.5 or 2 launches without ESRAM and still plays XB1 games, your argument (PS3 could do it with PS2 so XB1.5 can do it with XB1) won't be validated. Ultimately, the technical discussion is where the truth lies - not extrapolations of vaguely similar situations.
 
Okay then. Either way, this topic can be closed, as we will never know how they actually achieved PS2 emulation.
 
I wasn't sure if you and Milk were trolling me earlier so I just stopped responding, but I brought in both CPU and GPU numbers for a very real reason.

Let's go with your assumptions, but I'll preface why you cannot vaguely take one situation and apply it to another. First off, emulation is not impossible; the hardware can be emulated, but all emulation comes with extremely high overhead.
There's quite a difference between running virtualization and emulation, and in this case with XBO you are going to run emulation if there is hardware missing.

Luckily, as you pointed out, everything else is the same, so the emulation wrapper can be fairly lightweight, we assume. But the wrapper will still have to exist, as it will be looking for API calls that shuffle data into and out of ESRAM and for code directly addressing memory inside it.

Once the wrapper spots these calls, it's going to have to tell the program to look for those specific memory addresses elsewhere and work from there.

So, part 1: increased CPU requirement. The system is now looking for additional API calls and searching to line up addresses, mapping them into a similar space. In this case we expect there to be a lot of ESRAM calls, as 90% of XBO's bandwidth is there. It's also only 32 MB large, so we expect a lot of action since space is limited. There is also the small hurdle that the CPU doing this checking and moving of memory is going to be significantly slower than the DMA engines moving memory from the larger pool to ESRAM and back. This time the CPU will move memory from one address and write it to another (a read and then a write, and we know that to be a heavy penalty on DDR bandwidth).
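To make that concrete, here's a minimal sketch of the kind of address remapping such a wrapper might perform. Everything here is hypothetical (made-up addresses, no real XDK interfaces), just to show where the extra work lands on the CPU:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical ESRAM window as the XBO title would see it. */
#define ESRAM_BASE  0x80000000u          /* made-up guest address */
#define ESRAM_SIZE  (32u * 1024 * 1024)  /* 32 MB */

static uint8_t *unified_pool;     /* single big pool on the new console */
static size_t   esram_shadow_off; /* where "ESRAM" contents live now    */

/* Translate a guest address: ESRAM addresses get redirected into a
 * shadow region of the unified pool; everything else maps 1:1.      */
static uint8_t *translate(uint32_t guest_addr)
{
    if (guest_addr - ESRAM_BASE < ESRAM_SIZE)
        return unified_pool + esram_shadow_off + (guest_addr - ESRAM_BASE);
    return unified_pool + guest_addr;  /* normal DDR3-visible memory */
}

/* Intercepted "copy into ESRAM" call: on real hardware a DMA/move
 * engine does this; here the CPU pays a read and a write per byte. */
static void emu_copy_to_esram(uint32_t dst, uint32_t src, size_t n)
{
    memcpy(translate(dst), translate(src), n);
}
```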

This is a large overhead but not as big as translating a completely different CPU.

Let's talk GPU. The GPU is now going to be stalled waiting for memory, in large part because the program normally relies on a much speedier response from ESRAM. This time we have the CPU moving memory over the main bus and not over the DMA bus; we don't get the advantage of swizzling while the textures are being moved, and so forth. But let's break down what happens: the game's shaders were likely written with the assumption of simultaneous read/write, or at least quick switching between the two without penalty. There's nothing the emulator can do about changing that, so now we're going to have to suffer some delay getting that data before we can proceed.

Okay, no problem. So I've framed it, and I'm sure you understand everything I've written. Now your response is, well, PS2/PS3.

Okay, so this is where ratios of power do matter. You only know the end result, which is that emulation works. You do NOT provide a breakdown of frame-time data. For all any of us know, PS3 could be struggling with eDRAM emulation but making it up elsewhere once it comes to PROCESSING. The GPU is 30x faster than PS2's.

Without real-world knowledge of which parts are slow and which parts are fast, you have a vague argument. Emulation is a non-issue, but performance is.

So that's why I brought up both CPU and GPU requirements. This next Xbox will be iterative; it's not going to have a massive enough performance delta to make up for the latency of waiting on data.
 

This is a key point, I think. I'd also add that it's likely RSX had certain hardware features that "automatically" worked to minimise the BW used for the same operations, things like z-compression and various caches that helped overcome the overall deficit in the headline BW.

I can't think of any normal feature of current GPUs that would naturally help overcome the issues you talked about with intercepting and emulating the operation of the ESRAM.

Having a fast CPU might be a very big help though ...
 
Once the wrapper spots these calls, it's going to have to tell the program to look for those specific memory addresses elsewhere and work from there.

So, part 1: increased CPU requirement. The system is now looking for additional API calls and searching to line up addresses, mapping them into a similar space. In this case we expect there to be a lot of ESRAM calls, as 90% of XBO's bandwidth is there. It's also only 32 MB large, so we expect a lot of action since space is limited. There is also the small hurdle that the CPU doing this checking and moving of memory is going to be significantly slower than the DMA engines moving memory from the larger pool to ESRAM and back. This time the CPU will move memory from one address and write it to another (a read and then a write, and we know that to be a heavy penalty on DDR bandwidth).

This would only be necessary for memory moves that require a swizzle or other modification. Since you're already virtualizing the two memory pools within a single unified memory pool, a straight "move" operation could be faked by simply changing a pointer. No bandwidth needed. So the performance impact would depend on how often these move operations require the data to be modified by the move engines in flight.
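A minimal sketch of that pointer-redirection idea, under the same assumptions as the earlier snippet (hypothetical names, page-granular remapping):

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE   4096u
#define ESRAM_PAGES ((32u * 1024 * 1024) / PAGE_SIZE)

/* Remap table: one entry per emulated ESRAM page, pointing at
 * wherever that page's contents currently live in the big pool. */
static uint8_t *esram_page[ESRAM_PAGES];

/* A plain, unmodified "move into ESRAM" costs no bandwidth:
 * just point the ESRAM page at the source data.                 */
static void emu_move_plain(size_t page_idx, uint8_t *src)
{
    esram_page[page_idx] = src;
}

/* A move the real move engines would transform in flight
 * (e.g. a swizzle) still has to touch every byte.               */
static void emu_move_swizzled(size_t page_idx, const uint8_t *src,
                              uint8_t *shadow /* backing storage */)
{
    for (size_t i = 0; i < PAGE_SIZE; i++)
        shadow[i] = src[i ^ 3u];  /* stand-in for a real swizzle pattern */
    esram_page[page_idx] = shadow;
}
```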

Let's talk GPU. The GPU is now going to be stalled waiting for memory, in large part because the program normally relies on a much speedier response from ESRAM. This time we have the CPU moving memory over the main bus and not over the DMA bus; we don't get the advantage of swizzling while the textures are being moved, and so forth. But let's break down what happens: the game's shaders were likely written with the assumption of simultaneous read/write, or at least quick switching between the two without penalty. There's nothing the emulator can do about changing that, so now we're going to have to suffer some delay getting that data before we can proceed.

The maximum theoretical bandwidth of the XBOne's memory setup, with the DDR3 at full blast and the ESRAM reading and writing at full efficiency, is still only marginally higher than the leaked bandwidth number for the Neo's GDDR5. If even a relatively small percentage of the moves that the XBOne needs to do in and out of ESRAM are straight moves, and are therefore able to be emulated without consuming bandwidth, its bandwidth needs should be comfortably satisfied by a single pool of GDDR5 at that spec.
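Putting rough numbers on that, using the commonly cited figures (treat these as assumptions rather than confirmed specs):

```
DDR3-2133, 256-bit:                     ~68 GB/s
ESRAM theoretical peak (simultaneous
read/write):                            ~204 GB/s
ESRAM realistic, per developer reports: ~140-150 GB/s

XBO combined, theoretical peak:         ~272 GB/s
XBO combined, realistic:                ~210-220 GB/s
Leaked Neo GDDR5:                       ~218 GB/s
```

So on realistic numbers, a single GDDR5 pool at the leaked spec is roughly a wash with the XBO's two pools combined, before you even discount the moves that could be faked by pointer redirection.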
 
Great response. For some reason it didn't occur to me that the pointers could just be redirected and you'd hold a translation table.

A first thought, however: the setup will still suffer some loss regardless. Data is likely to be laid out serially for prefetch; with this translation you're likely going to be jumping all over memory. There will still be heavy penalties, I think.
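A toy illustration of that worry (purely illustrative, nothing XBO-specific): reaching the same bytes through a remap table turns one sequential stream into dependent, scattered loads, which hardware prefetchers handle much worse:

```c
#include <stddef.h>
#include <stdint.h>

/* Sequential walk: the prefetcher sees one linear stream. */
uint64_t sum_direct(const uint32_t *data, size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += data[i];
    return s;
}

/* Same data reached through a page-remap table: every page first
 * loads a pointer, then dereferences it, and consecutive pages
 * may live anywhere in the pool, so locality is lost.            */
uint64_t sum_remapped(uint32_t *const *page, size_t pages, size_t per_page)
{
    uint64_t s = 0;
    for (size_t p = 0; p < pages; p++)
        for (size_t i = 0; i < per_page; i++)
            s += page[p][i];
    return s;
}
```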
 