The ESRAM will probably store (parts of) the G-buffer, the rendered frame's buffer, and various temporary buffers for particles, volumetrics, post-processing and so on - data that needs to be accessed all the time. The main RAM could store the textures, the shadow maps, and any additional buffers the renderer may need that don't fit into the ESRAM, or those that could benefit from parallel access across both pools (thereby increasing the total available bandwidth).
So in short I expect the ESRAM to store data that's generated by the GPU and sometimes the CPU, not data that has to be read in from elsewhere (except when a buffer has to be moved between the memory pools - but those transfers should be kept to a minimum). If the texture caches are good enough, then the X1's main RAM should be able to provide enough bandwidth to keep the TUs (texture units) fed and avoid stalls, so the ESRAM won't be needed for that.
What also has to be understood here is that today's renderers are far more complex - it's not just about rendering a frame into a single buffer with a single pass any more. The rendering process is composed of many different stages, some of which rely on data from previous stages while others can run in parallel. Pixel formats, texture access patterns and other attributes change wildly and frequently from stage to stage, and so do the required memory space and traffic.
So the PS4 is a simple case because no matter what the renderer wants to do, it has access to a single unified memory system and address space. This is also why GG could afford to be so careless with the Shadow Fall demo's various buffers - it didn't matter that much*.
But X1 renderers should be set up to make proper choices about where they put different kinds of data, based on each individual task's access patterns. This would of course require more effort from the developer, but could also mean a more efficient utilization of the memory subsystem.
However, the point is that you can't just do a simple comparison the way you guys are trying to do right now, assuming that the same access pattern is used throughout the entire frame and all types of data are equal and have to be loaded from somewhere. The X1 system is complex enough on its own, but on top of it all the various rendering engines will be quite different from each other too - so it is just not possible to come to a single conclusion.
* you guys should maybe read the Shadow Fall presentation again to see just how much memory is spent on the various buffers, I think their - unoptimized - requirement was around 300MB or so. But not all of them have to be kept around throughout all the rendering...