Technical Comparison: Sony PS4 and Microsoft Xbox One

Hrm. But were they placed more or less in the same place as the console GPUs are now, relative to their PC brethren?

The GPU in the PS4 is "mid range" (right?), excluding banana-tier enthusiast cards like the 7990 or Titan. Were RSX and Xenos in a similar position, or were they closer to the "GTX 680s of their day"?

Yeah, you could say they were mid-range compared with the top-of-the-line cards of the time. Of course, high-end cards today target >1080p resolutions or ridiculous levels of MSAA (pointless when so many deferred engines can't use MSAA), so whether it's worth spending the BOM on a better GPU is debatable. Especially as heat and cooling matter more under your TV than they do in a cavernous mid-tower case; hell, any top-tier GPU likely uses more power under load than these entire consoles will.
 
I don't think it's a function of caches on the CPU, much like it wouldn't be a function of CPU caches to fill the 8 GB of data placed in RAM. The GPU can effectively fetch from the 8 GB and the 32 MB simultaneously. The 8 GB would use the 68 GB/s pathway, while the eSRAM would get populated by some combination of the direct path through the NB (132 GB/s) and/or main RAM.

The NB is not 132 GB/s; the NB is 30 GB/s.

The direct link from the GPU to the eSRAM is the only bus that is 102 GB/s.

And the eSRAM has a max read/write rate of 102 GB/s, therefore you cannot write to it at 132 GB/s.

The only way for the GPU to read at higher than 102 GB/s is to read from either the 5 MB NB cache at 30 GB/s and the eSRAM at 102 GB/s, or from the DDR3 at 68 GB/s and the eSRAM at 102 GB/s.
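To make that arithmetic concrete, here is a minimal sketch using the peak figures quoted above (102.4, 68.3 and 30 GB/s); treat them as theoretical maxima, not sustained numbers:

Code:
# Peak-bandwidth arithmetic for the buses discussed above (GB/s).
# These are the theoretical maxima quoted in the thread, not measured figures.
ESRAM_BUS = 102.4    # direct GPU <-> eSRAM link
DDR3_BUS = 68.3      # GPU <-> main DDR3 RAM
NB_COHERENT = 30.0   # coherent link through the NB (backed by a 5 MB cache)

# The only two ways the GPU could read faster than 102 GB/s overall:
print(ESRAM_BUS + NB_COHERENT)  # ~132 GB/s (eSRAM + NB cache)
print(ESRAM_BUS + DDR3_BUS)     # ~170 GB/s (eSRAM + DDR3)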
 

The GPU memory controller's system view is 68 GB/s to main RAM, 102 GB/s to eSRAM, and 30 GB/s to everything else, simultaneously.

Whereas the PS4 is 176 GB/s everywhere except to the CPU.

I understand where you are coming from.
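Put another way, a minimal sketch of what a single client sees under that view, depending on where its data lives (all numbers are the theoretical peaks quoted in this thread):

Code:
# Peak rate a single client could see, depending on where its data lives (GB/s).
# Figures are the theoretical peaks quoted in this thread.
durango = {
    "main DDR3": 68.3,
    "eSRAM (32 MB)": 102.4,
    "NB coherent (5 MB cache)": 30.0,
}
ps4 = {"unified GDDR5": 176.0}

# On Durango the attainable rate depends on which pool the data sits in;
# on the PS4 everything except the CPU's coherent path draws from one pool.
for pool, bw in {**durango, **ps4}.items():
    print(f"{pool}: up to ~{bw:g} GB/s")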
 

Yes, but those memory buses are not equal:

the 30 GB/s coherent link only has access to 5 MB, for god's sake.
 
I'm not assuming *anything*. I'm simply not dismissing what Nick Baker said during the panel and I have a host of supportive reasons that point towards the possibility of clock increases. Do you not understand what constitutes the difference between an assumption and a speculative conclusion?

And btw, I don't disagree about that sort of thing bringing lower yields and hotter-running machines with it, but additional cooling is a stretch. Not sure how much engineering experience you have, but I have plenty, and I can promise you that any time engineers are working on a competitive project like this, they leave wiggle room in the areas where they feel they can adjust later on if need be. There is already built-in wiggle room for these machines thermodynamically; they already have some give in that area. So I don't agree that just because clocks go up in my described scenario, we automatically have to totally retool the cooling system or even just add more fans. That's far from a given.

The difference in our understanding is that you are accusing the majority of assuming too much, while at the same time clinging to your loose conjecture as if it were the most plausible explanation of how Nick Baker came to his 200 GB/s bandwidth figure.

I did watch the architecture show and I have read your arguments, several times now.

(1066.67/800)*102.4 = 136.5 GB/s
136.5 + 68.3 = 204.8 GB/s total

A 33% clock increase would be beyond their thermal "wiggle room"...I assume.
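For reference, a quick sketch of that scaling argument, assuming the eSRAM bandwidth scales linearly with GPU clock and the DDR3 bus stays at 68.3 GB/s:

Code:
# Sketch of the clock-scaling argument above. Assumes eSRAM bandwidth scales
# linearly with GPU clock and the DDR3 figure is unchanged.
base_clock, bumped_clock = 800.0, 1066.67   # MHz; 1066.67 is the DDR3-2133 I/O clock
esram_at_800 = 102.4                        # GB/s
ddr3 = 68.3                                 # GB/s

esram_bumped = esram_at_800 * bumped_clock / base_clock
print(esram_bumped)                    # ~136.5 GB/s
print(esram_bumped + ddr3)             # ~204.8 GB/s "aggregate"
print(bumped_clock / base_clock - 1)   # ~0.33, i.e. the 33% increase mentioned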
 
How does the 8 GB of RAM get filled? It goes through the Northbridge too.

The DDR3 gets filled by the HDD or the program, but if you want to read the DDR3 you have to use the DDR3 bus; no getting around that, coherent bus or not. There's no point using the coherent bus to read the DDR3; you may as well just use the DDR3 bus that's attached to the GPU, which is twice as fast as the coherent bus.
 

Bingo.

That is also how the eSRAM can be filled.

The DDR bus has 68 GB/s to the NB to receive that data.

The eSRAM has to use its 102 GB/s bus and then the 30 GB/s bus to receive that data. It's really no different, depending upon how the developer wants to utilize the eSRAM.

The eSRAM can theoretically be copied to from DDR, the CPU, or the HDD.
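For a rough feel of what those fill paths mean in practice, a minimal sketch of how long it would take to populate the full 32 MB of eSRAM over each of the buses mentioned, at theoretical peak and ignoring contention and latency:

Code:
# Rough time to move the full 32 MB of eSRAM over each path, at theoretical
# peak and ignoring contention/latency. 1 GB/s is roughly 1 MB per ms.
ESRAM_SIZE_MB = 32
for name, gb_per_s in [("DDR3 bus", 68.3), ("eSRAM bus", 102.4), ("NB coherent", 30.0)]:
    print(f"{name}: ~{ESRAM_SIZE_MB / gb_per_s:.2f} ms to move 32 MB")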
 

Except that the HDD could only realistically supply about ~20 MB/frame.

Sure, it's doable, but it is nowhere near equal; you cannot equate a bus that can at most copy ~25 MB/frame to one that can copy GBs per frame.
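Whatever the exact HDD figure is (estimates in this thread range from ~2 MB to ~25 MB per frame), the gap to a RAM bus is several orders of magnitude. A minimal sketch, assuming a 30 fps frame budget and a 100 MB/s sustained HDD:

Code:
# Per-frame data budgets at 30 fps (about 33.3 ms per frame), peak figures only.
frame_s = 1 / 30

hdd_mb_per_s = 100.0            # assumed sustained sequential HDD rate
ddr3_mb_per_s = 68.3 * 1000     # 68.3 GB/s DDR3 bus
esram_mb_per_s = 102.4 * 1000   # 102.4 GB/s eSRAM bus

print(hdd_mb_per_s * frame_s)    # ~3.3 MB/frame from the HDD (less with seeks)
print(ddr3_mb_per_s * frame_s)   # ~2277 MB/frame over the DDR3 bus
print(esram_mb_per_s * frame_s)  # ~3413 MB/frame over the eSRAM bus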
 
I don't like those aggregate bandwidth figures. Not that they are completely useless, but in this context, IMO, they are more about agenda/trolling than technical consideration, i.e. whatever aggregate bandwidth one comes up with has to match the PS4's bandwidth to its main RAM.

If you consider aggregate bandwidth figures, I would just stick to the on-chip bandwidth, and it would still be a bit of an apples-to-oranges comparison.
The CPUs have more bandwidth to their caches than to main RAM, the CUs have bandwidth to the LDS, GDS, L1 and L2, the ROPs to the color and Z caches, etc.; Durango has, on top of that, access to a scratchpad (vs Liverpool).
The PS4 has more CUs, more ROPs and more "aggregate" bandwidth to its various memory pools.
How that stacks up against the scratchpad in Durango is an apples-to-oranges, not-straightforward comparison, but some data points are not completely irrelevant either:
I would look at the amount of GPU L2 carried by both chips, the number of ROPs and the caches associated with them, etc.

Overall this is not an attempt at downplaying the benefit of the Durango scratchpad, or the other way around. It is just an attempt to show that, even for a non-tech-head like me, speaking of aggregate bandwidth figures (especially with the pretty obvious goal of making the point that system 1 has as much "aggregate" bandwidth as system 2, when systems 1 and 2 differ in other aspects too) is going to get people nowhere, outside of feeding fan wars.
 
It seems GCN 2.0 comes with 8 ACEs, like the PS4 GPU:

http://forum.beyond3d.com/showpost.php?p=1741706&postcount=1302

Only two things remain to be considered custom-made by AMD for Sony:

- The volatile bit "flag" in L2. As already said, this is possibly already standard in vanilla GCN.
- The Onion+ bus, which remains to be seen whether it is also standard in Temash, Kabini or Kaveri.

So, nothing of the "enhanced PC GPU" (GPU tech inside that belongs only to Sony), really?
 

I haven't seen the volatile bit flag stuff posted anywhere; could you give a link to it? It sounds like something from GCN, but we cannot be sure.
 
The HDD can realistically supply 2 MB per frame.

Cheers

What about using standard metrics, you know, like... MiB/s?
My HDD sustains 100 MiB/s reliably, so that's 1.67 MiB in a 60th of a second, or 3.33 MiB in a 30th of a second...

I lament the fact they aren't using SSDs, although HDDs are already way better than any optical drive.
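For what it's worth, a one-liner check of those per-frame figures, assuming the 100 MiB/s sustained rate quoted above:

Code:
# MiB available per frame from a drive sustaining 100 MiB/s.
for fps in (60, 30):
    print(f"{fps} fps: {100 / fps:.2f} MiB per frame")
# 60 fps: 1.67 MiB per frame
# 30 fps: 3.33 MiB per frame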
 
I haven't seen the volatile bit flag stuff posted anywhere; could you give a link to it? It sounds like something from GCN, but we cannot be sure.

http://forum.beyond3d.com/showpost.php?p=1729818&postcount=1467

What L2 cache tweak? The volatile tag has existed in all GCN-based GPUs sold over the last 16 months. It's a GCN feature that finally gets used.

And the GCN hardware is capable of handling large amounts of different shaders/kernels with or without direct dependencies within the rendering pipeline (the latter would be the asynchronous compute stuff). Wavefronts can be assigned different priorities (for example, all wavefronts of a certain shader/kernel could be assigned a higher priority, or some asynchronous background tasks can be assigned a lower priority). What Sony will probably add is a possibility for the devs to exert some influence on the work distribution and prioritization. Currently, there is no such possibility on PC GPUs (one can only change the priority in shader code once it has been scheduled to a CU [and only when writing the shader in the native ISA; there is no possibility to do it through some higher-level API]; one can't set the base priority assigned to a wavefront upon creation); it is handled by game profiles in the driver. But that doesn't necessitate hardware changes; it's an API and firmware issue.
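To illustrate the kind of developer-facing control described in that quote, here is a purely hypothetical sketch; none of these names correspond to any real API, and as the quote says, today this is handled by game profiles in the driver rather than exposed to developers:

Code:
# Hypothetical illustration only -- no such public API exists today.
# The idea from the quote: let developers set a base priority when work is
# submitted, instead of relying on driver game profiles.
from enum import IntEnum

class WavePriority(IntEnum):   # hypothetical priority levels
    LOW = 0      # e.g. asynchronous background compute
    NORMAL = 1   # regular graphics/compute work
    HIGH = 2     # latency-sensitive kernels

def submit_kernel(queue, kernel_name, priority=WavePriority.NORMAL):
    """Hypothetical submit call: 'priority' would become the base priority
    given to the kernel's wavefronts when they are created."""
    queue.append((priority, kernel_name))

# Usage sketch: background compute tagged LOW so it yields to rendering work.
work_queue = []
submit_kernel(work_queue, "async_ambient_occlusion", WavePriority.LOW)
submit_kernel(work_queue, "gbuffer_shading", WavePriority.HIGH)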
 
What about using standard metrics, you know, like... MiB/s?

MB *is* a standard metric.

My HDD sustains 100 MiB/s reliably, so that's 1.67 MiB in a 60th of a second, or 3.33 MiB in a 30th of a second...

100 MB/s is the sequential transfer rate. What is it if you throw some seeks in?

Also, is your drive a single-platter 5400 RPM drive? That is likely what is inside the XB1/PS4.

Cheers
 
it's "interesting" to me sebbi calls out 200 gb/s xbone bw sourced from ms tech panel when presumably he could have just as easily sourced 170 gb/s from vgleaks...
Interesting?

I am just quoting the official Sony/Microsoft information (press release / architecture panel / interviews) instead of some random internet rumor/leak sites.

Link to the Microsoft tech panel video: http://www.youtube.com/watch?v=tLBVHZokt1Q.
For the technical details, I recommend watching Nick Baker 25:00 -> 26:30.

Quote (Baker):
"...get high capacity and high bandwidth, so with our memory architecture we are actually achieving all of that, we are actually getting more than 200 GB/s per second across the memory subsystem".
 

Well, for some people here the MS technical panel is as reliable as vgleaks, or even less. Not sure you can even be confident in Cerny's technical claims (see my above posts) or Mattrick's (always online because of cloud computing for infinite power, my ass).
 