Why does Frostbite engine perform relatively better on PS4 than XB1? *spawn

The above chart is incorrect! The Xbox One eSRAM's maximum theoretical speed is 204 GB/s. But that number is largely irrelevant, since the real-life figures stated by Microsoft point to something between 140-150 GB/s. Not much different from the PS4!

AFAIK, it's 102 GB/s full duplex, meaning you can write data to the eSRAM at 102 GB/s while reading data at 102 GB/s at the same time, but you can never read data at over 102 GB/s. This means latencies will be much better, but the raw bandwidth isn't as large as the PS4's GDDR5 implementation.

Furthermore, trying to correct an Anandtech article with an arsetechnica one won't go well 99 times out of 100.
 
It's not quite full duplex. Besides the fact that getting dual-issue is apparently dependent on some rather onerous banking considerations, it was indicated that the interface will not dual-issue a write alongside a read every 8th cycle.
 
They said "not quite full duplex" because there is a write bubble: you can do both operations simultaneously almost all the time, but not quite, which, coupled with the upclock, has led to many different figures being bandied about. It's 109 GB/s (853 MHz clock × 128 bytes per cycle) for a full read or a full write; I believe 204 GB/s is the final combined value according to the Eurogamer interview, as they said every 8th write is impacted.
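To spell out that arithmetic (a sketch, modelling the write bubble as writes landing on 7 of every 8 cycles):

```python
# eSRAM bandwidth back-of-the-envelope, per the figures quoted above.
clock_hz = 853e6        # upclocked eSRAM/GPU clock (was 800 MHz)
bytes_per_cycle = 128   # 1024-bit interface, per direction

one_way = clock_hz * bytes_per_cycle / 1e9
print(f"full read or full write: {one_way:.1f} GB/s")   # ~109.2 GB/s

# Dual-issue with a write bubble every 8th cycle:
combined = one_way * (1 + 7 / 8)
print(f"peak read + write:       {combined:.1f} GB/s")  # ~204.7 GB/s
```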

Edit: Too slow on the typing it seems.
 
AFAIK, it's 102 GB/s full duplex, meaning you can write data to the eSRAM at 102 GB/s while reading data at 102 GB/s at the same time, but you can never read data at over 102 GB/s. This means latencies will be much better, but the raw bandwidth isn't as large as the PS4's GDDR5 implementation.

Furthermore, trying to correct an Anandtech article with an arsetechnica one won't go well 99 times out of 100.

Even 102 GB/s is the theoretical peak... but what about real-life scenarios?
 
Even 102 GB/s is the theoretical peak... but what about real-life scenarios?

Real-life scenarios generally have reasons why they do not always reach theoretical peak bandwidth, even in the pure read case.
We do not have enough information about the internal workings of the implementation to know how far short it can fall; however, at least in theory the eSRAM has far less reason to fall below peak (and should not fall as far) than GDDR5 would, since SRAM needs no refresh cycles and pays no read/write bus-turnaround penalty.
 
Even 102 GB/s is the theoretical peak... but what about real-life scenarios?

Real life as in placing values into memory and just reading the same things over and over to verify on hardware, or real life as in the figure a game has recorded over a frame and extrapolated to a full second?
 
The ~145 GB/s figure was from a presentation with figures seen in a real game, although we don't know what proportion of frame time those access rates covered. A synthetic test would probably produce higher figures, but wouldn't be as interesting.

That leaves the ~50 GB/s main memory BW untouched btw, for compute, CPU, other buffers, whatever.

The PS4 memory setup seems better in the vast majority of cases, but that doesn't necessarily mean the X1 has no areas of relative strength.
 
It's not quite full duplex. Besides the fact that getting dual-issue is apparently dependent on some rather onerous banking considerations, it was indicated that the interface will not dual-issue a write alongside a read every 8th cycle.

Seven-eighths duplex, then :)
 
Performance is ROP bound, and they can't go larger than 720p anyway due to their G-buffer format exceeding the eSRAM size.

They use a 16-byte-per-pixel G-buffer (http://www.frostbite.com/wp-content/uploads/2014/11/course_notes_moving_frostbite_to_pbr_v2.pdf), so it fits fine even at 1080p, albeit with very little room for anything else.
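A quick footprint check backs that up (assuming all 32 MiB of eSRAM are usable for render targets):

```python
# 16 bytes/pixel at 1080p vs. the 32 MiB of eSRAM.
gbuffer = 1920 * 1080 * 16   # 33,177,600 bytes
esram = 32 * 1024 * 1024     # 33,554,432 bytes
print(f"G-buffer:  {gbuffer / 2**20:.2f} MiB")            # ~31.64 MiB
print(f"left over: {(esram - gbuffer) / 2**20:.2f} MiB")  # ~0.36 MiB
```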

ROP bound doesn't really mean anything on this hardware... probably you mean bandwidth bound. But there's no such thing as a single bound for a frame. If I had to guess, they're VGPR bound on their important shaders, just like everyone else this gen.

As for why they're at 720p: they find the image quality acceptable for their goals, and there's not enough consumer pressure to get them to change those goals.
 
It's not quite full duplex. Besides the fact that getting dual-issue is apparently dependent on some rather onerous banking considerations, it was indicated that the interface will not dual-issue a write alongside a read every 8th cycle.
Yep, that's it, that's what they explained back then. Hence the maximum ~145 GB/s number for operations that can read and write from the same location.

Probably not your typical scenario, though. For the other operations, the theoretical max bandwidth (the number analogous to the PS4's 176 GB/s) is 109 GB/s, not 102 GB/s; don't forget the upclock, guys (800 MHz -> 853 MHz). :yep2:
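Per direction, the peak simply scales with the clock:

```python
# Where 102 vs. 109 comes from: the 800 -> 853 MHz upclock.
bytes_per_cycle = 128  # per direction
for mhz in (800, 853):
    print(f"{mhz} MHz -> {mhz * 1e6 * bytes_per_cycle / 1e9:.1f} GB/s")
# 800 MHz -> 102.4 GB/s
# 853 MHz -> 109.2 GB/s
```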
 
They use a 16-byte-per-pixel G-buffer (http://www.frostbite.com/wp-content/uploads/2014/11/course_notes_moving_frostbite_to_pbr_v2.pdf), so it fits fine even at 1080p, albeit with very little room for anything else.

From page 15, they show 4 G-buffer MRTs (16 bytes), but since they'll also need a depth buffer, that puts the total at 20 bytes per pixel.
At any rate, that should be enough to fit in eSRAM at 900p, but since it would leave very little space for other things (shadows, lit buffers, etc.), it would likely impact performance negatively, which I'm guessing is the main reason not to go there.
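Running the 20-bytes-per-pixel figure across the candidate resolutions (a sketch that assumes nothing is tiled out to main memory):

```python
# G-buffer + depth footprint (20 bytes/pixel) vs. 32 MiB of eSRAM.
for w, h in ((1280, 720), (1600, 900), (1920, 1080)):
    mib = w * h * 20 / 2**20
    print(f"{w}x{h}: {mib:5.1f} MiB, headroom {32 - mib:+5.1f} MiB")
# 1280x720:  17.6 MiB, headroom +14.4 MiB
# 1600x900:  27.5 MiB, headroom  +4.5 MiB
# 1920x1080: 39.6 MiB, headroom  -7.6 MiB
```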
 
They use a 16-byte-per-pixel G-buffer (http://www.frostbite.com/wp-content/uploads/2014/11/course_notes_moving_frostbite_to_pbr_v2.pdf), so it fits fine even at 1080p, albeit with very little room for anything else.

ROP bound doesn't really mean anything on this hardware... probably you mean bandwidth bound. But there's no such thing as a single bound for a frame. If I had to guess, they're VGPR bound on their important shaders, just like everyone else this gen.

As for why they're at 720p: they find the image quality acceptable for their goals, and there's not enough consumer pressure to get them to change those goals.
Just curious, but does the compiler matter for shader performance on the consoles?

Found Sébastien's blog:
https://seblagarde.wordpress.com/tag/atan/

He talks about the lack of VGPRs available, in particular with certain math functions. Interesting, so the new lighting is basically killing the GPUs, since you're forced to use the inverse trig functions. This is what he got from the PlayStation compiler 2.0.
I'm not really sure what is considered bloat; any thoughts?
acos: 48 FR (40 FR, 2 QR), 2 DB, 12 VGPR
asin: 48 FR (40 FR, 2 QR), 2 DB, 1 scalar instruction, 12 VGPR
atan: 23 FR (19 FR, 1 QR), 2 scalar, 8 VGPR

– VGPR count is more important than instruction count
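To illustrate why (a rough sketch of GCN occupancy math; the 256-entry VGPR file per SIMD lane and the 10-wave cap are the standard GCN limits, while the 4-register allocation granularity and the 44-VGPR baseline shader are assumptions for illustration):

```python
# On GCN (the GPU family in both consoles), a shader's VGPR footprint
# caps how many waves a SIMD can keep in flight to hide memory latency.
def waves_per_simd(vgprs, granularity=4, file_size=256, max_waves=10):
    alloc = -(-vgprs // granularity) * granularity  # round up to granule
    return min(max_waves, file_size // alloc)

base = 44  # hypothetical shader footprint
print(waves_per_simd(base))       # 5 waves per SIMD
print(waves_per_simd(base + 12))  # 4 waves: one acos's 12 VGPRs cost a wave
```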
 
AFAIK, it's 102 GB/s full duplex, meaning you can write data to the eSRAM at 102 GB/s while reading data at 102 GB/s at the same time, but you can never read data at over 102 GB/s. This means latencies will be much better, but the raw bandwidth isn't as large as the PS4's GDDR5 implementation.

Furthermore, trying to correct an Anandtech article with an arsetechnica one won't go well 99 times out of 100.

Yes... you are correct. I admit talking about 204 GB/s is wrong because of that limit. Besides, it would never be 204 but 192 GB/s, because the eSRAM cannot read and write on all clock cycles.

eSRAM is a strange beast
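For what it's worth, both round numbers seem to fall out of the same write-bubble model, just at the two clock speeds (a sketch; 15 transfers per 8 cycles, i.e. a read every cycle plus a write on 7 of 8):

```python
bytes_per_cycle = 128  # per direction
for mhz in (800, 853):
    one_way = mhz * 1e6 * bytes_per_cycle / 1e9
    print(f"{mhz} MHz -> {one_way * 15 / 8:.1f} GB/s")
# 800 MHz -> 192.0 GB/s  (the original figure)
# 853 MHz -> 204.7 GB/s  (the post-upclock ~204)
```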
 
Yes... you are correct. I admit talking about 204 GB/s is wrong because of that limit. Besides, it would never be 204 but 192 GB/s, because the eSRAM cannot read and write on all clock cycles.

eSRAM is a strange beast

My recollection from some of the early debate here is that the XB1 should be capable of some very good particle effects which we might not see replicated on PS4; hopefully developers will showcase some of the unique advantages of the hardware in first-party exclusives soon.
 
My recollection from some of the early debate here is that the XB1 should be capable of some very good particle effects which we might not see replicated on PS4; hopefully developers will showcase some of the unique advantages of the hardware in first-party exclusives soon.

Can you be more specific? We are talking about eSRAM and bandwidth! Xbox One bandwidth is funny, since the large memory pool is slow and the very small one is fast. But not as fast as we may think. On average it's the same as the PS4's, but with limits on both reads and writes!

So how come you say that? Besides, the use of GPGPU should allow for less memory bandwidth usage, and the PS4 has more GPGPU capability!

With the exception of the limits imposed by raw power, I don't see any differences between the consoles, and I only see the Xbox's internal memory and bandwidth fragmentation as an additional problem!

Also, as many games have shown, the Xbox usually has additional performance problems with alpha effects!
 