PlayStation 4 (codename Orbis) technical hardware investigation (news and rumours)

Measured in ns rather than cycles, those Jaguar figures look even higher, though perhaps cycles is the more relevant measure. Maybe a power-saving, low-frequency memory controller simply has to make that tradeoff?
Both benchmarks have results in both ns and CPU cycles. Latency in cycles is what programmers care about.
And wow at those last-gen PPC figures. Even taking the clock speeds into account, those figures are high! And with no dynamic branch prediction or OoOE either! :)
And no data prefetcher. Do a manual prefetch 600 cycles before every data access or die :)
 
Assuming the PS5 would again use AMD x86 cores inside a next-gen APU, would the x86 version of AMD's 2016 K12 architecture be the likely choice?

K12 is said to support both ARM and x86.
 
Latency in cycles is what programmers care about.

And no data prefetcher. Do a manual prefetch 600 cycles before every data access or die :)
So it's actually possible to schedule your code several hundred cycles ahead...? How effective is that really? What kind of speedup can actually be attained?
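
For anyone curious what "prefetch hundreds of cycles ahead" looks like in practice, here's a minimal sketch using the GCC/Clang __builtin_prefetch builtin. The prefetch distance is a made-up tuning value; on a real in-order core with no hardware prefetcher you'd tune it against the measured memory latency.

Code:
/* Minimal software-prefetch sketch. __builtin_prefetch is a GCC/Clang
 * builtin; PF_DIST is a hypothetical tuning value, chosen so the prefetch
 * lands roughly the memory latency (hundreds of cycles on the CPUs
 * discussed above) before the data is actually used. */
#include <stddef.h>

#define PF_DIST 16 /* elements ahead -- illustrative, needs per-platform tuning */

float sum_with_prefetch(const float *data, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        if (i + PF_DIST < n)
            __builtin_prefetch(&data[i + PF_DIST], 0 /* read */, 0 /* low temporal locality */);
        sum += data[i]; /* ideally already sitting in cache by now */
    }
    return sum;
}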
 
Yeah, everybody tends to report peak numbers. I think the DF interviews with both the XB1 and PS4 designers say that they expect lower 'real world' numbers in use than the peak numbers used on the reveal slides.

Edit: Peak numbers tend to be for 'perfect' access patterns that are only seen occasionally while running a program doing useful work. For example, Pentium 4 hyper-threading worked really well for media encoding, with both hyper-threaded logical cores working giving almost a 100% boost over one 'normal' core. In other, more general workloads, memory access patterns and so on caused average performance to drop to 20-30% above one 'normal' core, or in the worst cases actually below it.
 
But the actual bandwidth is still 176GB/s, so why would a slide restrict it to only 140GB/s?
176GB/s is the theoretical peak if you have a single infinitely-long, behaviorally simple access pattern, and no funny business occurring.

140GB/s is probably more of a "what we're reasonably able to use" number.
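
For reference, that 176GB/s peak is just bus width times effective transfer rate; here's a quick sanity-check sketch, assuming the commonly quoted PS4 spec of a 256-bit GDDR5 bus at an effective 5500MT/s.

Code:
/* Sanity check on the 176GB/s figure: 256-bit GDDR5 bus at an effective
 * 5500MT/s (the commonly quoted PS4 spec). */
#include <stdio.h>

int main(void)
{
    const double bus_width_bytes  = 256.0 / 8.0; /* 32 bytes per transfer */
    const double transfer_rate_gt = 5.5;         /* 5500MT/s = 5.5 GT/s */
    printf("Theoretical peak: %.0f GB/s\n", bus_width_bytes * transfer_rate_gt); /* 176 */
    return 0;
}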
 
Edit: Peak numbers tend to be for 'perfect' access patterns that are only seen occasionally while running a program doing useful work.
Yeah, I get that. I guess I got used to CPUs/GPUs not reaching their maximum throughput but never considered memory to be part of the same club.

176GB/s is the theoretical peak if you have a single infinitely-long, behaviorally simple access pattern, and no funny business occurring.

140GB/s is probably more of a "what we're reasonably able to use" number.
Interesting. I guess that applies to desktop GPUs as well? Or is the structure of the PS4 prone to more "funny business", given that it has both the CPU and GPU contending over one memory pool?
 
PS4's Achilles' heel

Yes, about the bandwidth contention: wouldn't it be possible to coordinate simultaneous CPU and GPU accesses on the main memory bus in order to reduce the total contention and increase the real achieved bandwidth?

Like doing low-bandwidth CPU tasks at the same time as higher-bandwidth GPU tasks, and the opposite?

Don't hesitate to tell me my idea is stupid or infeasible!
 
Yes, about the bandwidth contention: wouldn't it be possible to coordinate simultaneous CPU and GPU accesses on the main memory bus in order to reduce the total contention and increase the real achieved bandwidth?

Like doing low-bandwidth CPU tasks at the same time as higher-bandwidth GPU tasks, and the opposite?

Don't hesitate to tell me my idea is stupid or infeasible!

Memory access patterns are a huge part of getting the most out of a console, but if I understand correctly you have to allow for interactivity causing bursts of memory access. If a computer has one task to do you can get much closer to peak, but a game is composed of many systems, each with its own memory access pattern. To allow for one or more of them suddenly needing a lot of I/O (say animation, if a dozen enemies spawn in), you design with room for that, so it doesn't suddenly slow the whole system down. 140GB/s isn't written in stone; it's Sony saying 'target near this to start with and see how your code works out'.
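
Purely as an illustration of the "offset the heavy phases" idea from the posts above, here's a rough sketch. Every function in it is a hypothetical stub, not a real console API; the point is only the shape of the schedule.

Code:
/* Illustrative only: offset the CPU's bandwidth-heavy work against the GPU's
 * bandwidth-heavy passes so both aren't hammering the shared GDDR5 bus at the
 * same moment. Every function here is a hypothetical stub, not a real API. */
#include <stdio.h>

static void gpu_kick_bandwidth_heavy_passes(void) { puts("GPU: g-buffer fill / post FX (bus-heavy)"); }
static void cpu_run_logic_and_scripting(void)     { puts("CPU: game logic (mostly cache-resident)"); }
static void gpu_run_alu_bound_passes(void)        { puts("GPU: lighting math (ALU-bound, bus-light)"); }
static void cpu_stream_and_decompress(void)       { puts("CPU: asset streaming / decompression (bus-heavy)"); }

int main(void)
{
    /* Phase 1: GPU saturates the bus, so the CPU sticks to low-bandwidth work. */
    gpu_kick_bandwidth_heavy_passes();
    cpu_run_logic_and_scripting();

    /* Phase 2: GPU is ALU-bound, so the CPU takes its bandwidth-heavy turn. */
    gpu_run_alu_bound_passes();
    cpu_stream_and_decompress();

    return 0;
}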
 
Not sure if it'll impact the PS4 all that much. Later in the generation, when things are really starting to max out, if this loss in bandwidth really becomes a problem I see developers looking to offload more calculations to the GPU. There are 8 ACEs sitting around to be used; I'm sure there are some opportunities there.

The only thing I might see being slightly annoying for developers is when they're cutting it really tight: not having a guaranteed figure / not knowing the mix of bandwidth could cause some 'sighs'.
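
As a rough, hypothetical sketch of what "offload a calculation to the spare compute queues" could mean: a trivially parallel per-frame job like the particle update below is a natural candidate for an async compute dispatch. The queue-submission call mentioned in the comment is a placeholder, not a real API.

Code:
/* CPU version of a trivially parallel per-frame job. The hypothetical
 * offload path is only sketched in the comment in main(); there is no
 * public PS4 API here. */
#include <stdio.h>

typedef struct { float x, y, z; } vec3;

static void update_particles_cpu(vec3 *pos, const vec3 *vel, int count, float dt)
{
    for (int i = 0; i < count; i++) {
        pos[i].x += vel[i].x * dt;
        pos[i].y += vel[i].y * dt;
        pos[i].z += vel[i].z * dt;
    }
}

int main(void)
{
    vec3 pos[4] = {{0, 0, 0}};
    vec3 vel[4] = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}, {1, 1, 1}};

    /* Today: burn Jaguar cycles (and memory bandwidth) on the CPU side. */
    update_particles_cpu(pos, vel, 4, 1.0f / 60.0f);

    /* Hypothetically: express the same loop as a compute shader and hand it
     * to one of the idle ACE-fed queues, e.g.
     *   submit_async_compute_job(particle_update_cs, num_particles);
     * (placeholder call, not a real API) */
    printf("pos[3] = (%f, %f, %f)\n", pos[3].x, pos[3].y, pos[3].z);
    return 0;
}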
 
Not sure if it'll impact the PS4 all that much. Later in the generation, when things are really starting to max out, if this loss in bandwidth really becomes a problem I see developers looking to offload more calculations to the GPU. There are 8 ACEs sitting around to be used; I'm sure there are some opportunities there.

The only thing I might see being slightly annoying for developers is when they're cutting it really tight: not having a guaranteed figure / not knowing the mix of bandwidth could cause some 'sighs'.
There's no loss in bandwidth. As with FLOP ratings, developers understand you never get 100% efficient use of memory bandwidth. It's always been this way.
 
There's no loss in bandwidth. As with FLOP ratings, developers understand you never get 100% efficient use of memory bandwidth. It's always been this way.


I mean the non-linear CPU/GPU bandwidth contention loss as per the slide; if the CPU were contending too much, I would imagine offloading work to the GPU wouldn't cause as much contention. But yes, I agree about the ratings.
 
Not sure if it'll impact the PS4 all that much. Later in the generation, when things are really starting to max out, if this loss in bandwidth really becomes a problem I see developers looking to offload more calculations to the GPU. There are 8 ACEs sitting around to be used; I'm sure there are some opportunities there.

The only thing I might see being slightly annoying for developers is when they're cutting it really tight: not having a guaranteed figure / not knowing the mix of bandwidth could cause some 'sighs'.

There is no loss in bandwidth, it's still 176GB/s; the graph is just pointing out that the more bandwidth the CPU is using, the less bandwidth the GPU will have, because it's shared memory.
 
There is no loss in bandwidth, it's still 176GB/s; the graph is just pointing out that the more bandwidth the CPU is using, the less bandwidth the GPU will have, because it's shared memory.

No, it's showing that the combined total bandwidth of the CPU+GPU decreases disproportionately as you use more CPU bandwidth.

Actually it says exactly that right on the slide, heh.

Repost of the slide, since I just saw it posted on GAF anyway:

[Image: PS4-GPU-Bandwidth-140-not-176.png]
 