Asynchronous Compute seems to be the biggest customization Sony made to the PS4 GPU hardware, yet I haven't seen much talk about it. What do you think will be the biggest benefits of having an asynchronous compute architecture in a console?
I don't believe that is a customization. GCN wasn't introduced with Asynchronous Compute Engines years before the PS4 for no reason.
Sony did drive certain optimizations for job dispatch and cache behavior when using coherent memory. None of the disclosures fundamentally change the nature of GPU compute, although they do seem to be targeted at reducing queueing delay at the front end and very serious overheads related to cache behavior at the other end.
The Jaguar cores are not computational monsters, and they can't utilize all that memory bandwidth.
The advantage for loads that fit the CUs well, and possibly even some that don't (if the CPU section is already overburdened), is that a significant amount of peak computational ability is available with hopefully modest impact in a design that offers little other alternative.
The other GCN GPUs only have 2 ACEs for doing 4 compute jobs at a time; the PS4 GPU has been customized to have 8 ACEs for doing 64 compute jobs at a time.
Asynchronous Compute isn't defined by queue count.
The question is "is there compute, and can it run out of lockstep with the CPU?"
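To make "out of lockstep" concrete, here is a minimal sketch using CUDA streams purely as a familiar stand-in for GCN's ACE queues. This is illustrative code only, not PS4 or AMD API code, and the kernel name busywork is made up:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in compute job; the content doesn't matter, only that it runs on the GPU.
__global__ void busywork(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 1.0001f + 0.5f;
}

int main() {
    const int n = 1 << 20;
    float *d;
    cudaMalloc(&d, n * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);        // an independent hardware work queue

    // Enqueue the job and return immediately: the GPU runs it out of
    // lockstep with the CPU, which is free to do other work in the meantime.
    busywork<<<(n + 255) / 256, 256, 0, stream>>>(d, n);
    printf("CPU continues while the GPU crunches...\n");

    cudaStreamSynchronize(stream);    // only block when the result is needed
    cudaStreamDestroy(stream);
    cudaFree(d);
    return 0;
}
```

The point of more queues (ACEs) is simply that more such independent jobs can be in flight at once, letting compute fill in around graphics work.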
Sony has advanced the concept by making the process of using it more streamlined, and if the APIs and tool chains are robust, it is potentially the other party besides Microsoft bringing the innovation of a software platform that can actually use the hardware to an AMD device.
I think it's just the way you worded the title and the first post make it seem like asynchronous compute is unique to PS4, which it is not. I think you're both in agreement about what the customization is.
Thanks 3dilettante for your, as ever, patient and insightful responses.
Could you give us a view on how desktop CPUs might hold up in the types of GPGPU tasks the PS4 will be running? You specifically mentioned culling as one benefit, which as you say Cell was particularly quick at. Do you think desktop CPUs have caught up in that regard yet, or is the only response to GPGPU at present more GPGPU?
The rough recommendation of 4 CUs for compute would give Orbis 410 GFLOPs from the GPU and 102.4 GFLOPs from the Jaguar cores.
A Sandy Bridge K processor could put out about half that total, all on the CPU.
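For reference, the arithmetic behind those figures, assuming the commonly reported 800 MHz GPU clock, 1.6 GHz Jaguar clock (128-bit FPU, 8 single-precision FLOPs per core per cycle), and a ~3.4 GHz quad-core Sandy Bridge with AVX (16 single-precision FLOPs per core per cycle):

```latex
\begin{aligned}
\text{4 CUs:} \quad & 4 \times 64\ \text{lanes} \times 2\ \text{FLOPs (FMA)} \times 0.8\ \text{GHz} = 409.6\ \text{GFLOPs}\\
\text{8 Jaguar cores:} \quad & 8 \times 8\ \text{FLOPs/cycle} \times 1.6\ \text{GHz} = 102.4\ \text{GFLOPs}\\
\text{4 Sandy Bridge cores:} \quad & 4 \times 16\ \text{FLOPs/cycle} \times 3.4\ \text{GHz} = 217.6\ \text{GFLOPs} \approx \tfrac{1}{2} \times 512
\end{aligned}
```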
If we assume this is a gaming rig, there's a GPU that I'm not going to include--although a huge chunk of what is GPGPU for one is going to be doable enough on the other.
For things that do very well on the GPU, Orbis could in theory do very well. The high peak FLOP count tends to be severely underutilized outside of the GPU-preferred subset, so I'd want evidence that Sony's tweaks have actually done enough to make GPU compute that much better than current APUs for things that aren't already a GPU strong point.
Orbis has to fall back to the Jaguar cores for single-threaded or complex workloads, which a modern desktop quad core from Intel can curb stomp easily, possibly with performance to spare to beat the GPU in areas where GPUs typically face-plant.
Since a gaming rig is very likely to have a discrete card, it's a lot of brute force to overcome no matter how elegant Sony's solution turns out to be.
At this point, most things Cell was good at have been brute-forced by the evolution of desktop cores, especially if you count the very latest Intel chips.
The SPE work was much more important for the PS3 because RSX needed the extra help.
Modern GPUs are simply massively more powerful and capable of doing more on their own.
There may be additional customizations that have enhanced this for the Orbis GPU, but a big chunk of the gains from using compute for graphics work is something inherent to having a modern GPU.
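Since culling came up earlier as a canonical SPE job, here is a hypothetical minimal sketch of how that kind of work looks as a GPU compute job today: one bounding sphere per thread, tested against six frustum planes. This is illustrative CUDA, not console or engine code, and the names (cullSpheres, the buffer layout) are made up:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sphere-vs-frustum culling. spheres[i]: xyz = center, w = radius.
// planes: 6 frustum planes as (nx, ny, nz, d), normals pointing inward.
__global__ void cullSpheres(const float4 *spheres, const float4 *planes,
                            int *visible, int count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= count) return;
    float4 s = spheres[i];
    int inside = 1;
    for (int p = 0; p < 6; ++p) {
        float4 pl = planes[p];
        float dist = pl.x * s.x + pl.y * s.y + pl.z * s.z + pl.w;
        if (dist < -s.w) { inside = 0; break; }  // fully behind this plane
    }
    visible[i] = inside;
}

int main() {
    const int count = 1024;
    float4 *dSpheres, *dPlanes;
    int *dVisible;
    cudaMalloc(&dSpheres, count * sizeof(float4));
    cudaMalloc(&dPlanes, 6 * sizeof(float4));
    cudaMalloc(&dVisible, count * sizeof(int));
    // In a real engine the sphere and plane buffers would be filled each frame;
    // here they are left uninitialized since only the structure matters.
    cullSpheres<<<(count + 255) / 256, 256>>>(dSpheres, dPlanes, dVisible, count);
    cudaDeviceSynchronize();
    cudaFree(dSpheres); cudaFree(dPlanes); cudaFree(dVisible);
    return 0;
}
```

This is embarrassingly parallel and branch-light, which is exactly why it mapped well to SPEs then and maps well to CUs now.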
The case where GPGPU can be taken more seriously for non-graphics work is the case that AMD, Sony, and Microsoft need to make.
Falling back on a good CPU (and for a gaming rig, better silicon and several hundred extra Watts of power) has been the safe bet for years.