Asynchronous Compute: what are the benefits?

Discussion in 'Console Technology' started by onQ, Sep 19, 2013.

  1. onQ

    onQ
    Veteran

    Joined:
    Mar 4, 2010
    Messages:
    1,540
    Likes Received:
    56
    Asynchronous Compute seems to be the biggest customization Sony made to the PS4 GPU hardware, yet I haven't seen much talk about it.

    What do you think will be the biggest benefits of having an asynchronous compute architecture in a console?
     
  2. gurgi

    Regular

    Joined:
    Jul 7, 2003
    Messages:
    605
    Likes Received:
    1
    It's one of the things I'm most excited about. I'm really curious about Knack, and how much of the particle physics is utilized in gameplay. Right now it seems Knack's pieces collide with enemies when you do a special move, which is cool, but I wonder what else they have up their sleeve. Havok uses GPGPU for particles as well.


    I can't wait to see what other algorithms devs think of this generation. It's way more interesting to me than the prettier graphics I've been chasing since my Voodoo Banshee days lol.
     
  3. taisui

    Regular

    Joined:
    Aug 29, 2013
    Messages:
    674
    Likes Received:
    0
    They upped the ACEs from 2 to 8, which should help with compute task utilization of the CUs.
     
  4. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,797
    Location:
    Well within 3d
    I don't believe that is a customization. GCN wasn't introduced with Asynchronous Compute Engines years before the PS4 for no reason.
    Sony did drive certain optimizations for job dispatch and cache behavior when using coherent memory. None of the disclosures fundamentally change the nature of GPU compute, although they do seem to be targeted at reducing queueing delay at the front end and very serious overheads related to cache behavior at the other.


    The Jaguar cores are not computational monsters, and they can't utilize all that memory bandwidth.
    The advantage for loads that fit the CUs well, and possibly even some that don't (if the CPU section is already overburdened), is that a significant amount of peak computational ability is available with hopefully modest impact in a design that offers little other alternative.
     
  5. onQ

    onQ
    Veteran

    Joined:
    Mar 4, 2010
    Messages:
    1,540
    Likes Received:
    56
    The other GCN GPUs only have 2 ACEs for doing 4 compute jobs at a time. The PS4 GPU has been customized to have 8 ACEs for doing 64 compute jobs at a time.
     
  6. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,797
    Location:
    Well within 3d
    Asynchronous Compute isn't defined by queue count.
    The question is "is there compute, and can it run out of lockstep with the CPU?"

    Sony has advanced the concept by streamlining the process of using it. If the APIs and tool chains are robust, Sony could potentially become the other party, besides Microsoft, to bring an AMD device the innovation of a software platform that can actually use the hardware.
     
  7. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    14,864
    Likes Received:
    7,143
    I think it's just that the way you worded the title and the first post makes it seem like asynchronous compute is unique to PS4, which it is not. I think you're both in agreement about what the customization is.
     
  8. Esrever

    Regular Newcomer

    Joined:
    Feb 6, 2013
    Messages:
    833
    Likes Received:
    627
    Is there a good use for 8 ACEs?
     
  9. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,797
    Location:
    Well within 3d
    It allows more compute jobs to be available at once for the compute front end to pick from.
    This means fewer cases where commands are backed up behind queue entries they aren't dependent on that just happen to be in the same queue.

    I suppose that, at 64 queues, Sony really hopes there will be far more jobs running concurrently than there are at present.
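A minimal sketch of the "backed up behind queue entries they aren't dependent on" point, under simplifying assumptions that are not a description of the actual ACE hardware: each queue is strictly in-order, jobs are spread round-robin across queues, and a job can be issued as soon as it reaches the head of its queue and its own `ready_at` tick has passed. The function names and the job list are hypothetical.

```python
from collections import deque


def issue_ticks(jobs, num_queues):
    """Return {job name: tick at which it was issued}.

    jobs is a list of (name, ready_at) tuples in submission order.
    ASSUMPTION: each queue is strictly FIFO, so only the head entry
    can be issued, and only once its ready_at tick has been reached.
    """
    queues = [deque() for _ in range(num_queues)]
    for i, job in enumerate(jobs):
        queues[i % num_queues].append(job)  # round-robin placement

    issued = {}
    tick = 0
    while any(queues):
        for q in queues:
            # Drain every ready entry from the head; stop at the first
            # entry that is not ready, even if later ones are.
            while q and q[0][1] <= tick:
                name, _ = q.popleft()
                issued[name] = tick
        tick += 1
    return issued


def total_wait(jobs, num_queues):
    """Sum of ticks each job spent waiting after it was already ready."""
    issued = issue_ticks(jobs, num_queues)
    return sum(issued[name] - ready_at for name, ready_at in jobs)


# Hypothetical workload: "A" is stalled until tick 5, the rest are ready
# immediately and do not depend on "A" at all.
JOBS = [("A", 5), ("B", 0), ("C", 0), ("D", 0)]

for n in (1, 2, 4):
    print(n, "queue(s): total wait =", total_wait(JOBS, n))
```

With a single queue, B, C, and D all sit behind A even though they are unrelated to it. Spreading the same jobs over more queues reduces how many of them share a queue with the stalled entry.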
     
  10. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    18,525
    Likes Received:
    2,263
    Location:
    Maastricht, The Netherlands
    Didn't they also mention a far more fine-grained priority system, as one of the customisations?
     
  11. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,797
    Location:
    Well within 3d
    It's been so long, and Sony's messaging has been relatively content-free lately, that I can't remember.
    Of the big three customizations Cerny mentioned, the compute one did mention prioritization and arbitration in hardware.
    However, it wasn't clear if the prioritization scheme was something Sony asked for, or if it's something Sony's front-end customization simply relies on or exposes.
     
  12. Esrever

    Regular Newcomer

    Joined:
    Feb 6, 2013
    Messages:
    833
    Likes Received:
    627
    Didn't Bonaire already have upgraded ACEs compared to GCN?

    Would the PS4 GPU be based on Sea Islands?
     
  13. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,797
    Location:
    Well within 3d
    The indications are the hardware has the capability, but no analysis of the released Bonaire cards mentions it being exposed.

    That might be one of the announcements coming up.
     
  14. onQ

    onQ
    Veteran

    Joined:
    Mar 4, 2010
    Messages:
    1,540
    Likes Received:
    56
    Basically what I'm asking is what are some of the benefits of being able to run lots of smaller compute jobs on a console?


    I think it should be good for things like AI and animation, since the work could be broken down into more jobs while still having the parallel processing power of GPGPU.
     
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,797
    Location:
    Well within 3d
    The benefit is that there are more compute resources, which is ideally a pretty generic thing that means the chip can do more stuff.

    One of the more specific examples Sony has given is actually using compute to provide better culling ahead of the graphics pipeline in a manner similar to how the SPEs were sometimes used in the PS3.
    A fair amount of the GPU compute capability is there so that the platform doesn't regress massively relative to Cell.

    If a workload doesn't need tight synchronization with the CPU, has very high data parallelism, has low complexity, has good arithmetic density, has a coarse granularity that prevents divergence from ruining SIMD efficiency, and doesn't rely too heavily on straight-line speed, it's a good candidate for the GPU.
    If it's complex, relies on straight-line speed, doesn't thrash the cache, and fits narrow SIMD better, hopefully Jaguar isn't too embarrassing.
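A rough sketch of why divergence hurts SIMD efficiency, assuming a simple model in which a 64-wide wavefront runs every branch path that any of its lanes takes, one path after another, with the non-participating lanes masked off. The function name and the cycle costs are made up for illustration.

```python
def simd_efficiency(branch_of_lane, cost_of_branch):
    """Fraction of lane-cycles doing useful work in one wavefront.

    branch_of_lane: which branch each lane takes (one entry per lane).
    cost_of_branch: cycles needed to execute each branch.
    ASSUMPTION: every branch taken by at least one lane is executed
    serially for the whole wavefront, with idle lanes masked off.
    """
    width = len(branch_of_lane)
    taken = set(branch_of_lane)
    # The wavefront pays for every distinct path once.
    total_cycles = sum(cost_of_branch[b] for b in taken)
    # Each lane only does useful work during its own path.
    useful_lane_cycles = sum(cost_of_branch[b] for b in branch_of_lane)
    return useful_lane_cycles / (width * total_cycles)


COSTS = {"a": 10, "b": 10}  # hypothetical cycle counts

# Coarse granularity: all 64 lanes in the wavefront take the same path.
uniform = simd_efficiency(["a"] * 64, COSTS)

# Fine-grained divergence: neighbouring lanes alternate between paths.
divergent = simd_efficiency(["a", "b"] * 32, COSTS)

print(uniform, divergent)
```

In this model, grouping work so that each wavefront follows a single path keeps all lanes busy, while alternating paths within a wavefront leaves half of the lane-cycles idle.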

    If it requires high straight-line FP speed and fits an in-order pipeline with a rather exotic local store, you better hope it's some kind of encoding or decoding thing that can be offloaded, because that's something Cell is good at.


    As for why there's not too much discussion on it, it's because aside from "more graphics", Sony hasn't really given a strong indication on how well its GPU compute scheme will work, or what it will wind up doing besides graphics.

    There's a hope that someday people might get around to implementing audio wave tracing or physics (probably fluid or non-rigid body physics). It's not fully described, the full range of tools Sony hopes to have someday for this doesn't exist, and most devs for the first wave of games are using their GPUs for graphics.
     
    #15 3dilettante, Sep 19, 2013
    Last edited by a moderator: Sep 19, 2013
  16. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,928
    Likes Received:
    3,672
    Location:
    Guess...
    Thanks 3dilettante for your, as ever, patient and insightful responses.

    Could you give us a view on how desktop CPUs might hold up in the types of GPGPU tasks the PS4 will be running? You specifically mentioned culling as one benefit, which as you say Cell was particularly quick at. Do you think desktop CPUs have caught up in that regard yet, or is the only response to GPGPU at present more GPGPU?
     
  17. BoardBonobo

    BoardBonobo My hat is white(ish)!
    Veteran

    Joined:
    May 30, 2002
    Messages:
    3,559
    Likes Received:
    492
    Location:
    SurfMonkey's Cluster...
    Proper fluid dynamics are something I'd love to see done. No games have really got that right yet. I remember a screen-saver that came with the Radeon 8500 that actually started to make me feel sea sick after a while!
     
  18. taisui

    Regular

    Joined:
    Aug 29, 2013
    Messages:
    674
    Likes Received:
    0
    FWIW, GPGPU is a concept, it's not referring to a type of hardware.

    On both the console and the PC, the software is offloading floating-point-heavy computations from the CPU to the GPU, because the GPU is faster at these types of operations.
     
  19. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,797
    Location:
    Well within 3d
    The rough recommendation of 4 CUs for compute would give Orbis 410 GFLOPs from the GPU and 102.4 GFLOPs from the Jaguar cores.
    A Sandy Bridge K processor could put out about half that total, all on the CPU.
    If we assume this is a gaming rig, there's a GPU that I'm not going to include--although a huge chunk of what is GPGPU for one is going to be doable enough on the other.
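A back-of-the-envelope sketch of where figures like these could come from. The lane counts, FLOPs per cycle, and clock speeds below are assumptions chosen to reproduce the quoted numbers, not values stated in this post, and the results are theoretical single-precision peaks only.

```python
def peak_gflops(units, flops_per_cycle_per_unit, clock_ghz):
    """Theoretical peak in GFLOPS: units x FLOPs per cycle per unit x GHz."""
    return units * flops_per_cycle_per_unit * clock_ghz


# ASSUMPTION: 64 lanes per CU, one fused multiply-add (2 FLOPs) per lane
# per cycle, GPU clocked at 0.8 GHz; 4 CUs set aside for compute.
gpu_compute = peak_gflops(4, 64 * 2, 0.8)

# ASSUMPTION: 8 Jaguar cores at 1.6 GHz, 8 single-precision FLOPs per cycle each.
jaguar = peak_gflops(8, 8, 1.6)

# ASSUMPTION: quad-core Sandy Bridge at 3.4 GHz, 16 single-precision
# FLOPs per cycle per core using AVX.
sandy_bridge = peak_gflops(4, 16, 3.4)

orbis_total = gpu_compute + jaguar

print(f"4 CUs:        {gpu_compute:.1f} GFLOPS")
print(f"Jaguar:       {jaguar:.1f} GFLOPS")
print(f"Orbis total:  {orbis_total:.1f} GFLOPS")
print(f"Sandy Bridge: {sandy_bridge:.1f} GFLOPS "
      f"({sandy_bridge / orbis_total:.2f} of the Orbis total)")
```

Under these assumptions the Sandy Bridge peak comes out somewhat below half of the combined Orbis figure, which is in the same ballpark as the "about half" estimate.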

    For things that do very well on the GPU, Orbis could in theory do very well. The high peak FLOPs tend to be severely underutilized outside of the GPU-preferred subset, so I'd want evidence that Sony's tweaks have actually done enough to make GPU compute that much better than current APUs for things that aren't already a GPU strong point.

    Orbis has to fall back to the Jaguar cores for single-threaded or complex workloads, which a modern desktop quad core from Intel can curb stomp easily, possibly with performance to spare to beat the GPU in areas where GPUs typically face-plant.
    Since a gaming rig is very likely to have a discrete card, it's a lot of brute force to overcome no matter how elegant Sony's solution turns out to be.


    At this point, most things Cell was good at have been brute-forced by the evolution of desktop cores, especially if you count the very latest Intel chips.
    The SPE work was much more important for the PS3 because RSX needed the extra help.
    Modern GPUs are simply massively more powerful and capable of doing more on their own.

    There may be additional customizations that have enhanced this for the Orbis GPU, but a big chunk of the gains from using compute for graphics work is something inherent to having a modern GPU.
    The case where GPGPU can be taken more seriously for non-graphics work is the case that AMD, Sony, and Microsoft need to make.
    Falling back on a good CPU (and for a gaming rig, better silicon and several hundred extra Watts of power) has been the safe bet for years.
     
  20. Aeoniss

    Regular

    Joined:
    Mar 23, 2007
    Messages:
    557
    Likes Received:
    0
    Location:
    Nebraska
    Paints a rather grim picture for next gen performance..
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.