Can AMD GPUs implement 'hardware' fixed function pipelines through firmware?

Discussion in 'Architecture and Products' started by onQ, Oct 18, 2013.

  1. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    40,706
    Likes Received:
    11,155
    Location:
    Under my bridge
    In the case of audio, compute can only do that job with a low latency interface to the shaders. The HWS enables that, opening up functionality that otherwise was practically impossible, I think.

    Additionally, searching for whatever onQ meant by 'DPU' turned up this, where the posit is that future GPUs will contain discrete DPU silicon. This is the very opposite: using compute on the existing GPU and just allowing lower-latency, more reliable access to compute resources.
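
    To put a number on "low latency" for audio: a compute job servicing an audio stream has to finish each block before the output buffer drains, and the budget is just block size over sample rate. The figures below are illustrative, not from any console's spec; the point is that the budget is a handful of milliseconds, so dispatch jitter from ordinary GPU queue scheduling can eat most of it, which is what hardware-scheduled priority queues address.

```c
#include <assert.h>

/* Back-of-envelope deadline for processing one audio block on compute.
 * block_samples / sample_rate_hz seconds, returned in microseconds.
 * Illustrative sketch only; real audio stacks pipeline several blocks. */
static long block_budget_us(long block_samples, long sample_rate_hz)
{
    return block_samples * 1000000L / sample_rate_hz;
}
```

    At 48 kHz, a 256-sample block leaves roughly 5.3 ms end to end; a scheduler that can delay a dispatch by a few milliseconds makes that deadline unreliable without prioritized queues.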
     
    Heinrich04 likes this.
  2. mrcorbo

    mrcorbo Foo Fighter
    Veteran

    Joined:
    Dec 8, 2004
    Messages:
    3,578
    Likes Received:
    1,986
    This change wasn't made because the GPU was as efficient at these operations as the custom hardware, though. It was made because the efficiency gained by using custom silicon which was only useful for these tasks (and that just sat idle when it didn't have audio processing to do) wasn't worth the cost of licensing the IP and the die area it consumed once it was made possible to use the more general-purpose GPU cores to serve the same purpose.

    Also, "creating fixed-function pipelines" is not an accurate way to describe this. It's this characterization, more than anything else, that *everyone* has a problem with in this case. And, more generally, it's your unwillingness to ever acknowledge and rectify your own errors of understanding or explanation that causes all of the grief you get from other posters and the moderators. When you get this same type of reaction in multiple threads across multiple forums, a "sane" reaction is to realize you're probably doing something wrong somewhere. You not coming to this realization after all this time is what makes you look "crazy", not your ideas.
     
    milk, AlBran, iroboto and 1 other person like this.
  3. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,719
    Likes Received:
    5,815
    Location:
    ಠ_ಠ
    RIP Cell & Larrabee.
     
  4. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,435
    Likes Received:
    263
    The slide you posted has nothing to do with Sony. If Sony's next console has a DSP, does that disprove your theory? Over time it's likely that things like TrueAudio get consumed by more general processors if they aren't used much and the business model doesn't support the silicon cost. A console is more likely to keep a dedicated processor because it can ensure it gets used often. For PS4 to switch to using compute for audio, developers must come up with a technique that's not possible with the DSP or that works better using compute.

    The same applies to graphics. Most developers are using the compute capabilities to augment the fixed-function graphics pipeline, not to replace it. Dreams is the exception we've heard about. In the case of Dreams, it doesn't mean compute is better than the fixed-function pipe; it just enables them to achieve their artistic vision. Of course, this is the reason for compute pipes: to allow developers to do things fixed-function hardware wasn't designed to do.
     
    Grall, RootKit, AlBran and 1 other person like this.
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    It might also be on a continuum of fixed-function to general-purpose. They are programmable, but I am not sure if they are architecturally permitted to access areas that are not themselves dedicated to scheduling and queuing. They would be lacking a fair amount of the resources and data types not needed for a small micro-controller, nor is it clear how they implement memory accesses and virtual memory support (they play a role managing virtual memory for the rest of the GPU, but do so in a way that might put them to the side of the process).
    Possibly, as a low-level detail, some of the other fixed-function blocks have varying levels of control loops implemented. I am speculating at this point, but how complex this is may be related to why GCN does not clock as high and burns so much more power than other architectures.

    The complexity of what they are doing is another factor. AMD indicated it is able to backport a significant fraction of what they are doing for priority queues back several generations, which indicates that these features have a major software development component that took a long time to get right.
    Whether the PS4 would get the same treatment is at the moment unclear. There are physical factors not formally exposed that can affect which products can be updated, such as the maximum size of the microcode store. However, the PS4 has a full complement of 8 ACEs, which may allow for a half-retrofit where one half of the compute front ends takes a microcode patch streamlined for the new mode, while the other half takes the base functionality. I am still trying to track down the specific context where I saw that mentioned; it takes some of the lower-end GCN implementations out of the running, since with one engine they couldn't handle both the new features and the base functionality. More recent versions of the front ends have a larger store.
     
  6. onQ

    onQ
    Veteran

    Joined:
    Mar 4, 2010
    Messages:
    1,540
    Likes Received:
    55

    In the same thread I explained that a GPU is also a DPU.

     
  7. BRiT

    BRiT (╯°□°)╯
    Moderator Legend Alpha Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    12,498
    Likes Received:
    8,701
    Location:
    Cleveland
    Stop refusing to use industry-standard terms, making up your own, and then changing what they mean.
     
    Malo, Grall, Silent_Buddha and 2 others like this.
  8. onQ

    onQ
    Veteran

    Joined:
    Mar 4, 2010
    Messages:
    1,540
    Likes Received:
    55
    The thread was started with me asking about the PS4, & for it to happen on the PS4 it would be Sony's doing, even if it's AMD, Sony & the dev community that's coming up with the code that works well with the pipeline.
     
  9. mrcorbo

    mrcorbo Foo Fighter
    Veteran

    Joined:
    Dec 8, 2004
    Messages:
    3,578
    Likes Received:
    1,986
    The mental gymnastics here are actually kind of awe-inspiring. To be so committed to the idea that, "I can not be wrong. Ever." that you literally come up with new definitions for the words you used in prior statements in order to make the statements correct is on a whole other level.
     
    pMax, Lightman and milk like this.
  10. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    As a followup to my earlier post, I found the mention of the microcode store size limitation, in the context of porting the full front-end functionality for HSA back to older GCN versions.
    https://www.phoronix.com/forums/for...nn-rock-rocr-hcc-on-linux?p=849406#post849406

    Support for both HWS and the Architected Queuing Language (for HSA) could not be hosted on the same microcode engine and still allow that engine to support the standard command types.
    Kaveri was able to support AQL and HWS while still being able to support the command format used by graphics by dividing the newer functionality between its two microcode engines. The discrete GPUs do not have that workaround.
    The similarity Orbis has with Kaveri in ACE and queue count might mean Sony could bring this in with a similar split, although if AQL can be skipped then perhaps a split isn't needed.

    The Volcanic Islands architectures (Tonga and Fiji) do not require playing microcode Tetris to update the functionality, and they can context-switch long-running compute wavefronts. Whether that can be brought back is unclear; AMD's patents usually involve some kind of extra hardware to help with this. Polaris would apparently draw from VI.
     
  11. Grall

    Grall Invisible Member
    Legend

    Joined:
    Apr 14, 2002
    Messages:
    10,801
    Likes Received:
    2,172
    Location:
    La-la land
    For it to happen on the PS4 would require changing A: the nomenclature of what constitutes a "fixed-function pipeline", as you don't actually make these up out of programmable logic, and B: reality itself, as a fixed-function pipeline is hardwired at the design and manufacturing stage and has ALWAYS BEEN THAT WAY. (Barring FPGAs, which are beyond the scope of this discussion.)

    Are you aware that you're talking to several actual games developers in this thread, hm? Your bizarre, self-invented nomenclature and explanations would be like me, a layperson, telling an actual rocket engineer how a rocket engine works.

    In other words: fucking ludicrous.
     
  12. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    582
    Likes Received:
    285
    What the hell is a DPU supposed to be when we're speaking about computers?

    dementia praecox unit?

    joking...


    Seriously, cannot find any valid source.
     
  13. mrcorbo

    mrcorbo Foo Fighter
    Veteran

    Joined:
    Dec 8, 2004
    Messages:
    3,578
    Likes Received:
    1,986
    Dataplane Processing Unit. It's a marketing term Cadence came up with for their more advanced DSPs.
     
  14. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    582
    Likes Received:
    285
    So it's not a term but a brand. There is nothing more to see here.
     
  15. mrcorbo

    mrcorbo Foo Fighter
    Veteran

    Joined:
    Dec 8, 2004
    Messages:
    3,578
    Likes Received:
    1,986
    To be fair, GPU was also just a marketing term from Nvidia until the industry adopted it.
     
  16. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    DPU in this context is a combination of a set of processors and base hardware IP geared towards customization, and the service and toolsets for customizing, implementing, and building the software for them.
    To make the term more generic in today's SoC-heavy world would be to make it redundant with anyone that builds a GPU or their own CPU in a chip with any level of integration. The specific architecture and services the DPU offering provides are what distinguishes it, and that seems like more of a commercial rather than architectural distinction.
     
    Heinrich04 likes this.
  17. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    582
    Likes Received:
    285
    Yes, and AMD called their cards "VPU" (V=Visual). At least GPU is better than "SIMD array optimized for graphics tasks".
    To be honest, I am really annoyed by the fashion of naming every single shade of everything with its own proper name... All this is becoming worse than watching biologists try to impose order on that Darwinian orgy called the "kingdom of Protista" (or whatever the classification is named this week).
     
    #77 Alessio1989, Jul 8, 2016
    Last edited: Jul 8, 2016
  18. milk

    Veteran Regular

    Joined:
    Jun 6, 2012
    Messages:
    2,986
    Likes Received:
    2,558
    I know that feeling. It's called Exaggerated Specific Naming Fatigue, or ESNF for short.
     
    Kej, 3dcgi, Otto Dafe and 1 other person like this.
  19. BRiT

    BRiT (╯°□°)╯
    Moderator Legend Alpha Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    12,498
    Likes Received:
    8,701
    Location:
    Cleveland
    And sometimes it's tied into NIHS (not invented here syndrome).
     
    Heinrich04, Kej, milk and 2 others like this.
  20. bridgman

    Newcomer Subscriber

    Joined:
    Dec 1, 2007
    Messages:
    58
    Likes Received:
    102
    Location:
    Toronto-ish
    I don't believe we can practically bring context switching back to CI - as you say there is some specialized hardware involved. It's probably not impossible to come up with a set of compiler/toolchain hacks that would insert code into loops to check for a pre-emption request then run a combination of shader code and driver code to simulate what VI+ hardware does both coming off and going back onto the shader core, but it's really tough to see that as a good use of time.
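
    The compiler/toolchain hack bridgman describes can be sketched in plain C: the compiler would insert a check at each loop head that tests a pre-emption flag, and on request the kernel saves enough state to resume later and yields back to the driver. Everything below is a hypothetical host-side simulation of that idea, not AMD's actual toolchain or microcode; the names and the struct layout are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal "context" a pre-empted shader would need to resume:
 * its loop position and any live partial results. */
typedef struct {
    int next;   /* saved loop index                         */
    long acc;   /* partial result carried across pre-emption */
} shader_ctx;

/* In the real scheme this would be written by the driver/firmware. */
static volatile bool preempt_requested;

/* Runs (or resumes) the simulated shader over n items.
 * Returns true when finished, false if it yielded early. */
static bool run_shader(shader_ctx *ctx, const int *data, int n)
{
    for (int i = ctx->next; i < n; i++) {
        /* --- the check the compiler would insert at the loop head --- */
        if (preempt_requested) {
            ctx->next = i;   /* save state: "coming off the shader core" */
            return false;    /* yield back to the driver                 */
        }
        ctx->acc += data[i]; /* the actual per-item shader work          */
    }
    ctx->next = n;
    return true;
}
```

    The sketch also makes the cost argument concrete: every loop iteration pays for the flag test, and the saved context grows with however much live state the kernel carries, which is why doing this in software rather than with the VI+ hardware path is hard to see as a good use of time.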

    +1 for no more initialisms/acronyms... I have a tough enough time keeping up with what we have already
     