DirectX 12: The future of it within the console gaming space (specifically the XB1)

Discussion in 'Console Technology' started by Shortbread, Mar 7, 2014.

  1. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    Yeah I left that out but like you said it depends on how many shaders you are using.

    What rays? The z-buffer handles occlusion, and transparency is done back to front out of necessity.
     
    #1761 Infinisearch, Feb 2, 2016
    Last edited: Feb 2, 2016
  2. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,435
    Likes Received:
    263
    As Roderic said, you still need to be aware of changing pipeline state. The new APIs reduce CPU overhead, but there's still a hardware cost. Where you'll see the most benefit from a lot of draw calls is if you limit state changes so the hardware doesn't stall. What each architecture can tolerate will vary. In any case, my opinion is that draw calls shouldn't be too small, because you want work to be amplified on the GPU side of the PCIe bus.
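    A toy sketch of the point about limiting state changes. This is not any real graphics API: the Draw record, the state names, and count_state_changes are all hypothetical. It only counts how often the bound pipeline state differs between consecutive draws, before and after grouping the draws by state.

```python
from collections import namedtuple

# Hypothetical draw record: 'pso' stands in for a pipeline state object id.
Draw = namedtuple("Draw", ["pso", "mesh"])

def count_state_changes(draws):
    """Count how many times the bound pipeline state switches
    between consecutive draws (the first bind counts as one)."""
    changes = 0
    current = None
    for d in draws:
        if d.pso != current:
            changes += 1
            current = d.pso
    return changes

# Made-up submission order that alternates between two states.
draws = [
    Draw("opaque", "rock"),
    Draw("skin", "hero"),
    Draw("opaque", "tree"),
    Draw("skin", "npc"),
    Draw("opaque", "wall"),
]

# Python's sort is stable, so draws keep their relative order within a state.
sorted_draws = sorted(draws, key=lambda d: d.pso)

unsorted_changes = count_state_changes(draws)
sorted_changes = count_state_changes(sorted_draws)
print(unsorted_changes, sorted_changes)
```

    Reordering like this is only safe where the z-buffer resolves visibility. Transparent draws still have to go back to front, so they cannot be freely grouped by state.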
     
  3. randgris

    Newcomer

    Joined:
    Feb 1, 2016
    Messages:
    14
    Likes Received:
    10
    Why does the Xbox One have 48 ops/cycle on the CPU and 768 ops/cycle on the GPU?


     
  4. dogen

    Regular Newcomer

    Joined:
    Oct 27, 2014
    Messages:
    335
    Likes Received:
    259
    Jaguar does 8 flops/cycle per core and GCN is actually 2 flops/cycle, IIRC.
     
  5. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    40,685
    Likes Received:
    11,130
    Location:
    Under my bridge
    :shock: :shocked: :-| :runaway:
     
  6. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    You'd have to drill back into the discussion threads from earlier in the generation, but the breakdown for Jaguar is that each core has two integer pipes, two memory pipes, and two FP pipes behind its 2-wide front end.
    The scheduler can in peak scenarios issue an operation to all six pipes, and with 8 cores that is 48. The sustained throughput is clamped by the front end, which even in the absence of a vast range of hazards (misses, branches, dependences, not having a 2:2:2 mix) cannot provide 6 ops per cycle. It would generally only start to approach that peak if a stall causes a buildup of ops in the scheduler, after which the CPU races to empty the backlog.

    The GPU is a case of 12 CUs with 4 16-wide SIMDs, or 12x64 = 768.
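    The counting above can be written out as a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the sustained cap assumes that "2-wide front end" means at most two ops supplied per core per cycle.

```python
# Figures taken from the post above; names are illustrative only.
CORES = 8
INT_PIPES, MEM_PIPES, FP_PIPES = 2, 2, 2
FRONT_END_WIDTH = 2  # assumption: ops the front end can supply per core per cycle

pipes_per_core = INT_PIPES + MEM_PIPES + FP_PIPES

# Peak issue: every pipe on every core gets an op in the same cycle.
cpu_peak_issue = CORES * pipes_per_core

# Sustained cap: the scheduler cannot drain faster than the front end fills it.
cpu_sustained_cap = CORES * FRONT_END_WIDTH

CUS = 12
SIMDS_PER_CU = 4
LANES_PER_SIMD = 16

# One op per SIMD lane per cycle across the whole GPU.
gpu_lanes = CUS * SIMDS_PER_CU * LANES_PER_SIMD

print(cpu_peak_issue, cpu_sustained_cap, gpu_lanes)
```

    So the 48 is a burst issue figure and the 768 is a lane count, which is why the two numbers are not directly comparable.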
     
  7. dogen

    Regular Newcomer

    Joined:
    Oct 27, 2014
    Messages:
    335
    Likes Received:
    259
    Well, got the Jaguar numbers from the answer here.

    https://stackoverflow.com/questions...ndy-bridge-and-haswell-sse2-avx-avx2/15657772

    He says - "8 SP FLOPs/cycle: 8-wide AVX addition every other cycle + 8-wide AVX multiplication every other cycle"

    And for the XB1 it's 768 ALUs x ~850 MHz = ~652 G ops/s, x 2 flops per cycle = ~1.3 TFLOPS, right?
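    A quick check of that arithmetic, as a sketch. The helper name peak_gflops is made up for illustration, the clock is the rounded ~850 MHz used above, and the factor of 2 assumes each ALU retires one multiply plus one add per cycle.

```python
def peak_gflops(alus, clock_mhz, flops_per_alu_per_cycle=2):
    """Theoretical peak in GFLOPS: ALUs x flops per ALU per cycle x clock."""
    return alus * flops_per_alu_per_cycle * clock_mhz / 1000.0

# One op per ALU per cycle gives the ~652 figure.
xb1_gops = peak_gflops(768, 850, flops_per_alu_per_cycle=1)

# Counting multiply + add as two flops doubles it to roughly 1.3 TFLOPS.
xb1_gflops = peak_gflops(768, 850)

print(xb1_gops, xb1_gflops)
```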
     
  8. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    8 FLOPs over 8 cores is 64, which is not consistent with the 48 op figure.
     
  9. turkey

    Regular Newcomer

    Joined:
    Oct 21, 2014
    Messages:
    739
    Likes Received:
    429
    Games initially only had 6 cores available, so 48 ops?
     
  10. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    That wouldn't change what the chip could physically perform, and there are services the system partition would perform for the benefit of the game section. It's been a while since I've looked at this, but my recollection was that this was while they were discussing the SoC as an 8-core CPU.

    If the reservation did exclude CPU capability, it could also be argued that the GPU's ops would need an asterisk thanks to the system time-slice it has to give up.
     
  11. Starx

    Regular Newcomer

    Joined:
    Sep 29, 2013
    Messages:
    294
    Likes Received:
    148
  12. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,715
    Likes Received:
    5,812
    Location:
    ಠ_ಠ
    Well, I guess we know what they use in their offices.
     
  13. powdercore

    Newcomer

    Joined:
    Aug 6, 2014
    Messages:
    41
    Likes Received:
    29
    Is 768 ALUs the same as 768 ops/cycle? Does that mean the PS4 can do 1152 ops/cycle given that it has 1152 ALUs?
     
  14. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,544
    Likes Received:
    10
    Location:
    In the land of the drop bears
    Yes.
     
  15. dogen

    Regular Newcomer

    Joined:
    Oct 27, 2014
    Messages:
    335
    Likes Received:
    259
    Notice how I said per core for the CPU? I wasn't saying the whole XB1 gpu can only do 2 flops.
    I was going by each ALU. I assume each can handle a Multiply + Add in one cycle? I don't know the specifics.
    Anyway, maybe it's more accurate to say it can do 32 ops per vector unit or maybe 128 per compute unit, but I really think everyone knew what I was trying to say.
     
  16. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    40,685
    Likes Received:
    11,130
    Location:
    Under my bridge
    Yeah, I was just kidding. Clearly the whole of GCN can do more than 2 ops per cycle as written. ;) Although I wasn't sure what you were suggesting does two ops. I'd count it as 128 ops/clock per CU. 128 x 12 x 850 MHz = 1.3 TF. That's AMD's official line. Can work backwards to ops per ALU or VU if one wants.
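    Working backwards from the per-CU figure can be sketched as below. The function name and dictionary keys are hypothetical, and it assumes the layout described earlier in the thread (4 SIMDs of 16 lanes per CU, 2 flops per ALU per cycle) applies to both consoles.

```python
def per_cycle_breakdown(total_alus, lanes_per_simd=16, simds_per_cu=4, flops_per_alu=2):
    """Split a GPU's peak flops/cycle into per-ALU, per-SIMD, per-CU and whole-GPU figures."""
    alus_per_cu = lanes_per_simd * simds_per_cu
    return {
        "cus": total_alus // alus_per_cu,
        "flops_per_alu": flops_per_alu,
        "flops_per_simd": lanes_per_simd * flops_per_alu,
        "flops_per_cu": alus_per_cu * flops_per_alu,
        "flops_per_gpu": total_alus * flops_per_alu,
    }

xb1 = per_cycle_breakdown(768)    # ALU count quoted for the XB1
ps4 = per_cycle_breakdown(1152)   # ALU count quoted for the PS4

print(xb1)
print(ps4)
```

    Multiplying flops_per_gpu by the clock gives the headline TFLOPS number. The per-ALU, per-SIMD and per-CU values are the same quantity at different granularities.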
     
  17. Antan

    Regular

    Joined:
    Jul 27, 2007
    Messages:
    912
    Likes Received:
    9
    Location:
    Middlesbrough
    Randgris, have you been looking at Mr X media? ;) It would be wise to avoid it like the plague.
     
  18. fehu

    Veteran Regular

    Joined:
    Nov 15, 2006
    Messages:
    1,459
    Likes Received:
    391
    Location:
    Somewhere over the ocean
    Is Quantum Break indicative of the quality of the next wave of new-API games, or is it a best-case scenario that can't be matched in, for example, a racing game?
     
  19. function

    function None functional
    Legend Veteran

    Joined:
    Mar 27, 2003
    Messages:
    5,135
    Likes Received:
    2,248
    Location:
    Wrong thread
    Well ... that's not exactly proof of DX12 allowing console level optimisation CPU side. i5-4460 or FX-6300 minimum (FX-6300 being six cores and about twice as fast as X1 CPU).

    The recommended specs are laughable. Yes, we already know that faster PCs can run games better. Simply "recommending" the fastest PC parts you can name off the top of your head helps no-one. #lazyspecs.

    "Intel Core i7 4790, 4GHz or AMD equivalent"

    Brilliant. Well done. Thanks for those useful recommended specs. Windows store users will now know they need the AMD equivalent of the i7 4790.

    Perfect.

    There isn't one.
     
  20. Clukos

    Clukos Bloodborne 2 when?
    Veteran Newcomer

    Joined:
    Jun 25, 2014
    Messages:
    4,462
    Likes Received:
    3,793
    I don't know, maybe they haven't put much thought into those specs. It wouldn't be the first game to do that.

    And to put it into context, Alan Wake (that ran on the previous iteration of the same engine) had a PC port made in-house by Remedy that ran almost flawlessly on most systems at release. Best to wait for the final product before we start calling them names.
     
    #1780 Clukos, Feb 15, 2016
    Last edited: Feb 15, 2016