Digital Foundry Article Technical Discussion [2021]

Discussion in 'Console Technology' started by BRiT, Jan 1, 2021.

  1. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,018
    Likes Received:
    15,763
    Location:
    The North
    If you want to take full advantage of a GPU, you do a lot of embarrassingly parallel work. On the compute shader queue, work is divided into blocks of threads, and each CU/SM can hold only so many blocks at a time, so more CUs = more blocks that can be issued at once. On the compute side I typically work with Nvidia, where IIRC a block can hold about 1000 threads and each SM can handle a couple of blocks. Because of the way the threads are serialized and the shared memory between the CUDA cores, you can assign work that shares data and get extremely good utilization out of your ALUs. Huge amounts, really.
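
    As a rough illustration of the "blocks per CU/SM" point, here is a minimal sketch. The limits of 2048 resident threads and 32 resident blocks per SM are assumptions for illustration (they vary by architecture), the function names are hypothetical, and register and shared-memory limits are ignored even though they often lower the real figure.

```python
def resident_blocks_per_sm(threads_per_block,
                           max_threads_per_sm=2048,  # assumed limit, varies by GPU
                           max_blocks_per_sm=32):    # assumed limit, varies by GPU
    """Blocks one SM/CU can hold at once, counting only thread and block slots."""
    if threads_per_block <= 0:
        raise ValueError("threads_per_block must be positive")
    return min(max_threads_per_sm // threads_per_block, max_blocks_per_sm)


def blocks_in_flight(num_sms, threads_per_block, **limits):
    """Total blocks resident across the whole GPU under the same simple limits."""
    return num_sms * resident_blocks_per_sm(threads_per_block, **limits)
```

    Under these assumptions a block of 1024 threads fits twice per SM, so the number of blocks in flight grows linearly with the number of SMs/CUs.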

    So in this case, having more CUs is a much greater advantage than having a high clock speed, because ultimately more work can be done in parallel, and latency is handled by the amount of thread switching a CU can do. The CUs all need to wait for memory to provide the next piece of work, so tearing through your compute jobs faster doesn't necessarily improve performance. Having a large number of cores that can hold a lot of threads in flight keeps the GPU saturated while it waits for the next bit of memory to arrive, which is ideal considering how latent memory can be. High parallelism will thrive on maximum throughput, provided you've got the bandwidth to feed it. The more work you can give it, the more can be done in parallel and the less it stalls. Ultimately the work done per unit time is going to be higher with many cores, provided they are fully utilized, not to mention significantly more energy efficient.

    tl;dr: programmers don't need to account for scaling to more CUs. The bandwidth needs to scale with the number of CUs. Programmers need to ensure they are coding in a way that keeps those CUs well fed: be clever at synchronizing threads, etc.
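
    The "bandwidth needs to scale with the number of CUs" point can be sketched with a simplified roofline-style model. This is not anything from the post itself: the function name is hypothetical, the 128 FLOPs per CU per clock (64 lanes x 2 for FMA) is an assumption, and the CU counts, clocks and bandwidth figures in the usage note are illustrative inputs only.

```python
def attainable_tflops(num_cus, clock_ghz, bandwidth_gbs, flops_per_byte,
                      flops_per_cu_per_clock=128):  # assumed: 64 lanes x 2 (FMA)
    """Simplified roofline: throughput is capped by compute or by memory traffic."""
    # Peak ALU throughput in TFLOP/s.
    peak_compute = num_cus * flops_per_cu_per_clock * clock_ghz / 1000.0
    # Ceiling imposed by memory: bytes per second times FLOPs done per byte fetched.
    memory_bound = bandwidth_gbs * flops_per_byte / 1000.0
    return min(peak_compute, memory_bound)
```

    With these assumed inputs, a workload doing 10 FLOPs per byte at 448 GB/s stays at the same memory-bound ceiling whether there are 36 or 52 CUs; the extra CUs only pay off when bandwidth rises or the work does more arithmetic per byte.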
     
  2. see colon

    see colon All Ham & No Potatos
    Veteran

    Joined:
    Oct 22, 2003
    Messages:
    1,992
    Likes Received:
    1,071
    Once the Xbox One S was launched, it had like a ~15% clock advantage, though, right? This would have been on CPU and GPU. Considering many games run at 900p on Xbox One and 1080p on PS4, you would think rendering roughly 40% more pixels with 40% more compute would produce a few edge cases where the Xbox One S's CPU advantage or its narrower and faster (clocked) GPU would pull ahead, assuming developers leveraged ESRAM to mitigate the bandwidth discrepancy.
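
    For reference, the pixel-count gap can be checked with a quick calculation. This sketch assumes 900p means 1600x900 and 1080p means 1920x1080 (both 16:9); the helper names are hypothetical.

```python
P900 = (1600, 900)     # assumed 16:9 resolution for "900p"
P1080 = (1920, 1080)   # assumed 16:9 resolution for "1080p"


def pixel_count(width, height):
    """Number of pixels in one frame."""
    return width * height


def extra_pixels_percent(base, target):
    """How many percent more pixels `target` has compared with `base`."""
    return (pixel_count(*target) / pixel_count(*base) - 1.0) * 100.0
```

    Under those assumptions 1080p has about 44% more pixels than 900p, which is in line with the rough 40% figure.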
     
    PSman1700, egoless and cwjs like this.
  3. snc

    snc
    Regular Newcomer

    Joined:
    Mar 6, 2013
    Messages:
    815
    Likes Received:
    568
    Yeah, it's reasonable, but CPU and I/O will still have an impact on fps, and I don't think that will change much in the future.
     
  4. PSman1700

    Veteran Newcomer

    Joined:
    Mar 22, 2019
    Messages:
    4,546
    Likes Received:
    2,084
    On top of much higher bandwidth throughput, a somewhat faster-clocked CPU and no contention between CPU and GPU when things get hammered tight, which, down the line, will happen.
    Everything basically went wide(r), or both. Be it RDNA2 GPUs, NV, Xbox etc., except Sony.
    Guess that utilisation for wider GPUs will indeed be happening, especially considering ray tracing (and reconstruction, if we also start using CUs for that).
     
  5. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,018
    Likes Received:
    15,763
    Location:
    The North
    IMO, I/O shouldn't have any impact on fps unless it's tied to rendering, and no one should be tying a 5GB/s link to rendering. I should be careful with my choice of words here, because someone will undoubtedly showcase a stutter or frame rate drops due to I/O involvement. But those are likely signs of other issues plaguing their texture streaming system, or of a complete lack of available memory such that the pools are so small that relying on I/O to load/unload is the only plausible option left.

    As for the CPU being the bottleneck for the GPU, this is also unlikely. The CPU likely accounts for no more than 5% of the bottleneck in total render time over the course of a large benchmark. Most of the time it's significantly less, unless your goal is to push the frame rate well past 100 fps.

    You're unlikely to be CPU bottlenecked if you're also approaching an I/O bottleneck, since that would be an admission of a memory bottleneck. And since the largest footprint in memory is GPU-related items, you're back at a GPU dependency.
    To be CPU bottlenecked, you need low resolution and high frame rates, or one hell of a complex world with hundreds or thousands of little interactive objects all active within the world at once. But once again, there is hardware for this: AVX, AVX2.

    Otherwise, there is likely very little possibility that you'll get a CPU bottleneck at higher resolution with high fidelity unless you're back on Jaguar cores, and those were very biased setups.
     
    #845 iroboto, Feb 14, 2021
    Last edited: Feb 14, 2021
  6. snc

    snc
    Regular Newcomer

    Joined:
    Mar 6, 2013
    Messages:
    815
    Likes Received:
    568
    There is a difference between being CPU bottlenecked and the CPU having an impact on fps (especially on minimum frame rates, which are often analyzed) ;) I/O shouldn't have much impact in PS4/XOne-era games, but we are talking about potential future differences.
     
  7. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,018
    Likes Received:
    15,763
    Location:
    The North
    To me they mean the same thing, as something has to be the bottleneck to the frame rate: some part of the pipeline must ultimately be responsible for the total frame time.
    But sure, I think I know what you're trying to get at.

    I think you're referring to dips/stuttering versus the actual bottleneck of the frame rate, identifying the potential for a raw 'burst' of CPU or IO that can impact the render time momentarily.

    I expect most of that type of bursty/stuttering behaviour to happen at the beginning of this generation and less so at the end, mainly because the games are cross-gen, so they are designed around a slower CPU. They rely on the hardware brute-forcing the frame rates and I/O to obtain high frame rates. I think as last gen falls off, the optimization around the CPU side of things will change dramatically, so we shouldn't get that bursty behaviour.
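
    The distinction between the steady-state bottleneck and momentary dips can be sketched as follows. This is a simplified model, assuming CPU and GPU work overlap so the slower of the two sets the frame time and that any I/O stall adds on top; the function names and the sample timings are hypothetical.

```python
def frame_time_ms(cpu_ms, gpu_ms, io_stall_ms=0.0):
    """Frame time when CPU and GPU work overlap; I/O stalls are added on top."""
    return max(cpu_ms, gpu_ms) + io_stall_ms


def bottleneck(cpu_ms, gpu_ms):
    """Which side sets the frame time for a single frame."""
    return "cpu" if cpu_ms > gpu_ms else "gpu"


def fps_stats(frames):
    """Average and minimum fps over a list of (cpu_ms, gpu_ms) samples."""
    times = [frame_time_ms(*f) for f in frames]
    avg_fps = 1000.0 * len(times) / sum(times)
    min_fps = 1000.0 / max(times)
    return avg_fps, min_fps
```

    For example, nine GPU-bound frames at (8 ms CPU, 16 ms GPU) plus a single CPU burst at (25 ms, 16 ms) still average about 59 fps, while the minimum drops to 40 fps, which is the kind of dip that shows up in minimum-frame-rate analysis.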
     
    PSman1700 likes this.
  8. AbsoluteBeginner

    Regular Newcomer

    Joined:
    Jun 13, 2019
    Messages:
    960
    Likes Received:
    1,301
    Xbox had 32 ROPs, PS4 had 64 ROPs, so that advantage in clock is not the same as this gen.
     
  9. scently

    Veteran Regular

    Joined:
    Jun 12, 2008
    Messages:
    1,083
    Likes Received:
    420
    Actually, X1S has 16 ROPs, PS4 has 32, X1X has 32, and the PS4Pro has 64 ROPs.
     
  10. AbsoluteBeginner

    Regular Newcomer

    Joined:
    Jun 13, 2019
    Messages:
    960
    Likes Received:
    1,301
    Ah, you are right, I mixed it up. X1X had half but considerably higher clocks, so it made up for it.
     
  11. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,018
    Likes Received:
    15,763
    Location:
    The North
    Nah, it never made it up with clocks ;) It had a truckload more bandwidth. ROPs are very easily bandwidth limited.
     
  12. scently

    Veteran Regular

    Joined:
    Jun 12, 2008
    Messages:
    1,083
    Likes Received:
    420
    Which is why I have been bemused by the speculation that PS5 is outperforming XSX because of the speed of the frontend even though, as far as I know, ROP performance is still bound by available bandwidth. Whatever the case is with the XSX, I doubt it has anything to do with its pixels/clk.
     
    Dural and PSman1700 like this.
  13. AbsoluteBeginner

    Regular Newcomer

    Joined:
    Jun 13, 2019
    Messages:
    960
    Likes Received:
    1,301
    Not quite; if it had the clock speed of the Pro and 10% more CUs, you would still feel that.

    The Pro was BW limited way before it was ROP limited anyway, but that is because they effectively doubled the PS4 GPU and only bumped BW by 20%.
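
    A back-of-the-envelope check of the "ROPs are bandwidth limited" argument is sketched below. It uses the ROP counts quoted in this thread; the clocks (about 0.911 GHz for the Pro, 1.172 GHz for X1X) and bandwidths (about 218 GB/s and 326 GB/s) are approximate publicly quoted figures and should be treated as assumptions. The model only counts 4-byte colour writes and ignores colour compression, caches, blending reads and depth traffic, and the function names are hypothetical.

```python
def peak_fill_gpix_s(rops, clock_ghz):
    """Peak pixel fill rate in Gpixels/s, assuming one pixel per ROP per clock."""
    return rops * clock_ghz


def write_bandwidth_gbs(rops, clock_ghz, bytes_per_pixel=4):
    """Bandwidth in GB/s needed to write every pixel at the peak fill rate."""
    return peak_fill_gpix_s(rops, clock_ghz) * bytes_per_pixel


def is_bandwidth_limited(rops, clock_ghz, bandwidth_gbs, bytes_per_pixel=4):
    """True if peak ROP output would need more bandwidth than is available."""
    return write_bandwidth_gbs(rops, clock_ghz, bytes_per_pixel) > bandwidth_gbs
```

    With those assumed inputs, 64 ROPs at 0.911 GHz would want roughly 233 GB/s for colour writes alone, more than the 218 GB/s available, whereas 32 ROPs at 1.172 GHz would need about 150 GB/s against 326 GB/s.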
     
  14. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    18,785
    Likes Received:
    21,087
    Doom Eternal Switch: The Making Of An 'Impossible' Port - id Software/Panic Button Interview
     
  15. ChuckeRearmed

    Regular Newcomer

    Joined:
    Nov 1, 2020
    Messages:
    374
    Likes Received:
    156
    id Tech 7 seems to be a very interesting engine. I wonder what it could be if it were as open as Unreal Engine. I presume Unreal Engine is better?
     
    PSman1700 likes this.
  16. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    18,785
    Likes Received:
    21,087
    Depends on what you're looking at/for. Support and flexibility may be better on UE because those have been requirements of their business model for some time now.
     
  17. cheapchips

    Veteran Newcomer

    Joined:
    Feb 23, 2013
    Messages:
    1,795
    Likes Received:
    1,929
    Really fond of these DF talks to the Devs videos. Needs more shots of dev tools in action though. *


    * I don't know why I need this!
     
    iroboto and jlippo like this.
  18. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    2,047
    Likes Received:
    1,477
    Location:
    France
    I'm really impressed by how much they're culling/discarding on Switch... And I wonder how much compute power is required to do that.
     
    iroboto, mr magoo and RagnarokFF like this.
  19. Remij

    Newcomer

    Joined:
    May 3, 2008
    Messages:
    231
    Likes Received:
    385
    Love the DF interview videos. You can just tell John and the others are happy when they get to do more content like this. Love seeing them connect more with folks in the industry.
     
    RagnarokFF, AzBat, Jay and 4 others like this.
  20. snc

    snc
    Regular Newcomer

    Joined:
    Mar 6, 2013
    Messages:
    815
    Likes Received:
    568