Next Generation Hardware Speculation with a Technical Spin [post E3 2019]

Discussion in 'Console Technology' started by DavidGraham, Jun 9, 2019.

  1. mrcorbo

    mrcorbo Foo Fighter
    Veteran

    Joined:
    Dec 8, 2004
    Messages:
    3,578
    Likes Received:
    1,986
    This is a benefit to HFR that I never even considered.
     
    turkey likes this.
  2. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    7,898
    Likes Received:
    6,184
    We had this exact discussion in another thread, but good to see it verified by sebbbi here
     
    Pixel and milk like this.
  3. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
I think some of the leaks indicated there were some lower-level API commands for the Xbox One that allowed developers to tweak items like CU allocation patterns. It's possible there are other low-level settings, like dedicating a certain number of CUs to one part of the workload or allowing the GPU to allocate as many as it can for a given shader type.
    It's possible there are synchronization points that were coded with certain assumptions about how many CUs could churn through the workload before reaching a barrier, or intermediate targets whose code assumed but did not enforce a certain number of simultaneous wavefronts or workgroups. Counters and CU masks might have had issues if their values are being driven higher than expected, or the hardware's doubling means there are bit masks or values that are ambiguous or incomplete with more CUs.
    Within a workgroup, this may not matter as much, although there may be instances where a CU ID could be accessible to code and might give an unexpected value on a double-width GPU.
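The bit-mask concern above can be sketched with a toy example; the mask width and CU counts here are invented for illustration, not actual GCN/Navi values:

```python
# Hypothetical illustration of the mask concern: code written against a
# fixed-width CU mask silently truncates when the CU count doubles past the
# mask width. MASK_BITS is made up, not a real hardware value.

MASK_BITS = 16                       # assumed mask width in the old code

def cu_mask(active_cus):
    """Enable the lowest `active_cus` compute units in a fixed-width mask."""
    return ((1 << active_cus) - 1) & ((1 << MASK_BITS) - 1)

print(bin(cu_mask(12)))  # 0b111111111111     -> all 12 CUs enabled
print(bin(cu_mask(24)))  # 0b1111111111111111 -> top 8 CUs silently dropped
```

On a double-width GPU the second call is exactly the ambiguous case: the value driven in is higher than the mask can express, and the code gives no error.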

Higher clocks, absent downsides like power consumption and feeling memory latency more acutely, would be a more generally useful value to scale, since additional clock speed benefits both serial and parallel algorithms.
Fixed-function elements would benefit, and the geometry portion of the pipeline is significantly less parallel than the pixel processing portion. The work the geometry portion does precedes the amplification of work items that feeds the pixel back-end, so it has fewer wavefronts to absorb latency, and hiccups with the primitive stream or the FIFOs in the geometry processor translate into many more pixels whose launch is delayed.

    With primitive shaders, there's additional math and conditional evaluation inserted into the shaders, and while it may save wasted work later it's additional serial execution up-front. There may be some elements like workgroup processing mode, the faster spin-up, and narrower SIMD of wave32 that may help push individual workgroups through faster.
Backwards compatibility might be another area, although the alleged clocks match existing hardware. It's not clear at this point if there are elements that need to be maintained for backwards compatibility that might be slower on Navi. Also unclear is whether there's some overhead from emulating elements that Navi dropped (certain branch types, skip instructions, shifts), even as it restored Sea Islands encodings for a wide swath of others.
     
  4. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,151
    Likes Received:
    5,085
    Yes, exactly as I'd mentioned in another thread. What you lose in spatial resolution you more than regain through temporal resolution. Since most games have motion of some sort in them (especially if the camera moves), temporal resolution is far more important than spatial resolution, IMO.

    I'm hurt as I'd mentioned this before. :p But it's definitely good to see someone actually making games talking publicly about it.

    This basically means that temporal reconstruction gets better and better the higher the frame rate is. Hence 30 Hz is pretty horrible for temporal reconstruction while 60 Hz is the bare minimum for acceptable quality, IMO. And as Sebbbi mentioned, 120 Hz would be a really good place to be for temporal reconstruction to really shine.
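The intuition can be put in numbers: the less time between accumulated frames, the less a moving feature displaces between the samples a temporal filter has to reproject. A minimal sketch, with a made-up screen-space velocity:

```python
# Illustrative only: how far a moving feature travels between the consecutive
# frames a temporal-reconstruction filter accumulates, at different frame
# rates. The velocity value is invented for illustration.

def inter_frame_motion_px(velocity_px_per_s, fps):
    """Screen-space displacement between consecutive accumulated samples."""
    return velocity_px_per_s / fps

for fps in (30, 60, 120):
    print(fps, inter_frame_motion_px(600, fps))
# 30 -> 20.0 px, 60 -> 10.0 px, 120 -> 5.0 px of motion to reproject across
```

Smaller displacement per sample means fewer disocclusions and reprojection errors, which is why reconstruction quality climbs with frame rate.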

    Regards,
    SB
     
    Sonic, Prophecy2k, milk and 4 others like this.
  5. cheapchips

    Regular Newcomer

    Joined:
    Feb 23, 2013
    Messages:
    704
    Likes Received:
    427
So for a 60Hz display, would you still render 120fps and have more data for reconstruction?
     
  6. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,996
    Likes Received:
    4,570
I think if you simply joined two frames of a 120Hz stream into one for 60Hz you'd only blur things out, not bring any more detail.

    If you have a 120Hz panel then with 1440p + temporal you'll get 120Hz motion and 4K "perception".
    Otherwise if you have a regular 60Hz panel you're probably better off running at true 4K because the performance demands between 1440p120 and 4k60 might be similar.
     
    cheapchips and milk like this.
  7. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,151
    Likes Received:
    5,085
I'm not sure at that point. That basically means that you accumulate 2 120 Hz "frames" - display the result, accumulate another 2 120 Hz "frames" - display the result, etc. I don't think that would be better than just rendering 2 temporally different 60 Hz frames. Temporally different in this case meaning that they aren't just sequentially different (I may not be using the correct terminology here. :)). Good results and effects with reconstruction usually require more than just 2 frames though.

    Regards,
    SB
     
  8. milk

    Veteran Regular

    Joined:
    Jun 6, 2012
    Messages:
    2,993
    Likes Received:
    2,561
Accumulating 2 120hz frames into one 60hz one would create ghosting artifacts. Unless they do uneven timestepping between frames, keeping them both within the bounds of a 60Hz render with a reasonable shutter speed for its motion blur. In that case, 60Hz monitors would essentially get supersampled motion blur, which sounds so high fidelity my legs shake like I'm a highschool freshman girl being asked to prom by Jeff, a senior and team captain of the school's winning football team.
     
    Prophecy2k likes this.
  9. see colon

    see colon All Ham & No Potatos
    Veteran

    Joined:
    Oct 22, 2003
    Messages:
    1,442
    Likes Received:
    211
    Would that high fidelity motion blur offer more or less detail than your prom fantasy just did?
     
    Prophecy2k likes this.
  10. milk

    Veteran Regular

    Joined:
    Jun 6, 2012
    Messages:
    2,993
    Likes Received:
    2,561
Definitely more. A 4k60 game will look way sharper than my night with Jeff, which would all seem like a blur. Time would feel like it was passing both fast and in slow motion simultaneously. It's difficult to describe. But in a game that would be considered stuttery and bad for gameplay, so more detail.
     
    Prophecy2k and Silent_Buddha like this.
  11. MrFox

    MrFox Deludedly Fantastic
    Legend Veteran

    Joined:
    Jan 7, 2012
    Messages:
    5,446
    Likes Received:
    3,945
    https://semiengineering.com/dram-tradeoffs-speed-vs-energy/

    That's an interesting metric: GB/s per mm of die edge.

    Hbm2e = 60GB/s per mm
    Gddr6 = 10GB/s per mm
    Lpddr5 = 6GB/s per mm

Assuming currently available speeds (HBM2e at 410GB/s per stack, GDDR6 at 16Gbps, LPDDR5 at 6400Mbps), this gives an idea of how much edge space is consumed to fit a certain width of memory.

2 HBM2e stacks = 14mm
    256bit gddr6 = 51mm
    384bit gddr6 = 77mm
    256bit lpddr5 = 34mm

    Basically, without interposers or fanout tricks, it's about 12 connections per mm.

While GDDR5 isn't mentioned, from the signals list it required about 20% fewer lines per chip than GDDR6. So this gen would be:
    256bit gddr5 = 41mm
    384bit gddr5 = 62mm

    There has to be enough space left for pcie channels and all the rest other than the memory.

As an example, with a wild guess of 75% of the edge for memory and 25% for everything else, and a chip of 360mm2 (19mm x 19mm, total edge space is 76mm), 384bit would not fit and 320bit would be borderline. It also becomes clear that a split memory setup could only work using HBM for one of the pools.
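The arithmetic above can be reproduced directly from the quoted GB/s-per-mm figures; a quick sketch, assuming the same clocks (16Gbps GDDR6, 6400Mbps LPDDR5) and the 19mm x 19mm, 75%-of-edge guess:

```python
# Rough die-edge budget arithmetic from the post above. GB/s-per-mm figures
# are the ones quoted from the semiengineering article; clock assumptions
# are the post's, not confirmed hardware specs.

GBPS_PER_MM = {"hbm2e": 60, "gddr6": 10, "lpddr5": 6}

def edge_mm(kind, bus_bits, gbps_per_pin):
    """Edge length (mm) needed for a bus of the given width and pin speed."""
    bandwidth = bus_bits * gbps_per_pin / 8      # GB/s
    return bandwidth / GBPS_PER_MM[kind]

print(round(edge_mm("gddr6", 256, 16)))    # ~51 mm
print(round(edge_mm("gddr6", 384, 16)))    # ~77 mm
print(round(edge_mm("lpddr5", 256, 6.4)))  # ~34 mm

# 360 mm^2 die ~ 19 mm x 19 mm -> 76 mm of total edge; with ~75% of the
# edge usable for memory PHYs:
budget = 4 * 19 * 0.75                     # 57 mm
print(edge_mm("gddr6", 384, 16) > budget)  # True: 384bit doesn't fit
```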
     
    #651 MrFox, Aug 14, 2019
    Last edited: Aug 14, 2019
  12. Shortbread

    Shortbread Island Hopper
    Veteran

    Joined:
    Jul 1, 2013
    Messages:
    3,797
    Likes Received:
    1,904
I believe it's simple motion interpolation. Most modern LED TVs are capable of motion smoothing (i.e., Motionflow, TruMotion, etc.), which can simulate higher refresh rates (e.g., 120Hz, 240Hz) on 60Hz panel TVs by injecting prior (or future) frames into the overall picture motion. However, films captured at 24fps or 30fps can look off-putting with motion smoothing engaged, giving them the cheap production feel of a live soap opera rather than a smooth film experience. And since most current generation console titles are 30fps, they can look quite bad (i.e., ghosting, frame latency, input lag, etc.) with most motion smoothing methods. Hence most TV manufacturers offer a "game mode" that disables this feature.

    That being said, material filmed in 60fps or games rendering at 60fps can look quite good with motion smoothing on, giving the impression of a smoother experience (or faster framerate). AMD/Sony/MS could have integrated some type of motion interpolation logic similar to LED TVs or how Nvidia handles frame reconstruction (or AFR) with motion smoothness in SLI setups, but more so in a single GPU fashion towards reconstructing multiple frames. With some form of bespoke logic (far beyond Pro's ID buffer) towards aiding CBR with motion interpolation, rather than solely relying upon the GPU clocks/speeds during frame reconstruction, this 'new logic' could relieve the constant burden on the GPU on keeping consistent high framerates, even with CBR methods being applied.

So in theory, this bespoke motion interpolation logic would aid CBR rendering of 2 x 1440p frames @60Hz and give the feel/look of something beyond 60fps (giving the impression of 120Hz gaming) without brute-forcing such high framerates. Sooooooooooooooo... when Sony and MS are touting 120fps gaming, I'm pretty sure they're just using the same PR messaging TV manufacturers use when describing their sets as 120Hz/240Hz, when in reality it's 60Hz panels using motion interpolation methods.
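For intuition on why naive interpolation ghosts (and why TV logic needs motion-vector estimation rather than blending), here is a toy sketch that just lerps two frames; this is the simplest possible baseline, not how any actual console or TV silicon works:

```python
# A very rough sketch of what frame interpolation does conceptually:
# synthesize an intermediate frame between two rendered ones. Real TV
# "motion smoothing" estimates per-block motion vectors; this toy version
# just blends pixel values, which is exactly why naive interpolation
# ghosts on fast motion.

def lerp_frame(prev, nxt, t):
    """Blend two frames; t=0.5 gives the midpoint frame for 60->120 Hz."""
    return [(1 - t) * p + t * n for p, n in zip(prev, nxt)]

frame0 = [0, 0, 100, 0]   # bright pixel at index 2
frame1 = [0, 0, 0, 100]   # it moved one pixel right
mid = lerp_frame(frame0, frame1, 0.5)
print(mid)  # [0.0, 0.0, 50.0, 50.0] -> a ghost at both positions, not at 2.5
```

A motion-compensated interpolator would instead place the full-brightness pixel at the estimated in-between position, which is where the extra logic (and the artifacts when estimation fails) comes from.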
     
    #652 Shortbread, Aug 14, 2019
    Last edited: Aug 14, 2019
    pharma likes this.
  13. Proelite

    Regular

    Joined:
    Jul 3, 2006
    Messages:
    816
    Likes Received:
    98
    Location:
    Redmond
Hmm, I think you need to take memory clocks into account.

We also have hard figures for die edge for a GDDR6 PHY controller.

On a 360mm2 die you can fit two more controllers on the left side.

[image: die shot]
     
    #653 Proelite, Aug 14, 2019
    Last edited: Aug 14, 2019
  14. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    40,725
    Likes Received:
    11,196
    Location:
    Under my bridge
Are you sure about that? Sounds wrong to me. Motion smoothing of lower-framerate content up to 60 fps is possible, but motion smoothing of 60fps material to a virtual 120 Hz can't be done; all you can do is add more motion blur to represent movement during the 1/60th second interval.

120 Hz downsampled to 60 Hz would add a small bit of motion blur that'd potentially aid smoothness, although on really fast content you'll just get ghosting. And that's all you can do. You can only present a 60th-of-a-second timeslice on a 60 fps display. Any more temporal information than that will be blur/ghosting, simulating a shutter open for longer. Not worth rendering double the pixels in that case. What you could do, though, is jitter the sampling and get 2x reconstruction info, so better AA and reconstructed hyper-resolution I guess, approaching 2x supersampling (at its simplest, and better than 2x SSAA with better algorithms).
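The jitter-vs-average distinction can be shown with a toy 1-D example; the "signal" here is an arbitrary stand-in function, not real rendering code:

```python
# Toy 1-D illustration of the point above: averaging two identically sampled
# frames adds no detail, while jittering the second frame's sample grid by
# half a pixel and interleaving doubles the spatial sampling rate.

def sample(signal, offset, n):
    """Sample a continuous 1-D signal at n pixel centers, shifted by offset."""
    return [signal(i + offset) for i in range(n)]

signal = lambda x: x * x          # stand-in for scene radiance along a scanline
frame_a = sample(signal, 0.0, 4)  # [0.0, 1.0, 4.0, 9.0]
frame_b = sample(signal, 0.5, 4)  # half-pixel jitter: [0.25, 2.25, 6.25, 12.25]

averaged = [(a + b) / 2 for a, b in zip(frame_a, frame_b)]         # still 4 samples
interleaved = [s for pair in zip(frame_a, frame_b) for s in pair]  # 8 samples

print(len(averaged), len(interleaved))  # 4 8
```

The averaged frame has the same resolution as either input (just blurred toward the midpoint), while the jittered pair carries genuinely new sample positions a reconstruction filter can exploit.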
     
  15. Globalisateur

    Globalisateur Globby
    Veteran Regular

    Joined:
    Nov 6, 2013
    Messages:
    2,947
    Likes Received:
    1,668
    Location:
    France
Why is there plenty of black stuff on the right of the die? Is there something there that shouldn't be needed in a console? Also, will consoles additionally have an HDMI controller?

    EDIT: There is already one HDMI controller
     
    #655 Globalisateur, Aug 14, 2019
    Last edited: Aug 14, 2019
  16. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,996
    Likes Received:
    4,570
    Die edge measurements don't say how much total PCB area is needed for each implementation though. GDDR6 is demanding with the trace lengths AFAIK, so the actual PCB area it takes is significantly larger than HBM or even LPDDR.
     
  17. MrFox

    MrFox Deludedly Fantastic
    Legend Veteran

    Joined:
    Jan 7, 2012
    Messages:
    5,446
    Likes Received:
    3,945
Do we know there is such a direct association between the width of the controllers' circuitry and how they are routed to the edge?

I was assuming 16gbps, but yeah, the 10GB/s per mm figure might have been for 14gbps. So that would be 67mm for 384bit. Their comments were not meant to be exact, since they were just saying HBM is roughly 6 or 10 times denser.
     
    #657 MrFox, Aug 14, 2019
    Last edited: Aug 14, 2019
  18. Shortbread

    Shortbread Island Hopper
    Veteran

    Joined:
    Jul 1, 2013
    Messages:
    3,797
    Likes Received:
    1,904
IIRC, the motion interpolation methods used in LED TVs take whatever the captured framerate is (i.e., 24, 30, 60, etc.) and insert frames derived from the prior or future frame to match the 120Hz or 240Hz output. Although a 60fps capture already matches a TV's native 60Hz refresh rate, it can still benefit from prior frames being introduced. As long as the TV can support 120 frames (120Hz) or 240 frames (240Hz) through its motion interpolation logic, the original film/video framerate will receive the same treatment (as long as it's under 120fps or 240fps).
     
  19. Globalisateur

    Globalisateur Globby
    Veteran Regular

    Joined:
    Nov 6, 2013
    Messages:
    2,947
    Likes Received:
    1,668
    Location:
    France
On PS4 and Pro they use the bus to the southbridge to communicate with the ARM + DDR3 memory in order to offload stuff from the Jaguar. The problem is that that bus is slow and has high latency, so only limited stuff can be done that way.

    On PS5 could they use the PCI-e 4.0 bus to the rumored DDR4 memory pool and make it fully usable by the CPU for the OS ? That could be a way to dedicate the precious GDDR6 memory only for the game.
     
    milk likes this.
  20. metacore

    Newcomer

    Joined:
    Sep 30, 2011
    Messages:
    108
    Likes Received:
    80


Xbox One X already runs Overwatch at 60 fps at 4K... a 4x faster GPU + 8-core Zen 2 would easily achieve 120 fps...

    Hmmm has something just slipped ? :wink:
     
    Shortbread and BRiT like this.

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.