ID buffer and DR FP16

Discussion in 'Console Technology' started by Jamend, Jun 15, 2017.

  1. shredenvain

    Regular

    Joined:
    Sep 12, 2013
    Messages:
    921
    Likes Received:
    189
    Location:
    Somewhere in southern U.S.
    In my opinion we will have to wait for 3rd party titles like Anthem and Metro to release to see the true advantages and disadvantages of both platforms.

    I for one think using the XoneX power to push beyond just high pixel count is the way to go. Anthem is using checkerboard of some sort to hit 4k. I think the reason for this may have something to do with the impressive level of detail and high end "PC like" settings on display in the trailer.

    It comes down to developers using the extra memory bandwidth and shader power to render a title with Xbox one settings at native 4k or using this extra power to render PC ultra settings in a title that renders with checker boarding or dynamic res.

    I would prefer the latter each and every time!

    PS I don't see devs using checkerboard rendering on the Xbox one x unless it provides a substantial benefit in freeing up memory and GPU usage. So without first hand experience it doesn't seem like the shader and memory cost of checker boarding would be enough to put the Pro on par with the OneX.
     
    #41 shredenvain, Jun 16, 2017
    Last edited: Jun 16, 2017
    BRiT likes this.
  2. shredenvain

    Regular

    Joined:
    Sep 12, 2013
    Messages:
    921
    Likes Received:
    189
    Location:
    Somewhere in southern U.S.
    Actually, in the video piece for this article DF state that the use of checkerboarding and dynamic res in Frostbite comes from a GDC talk about the engine's possible features on Project Scorpio.

    They go on to state in the video that after poring through the video feed they didn't see one frame drop below 2160p.
     
    RootKit and BRiT like this.
  3. Globalisateur

    Globalisateur Globby
    Veteran Regular

    Joined:
    Nov 6, 2013
    Messages:
    2,949
    Likes Received:
    1,669
    Location:
    France
    Because of the very bad image quality of the footage (from the 4K images I have seen on DF and gamersyde) I really don't see how they could give such a definitive statement for the whole demo...
     
  4. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    There was some speculation on 32 ROPs not being quite right, although 64 vs 32 seems more than a little off.
    Some interesting interpretations are that items not used in compatibility mode, like the ID buffer, might take over part of the ROP hardware in full mode. That might mean there's some asymmetry, or it's symmetric with the Pro not actually getting 64 pixels per clock in standard fill rate.
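
    As a rough illustration of why the 32 vs 64 ROP question matters, peak pixel fill rate is simply ROP count times clock. The sketch below assumes the publicly quoted 911 MHz PS4 Pro GPU clock and one pixel per ROP per clock; neither figure comes from this thread, and the helper name is made up for the example.

```python
def peak_fill_rate_gpix(rops: int, clock_mhz: float,
                        pixels_per_rop_per_clock: int = 1) -> float:
    """Theoretical peak pixel fill rate in gigapixels per second."""
    return rops * pixels_per_rop_per_clock * clock_mhz / 1000.0


# Assumption: publicly quoted PS4 Pro GPU clock, not stated in the thread.
PRO_CLOCK_MHZ = 911.0

for rops in (32, 64):
    rate = peak_fill_rate_gpix(rops, PRO_CLOCK_MHZ)
    print(f"{rops} ROPs @ {PRO_CLOCK_MHZ:.0f} MHz -> {rate:.1f} Gpix/s peak")
```

    These are ceilings only; achieved fill rate also depends on render target format, blending and available memory bandwidth.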

    I think reviews for the existing APUs mentioned there's something of a hacky relationship between the GPU and CPU domain. There's a GPU memory controller that then plugs into the main memory controller.
    If that is the case, the GPU memory clients would attach directly to a channel in that controller, which would then plug into the actual memory controller.

    I'm not sure what happens with the consoles.
    Vega's change in making ROPs L2 clients and adding the infinity fabric might make the multiple controller setup unnecessary.
     
  5. shredenvain

    Regular

    Joined:
    Sep 12, 2013
    Messages:
    921
    Likes Received:
    189
    Location:
    Somewhere in southern U.S.
    Oh, I don't disagree. They said they were given a direct feed, but only at 29-something fps. Plus, this slice of gameplay might not have enough performance hits to warrant a drop in resolution. The finished product could wind up having a lot of instances of a dynamic framebuffer.
     
  6. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,322
    Likes Received:
    1,120

    Not really equally. AC Origins uses dynamic scaling and will scale down less on X1X according to the developer. Buried somewhere in this article.

    https://www.windowscentral.com/xbox-one-x-demonstrates-real-value-it-ps4-pro-competitor-or-not?amp

    We're getting into splitting hairs at that point. But any way you slice it, X1X is just more powerful. How much the extra power is used will depend.
     
    RootKit and BRiT like this.
  7. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    It should help in more situations than "only" a single shader bottleneck. Async would balance the load from all concurrent shaders. So even if an individual shader isn't ALU bound, it should benefit any concurrent work.

    Indirectly, even if no ALU bottleneck exists, packed FP16 could still result in faster performance by consuming less power, allowing for an increase in clock speed that will directly help with any bottleneck. That's difficult to quantify, but with cards throttling to a specific TDP it's a valid optimization strategy.

    Wasn't the bus more along the lines of 256+128 for GPU and CPU portions of the APU respectively? One memory controller, but partitioned for the different components. The GPU being roughly Polaris 10 with a CPU attached to additional channels. ROPs would therefore map to a 256bit bus directly.
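
    To put numbers on the hypothetical 256+128 split, peak bandwidth is bus width in bytes times the per-pin data rate. This sketch assumes 6.8 Gbps GDDR5, the publicly quoted Xbox One X figure rather than anything stated in this thread; the function name is illustrative only.

```python
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s for a bus of the given width."""
    return bus_bits / 8 * gbps_per_pin


# Assumption: 6.8 Gbps per pin, taken from published X1X specs.
DATA_RATE_GBPS = 6.8

for label, bits in (("full bus", 384), ("GPU share", 256), ("CPU share", 128)):
    print(f"{label:9s} {bits:3d}-bit -> {bandwidth_gb_s(bits, DATA_RATE_GBPS):.1f} GB/s")
```

    Under those assumptions a 128-bit CPU partition would get roughly a third of the total, about 109 GB/s.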
     
  8. Recop

    Veteran Newcomer

    Joined:
    Aug 28, 2015
    Messages:
    1,259
    Likes Received:
    613
    I highly doubt that even at XB1 settings, the X has the power to hit native 4k in sub-native 1080p games : "With that being our focus, we’re running at 4K 30FPS for Campaign/Horde and 4K 60FPS for Versus with adaptive scaling to ensure a rock-solid frame rate that fans expect from our head to head multiplayer."

    http://www.neogaf.com/forum/showpost.php?s=194d898d9bafcbe6f6e044c89e713c9a&p=240179660&postcount=1
     
  9. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,122
    Likes Received:
    2,873
    Location:
    Well within 3d
    I have not seen this claimed.
    I haven't seen details more specific than the headline numbers, but it doesn't strike me as helpful to make this split.
    Getting 32 ROPs to map to a 384-bit bus has already been done, while getting 2 MiB of L2 cache to match this has not. 128 bits of GDDR5 seems like overkill for the CPU portion as well.
     
    BRiT likes this.
  10. shredenvain

    Regular

    Joined:
    Sep 12, 2013
    Messages:
    921
    Likes Received:
    189
    Location:
    Somewhere in southern U.S.
    Yeah I've seen that quote from the Coalition but the way it reads is unclear. To me it reads that the campaign will run at 4k 30fps and the multiplayer will run at Dynamic res and 60fps. It isn't clear though. And I'm not saying that all 1080p Xone games will run at 4k native on the OneX. I'm just saying that I would prefer checkerboard or dynamic res with higher quality geometry, draw distance and effects over native 4k any day.
     
  11. Recop

    Veteran Newcomer

    Joined:
    Aug 28, 2015
    Messages:
    1,259
    Likes Received:
    613
    I understood the same thing :

    - Native 4k for the single player mode
    - Dynamic 4k for the multiplayer mode

    And it's quite understandable since the multiplayer doesn't run at 1080p on XB1. There's the same dynamic resolution on the base model.
     
  12. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    1,943
    Likes Received:
    1,089
    Gears on 1X is not running XO settings, so if anything this goes against what you're saying.
    Gears on 1X is running with much improved settings at 4K. That's despite the fact that even on XO it had dynamic res implemented; I'm not sure how often it dropped resolution there, but we also don't know how often it drops on 1X either.
     
    RootKit likes this.
  13. Recop

    Veteran Newcomer

    Joined:
    Aug 28, 2015
    Messages:
    1,259
    Likes Received:
    613
    According to DF, the single player mode runs at 1080p most of the time on XB1.

    But indeed you're right, I read too quickly. However, the biggest improvements seem to be reserved for the single player mode : "Many of the improvements to Campaign also make it to Versus and Horde, including 4K, HDR, higher resolution textures, improved draw distances, and Dolby Atmos Support."
     
  14. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    1,943
    Likes Received:
    1,089
    I think there were also things like higher geometry.
    Somewhere on the site there is a list of changes, but here it is in interview format:
    https://gearsofwar.com/en-us/community/gears-4-xbox-one-x
     
    RootKit and BRiT like this.
  15. Recop

    Veteran Newcomer

    Joined:
    Aug 28, 2015
    Messages:
    1,259
    Likes Received:
    613
    Yeah, but like I said, those improvements seem to be only present in the single player mode.

    On the multiplayer, you only have better textures + better draw distance.
     
  16. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    1,943
    Likes Received:
    1,089
    That doesn't actually mean these are the only things.
    But even if they were, higher draw distance is something that affects geometry.
    Also, the fact is it may be CPU bound for that game.
    At 30fps we know for a fact it has much higher quality settings, and AC:O and D2 (can't remember which we were talking about) both seem to be 30fps games, so at XO settings they may have reached native 4k with optimization and work.

    Gears has surpassed that. Shame we don't get to see it already though. Should be providing high quality captures to DF for our entertainment :smile2:
     
    RootKit likes this.
  17. Recop

    Veteran Newcomer

    Joined:
    Aug 28, 2015
    Messages:
    1,259
    Likes Received:
    613
    You don't know what the resolution of ACO is on XB1. I guess it's 900p, so it's not comparable to Gears 4, which runs at 1080p most of the time on XB1, at least for the single player.

    But more generally, native 4k has 100% more pixels than 2160C (best efforts on Pro). I don't see how the X could output 100% more pixels than the Pro without some serious downgrades.

    If we take HZD as an example, I think that native 4K is impossible on X even with the same assets used by the Pro. I could be wrong though, but once again, we're talking about 100% more pixels.
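
    The pixel counts behind this comparison can be checked with a few lines. This is a sketch that models checkerboard rendering as shading half the samples of the target frame, which is a simplification; the 1600x900 figure for "900p" and the helper name are assumptions for the example.

```python
def shaded_pixels(width: int, height: int, checkerboard: bool = False) -> int:
    """Pixels shaded per frame; checkerboard is modelled as half the samples."""
    total = width * height
    return total // 2 if checkerboard else total


native_4k = shaded_pixels(3840, 2160)
cb_2160 = shaded_pixels(3840, 2160, checkerboard=True)
full_hd = shaded_pixels(1920, 1080)
res_900p = shaded_pixels(1600, 900)  # assumed 16:9 900p target

print(f"native 4K vs 2160c: {native_4k / cb_2160:.2f}x "
      f"({(native_4k / cb_2160 - 1) * 100:.0f}% more pixels)")
print(f"native 4K vs 1080p: {native_4k / full_hd:.2f}x")
print(f"native 4K vs 900p:  {native_4k / res_900p:.2f}x")
```

    So going from 2160c to native 4K doubles the shaded pixels, while going from 1080p or 900p to native 4K is a 4x or 5.76x increase respectively.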
     
  18. Jay

    Jay
    Veteran Regular

    Joined:
    Aug 3, 2013
    Messages:
    1,943
    Likes Received:
    1,089
    Correct, that's why when you said 1X couldn't do AC:O at XO settings in native 4k, I disputed that and said you don't know that.
    I expect any XO 1080p game at XO settings and fps can easily run on 1X at native 4k, and 900p games may be possible with work and optimization.

    You then tried to use Gears to prove your point, but Gears is doing a lot more than just XO settings; in MP it's unclear just how much more, but it's still higher than XO settings. That in turn proved it could do it.

    Now you're using HZD, and saying it would require a serious downgrade. I'm sure it would.
    But it would be able to run it at XO settings, whatever they may be, for a game running at 1080p, and possibly 900p. How much work would be required if it was running at 900p is unknown (and it may not be possible).

    The point is 1X can run XO games at native 4k at the same settings; 900p games may also be possible but would require work.
    That may not be worth investing in when 1800p upscaled may be good enough, or not even worth trying if the engine already implements CBR.
    It didn't do too badly scaling up a 720p60 game at XO settings to native 4k at around 40fps without much optimization, I believe.

    So native is possible, and the fact that it's using CBR doesn't mean it couldn't run the XO version at native 4k. It's using CBR because the studio has invested in that tech for their engine, which means they get more for a slight drop in IQ.
     
    BRiT likes this.
  19. HMBR

    Regular

    Joined:
    Mar 24, 2009
    Messages:
    416
    Likes Received:
    105
    Location:
    Brazil
    Not really, you can't take it that literally or we would also have 512-bit memory (twice the MCs) and many other things doubled.
    If you look at how AMD typically makes other cards based on the same physical die, they disable SPs/TMUs and keep the ROPs unchanged (like 7870 vs 7850, 480 vs 470 and so on). I think 32 ROPs is pretty much the correct spec for the PS4 Pro.
     
  20. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    That's true, but not as simple as it sounds. Running fp16 instead of fp32 code doesn't directly allow the CU to run more concurrent threads (maximum is still 40 waves). fp16 will save some register space, and this can lead to more concurrency (more waves fit in the register file at the same time). Also, fp16 math completes the heavy ALU portions twice as fast. However, if the shader was bound by something other than ALU, this actually means that the ALU can't hide as much latency anymore. If the GPU simply runs more waves of the same kernel, FP16 doesn't help at all in this case (every wave just waits more). But if a CU has a mixed workload (only possible on AMD GPUs) containing waves from multiple kernels, then FP16 helps, because waves hitting the bottleneck will reach the next memory/filter operation sooner -> wait sooner, allowing waves from other kernels to run more frequently on the same CU.
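
    The register-space point can be sketched as a simple occupancy calculation. The limits below are assumptions based on commonly documented GCN figures (4 SIMDs per CU, at most 10 waves per SIMD, 256 32-bit VGPRs per lane, allocation in blocks of 4), and the register counts in the example are made up; only the 40-wave ceiling comes from the post above.

```python
# Assumed GCN limits (not stated in the thread apart from 40 waves per CU).
MAX_WAVES_PER_SIMD = 10
SIMDS_PER_CU = 4
VGPRS_PER_LANE = 256
VGPR_ALLOC_GRANULARITY = 4


def waves_per_cu(vgprs_per_thread: int) -> int:
    """Waves that fit on one CU when VGPR usage is the only limiter."""
    if vgprs_per_thread <= 0:
        raise ValueError("vgprs_per_thread must be positive")
    # Round the request up to the allocation granularity.
    blocks = -(-vgprs_per_thread // VGPR_ALLOC_GRANULARITY)
    allocated = blocks * VGPR_ALLOC_GRANULARITY
    per_simd = min(MAX_WAVES_PER_SIMD, VGPRS_PER_LANE // allocated)
    return per_simd * SIMDS_PER_CU


# Hypothetical shader: 96 VGPRs in fp32, 64 after packing values as fp16.
for label, vgprs in (("fp32", 96), ("packed fp16", 64), ("small kernel", 24)):
    print(f"{label:12s} {vgprs:3d} VGPRs -> {waves_per_cu(vgprs):2d} waves per CU")
```

    In this made-up case packing doubles the resident waves from 8 to 16, while a kernel that already fits 40 waves gains nothing in occupancy, which matches the point that the 40-wave maximum itself does not move.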

    FP16 is definitely better with games/engines using lots of async compute or compute overlap.
    No console has ever had turbo clocks based on TDP. Saving power doesn't directly give you any performance gains. Power saving is however a very important strategy on mobile phones. Throttling can result in more than a 50% GPU performance drop on modern flagship phones. Double rate FP16 is great when you need to race to sleep. But lately desktop GPU IHVs have also introduced TDP based turbo clocks: modern Nvidia desktop GPUs have pretty high turbo clocks. AMD still doesn't, but Vega's 1.5+ GHz clock rate suggests that AMD is following suit. It is definitely going to be worth thinking about TDP in future high end GPU code. Saving bandwidth and ALU in shaders where those are not the bottleneck is going to be wise.
     