Digital Foundry Article Technical Discussion Archive [2015]

Discussion in 'Console Technology' started by DSoup, Jan 2, 2015.

Thread Status:
Not open for further replies.
  1. DJ12

    Veteran

    Joined:
    Oct 20, 2006
    Messages:
    3,105
    Likes Received:
    198
    Three-way face-off please, in the interests of fairness. Even though it's pulled, the Master Race needs to know their (I suppose I should say our, as I exclusively game on the PC now) version sucks balls.

    NX has a First Contact up: a resolution difference and slight performance disparity for Xbox One, and the PC version missing lots of effects and not running very well on his AMD 7870 system.
     
    BRiT likes this.
  2. function

    function None functional
    Legend Veteran

    Joined:
    Mar 27, 2003
    Messages:
    5,727
    Likes Received:
    4,003
    Location:
    Wrong thread
    Console ports are increasingly kicking the PC in the PCI-E bridge.

    Maybe DX12 and the better control of transfers into video memory that it's supposed to allow can save us...
     
  3. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,853
    Likes Received:
    3,525
    Location:
    Guess...
    They've already published 2 articles highlighting the mess that is the PC version so I'm not sure that's a fair assessment. If they were to do a face off right now, they would be right to exclude the PC version altogether, especially since it's been withdrawn from sale.
     
  4. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,853
    Likes Received:
    3,525
    Location:
    Guess...
    What other games have suffered in this way recently? And what makes you think it has anything to do with PCI-E?
     
  5. DSoup

    DSoup Series Soup
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    15,380
    Likes Received:
    11,491
    Location:
    London, UK
    Indiana Jones and the Fate of Atlantis!
     
  6. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,296
    Location:
    Helsinki, Finland
    Every cross platform studio has some dedicated PS4 and Xbox One developers, especially in the rendering team. Using D3D-style register bindings can reduce the code maintenance cost if your code base is influenced by DirectX. Some studios might prefer this (if they are not CPU bound and need the time to optimize the GPU side better). We have our own rendering API and use code generation instead. The generator emits optimal (branchless) resource binding code for each platform. We always use the closest-to-metal API on every platform. I know other cross platform developers who do the same.

    Now that both consoles have GCN based GPUs and Jaguar CPUs, cross platform developers no longer need to split their optimization efforts between two completely different platforms. Most optimizations help both platforms similarly.
     
    #1086 sebbbi, Jun 26, 2015
    Last edited: Jun 26, 2015
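
sebbbi doesn't post any code, but the general idea can be sketched as follows. This is a minimal, hypothetical illustration (GpuContext, SetSrvSlot, BindLightingPass_Generated and friends are invented stand-ins, not his engine's API): an offline generator turns a declarative resource list into straight-line, branchless binding code per platform, instead of a generic binder that branches on resource type at runtime. The per-platform file is regenerated rather than hand-maintained, which is where the maintenance saving comes from.

Code:
#include <cstdint>
#include <cstdio>

// Stand-in for a per-platform command context. On a real platform these
// writes would go into a command buffer; here they fill arrays so the
// example runs anywhere.
struct GpuContext {
    uint64_t srv[16] = {};
    uint64_t cbv[16] = {};
};

inline void SetSrvSlot(GpuContext& ctx, uint32_t slot, uint64_t desc) { ctx.srv[slot] = desc; }
inline void SetCbvSlot(GpuContext& ctx, uint32_t slot, uint64_t addr) { ctx.cbv[slot] = addr; }

// A generic, hand-written binder: branches on resource type for every
// binding of every draw.
enum class BindType { Srv, Cbv };
struct Binding { BindType type; uint32_t slot; uint64_t value; };

void BindGeneric(GpuContext& ctx, const Binding* bindings, int count)
{
    for (int i = 0; i < count; ++i) {
        switch (bindings[i].type) {
            case BindType::Srv: SetSrvSlot(ctx, bindings[i].slot, bindings[i].value); break;
            case BindType::Cbv: SetCbvSlot(ctx, bindings[i].slot, bindings[i].value); break;
        }
    }
}

// What an offline generator would emit for one pass on one platform:
// straight-line code, no loop, no switch, slot numbers baked in at
// generation time from that platform's shader compiler output.
struct LightingPassResources { uint64_t albedo, normal, shadowMap, frameConstants; };

void BindLightingPass_Generated(GpuContext& ctx, const LightingPassResources& r)
{
    SetSrvSlot(ctx, 0, r.albedo);
    SetSrvSlot(ctx, 1, r.normal);
    SetSrvSlot(ctx, 2, r.shadowMap);
    SetCbvSlot(ctx, 0, r.frameConstants);
}

int main()
{
    GpuContext ctx;
    LightingPassResources res { 0x100, 0x200, 0x300, 0x400 };

    Binding generic[] = {
        { BindType::Srv, 0, res.albedo }, { BindType::Srv, 1, res.normal },
        { BindType::Srv, 2, res.shadowMap }, { BindType::Cbv, 0, res.frameConstants },
    };
    BindGeneric(ctx, generic, 4);            // branchy, generic path
    BindLightingPass_Generated(ctx, res);    // generated, branchless path
    std::printf("srv0 = %llx\n", (unsigned long long)ctx.srv[0]);
}
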
  7. function

    function None functional
    Legend Veteran

    Joined:
    Mar 27, 2003
    Messages:
    5,727
    Likes Received:
    4,003
    Location:
    Wrong thread
    Far Cry 4 springs to mind, just from reading about it (don't own it). Texture streaming seems to be the culprit there.

    I'm pretty sure I've read amongst the DX12 preview articles that explicitly moving textures in and out of GPU memory using DX 11 / 10 / 9 is high overhead, works best with large blocks (whole mipmap levels perhaps?) and can cause performance issues.

    With DX12, controlling movement of data into GPU memory is finer grained and lower overhead, and less likely to interfere with other data being sent to the GPU.

    PCI-E BW should be enough for texture streaming even at high resolutions and frame rates if the transfers could be handled well enough. Just look at what virtual texturing can achieve with a paltry cache of a few MB and mechanical HDD transfer rates!

    Suffice to say, I have high hopes for DX12.
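
To put rough numbers on the bandwidth claim: a back-of-the-envelope sketch using theoretical peak PCIe 3.0 x16 figures (sustained rates are lower, which is exactly the overhead problem being described). The 200-pages-per-frame virtual texturing burst is an assumed example, not a measured one.

Code:
#include <cstdio>

int main()
{
    // Peak PCIe 3.0 x16: 8 GT/s per lane with 128b/130b encoding is roughly
    // 985 MB/s per lane, so ~15.75 GB/s over 16 lanes.
    constexpr double pcie3_x16_gb_s  = 15.75;
    constexpr double fps             = 60.0;
    constexpr double mb_per_frame    = pcie3_x16_gb_s * 1000.0 / fps;    // ~263 MB per 60 fps frame

    // An assumed (not measured) virtual texturing burst: 200 new 128 KB
    // pages uploaded in a single frame.
    constexpr double pages_per_frame = 200.0;
    constexpr double vt_mb_per_frame = pages_per_frame * 0.128;          // ~26 MB

    std::printf("PCIe 3.0 x16 budget per 60 fps frame: ~%.0f MB\n", mb_per_frame);
    std::printf("Example VT upload burst per frame:     %.0f MB (~%.0f%% of the bus)\n",
                vt_mb_per_frame, 100.0 * vt_mb_per_frame / mb_per_frame);
}
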
     
  8. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,853
    Likes Received:
    3,525
    Location:
    Guess...
    It runs fine now (I've got it and it's completely stutter-free on my 2GB GTX 670, pretty much maxed out at higher-than-console settings). I understand it stuttered when first released, but that was resolved after a few patches. No doubt PCs are harder to optimise for than consoles given the limitations of DX11 and the varied hardware configurations, but I'm not seeing a fundamental limitation of PCI-E unless we're talking about latency-sensitive GPGPU operations - which with DX12 would be possible on integrated GPUs.
     
  9. psorcerer

    Regular

    Joined:
    Aug 9, 2004
    Messages:
    732
    Likes Received:
    134
    What "code base"? It's not an enterprise with legacy code.

    Usually debugging generated code is a real nightmare.

    If they are not CPU bound, are they by any chance GPU bound? I want to see these heroes with my own eyes!

    That was also the case in the previous generation, just slightly different: people who wrote good PS3 code also wrote good X360 or PC code.
     
  10. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,296
    Location:
    Helsinki, Finland
    Texture (and other data) streaming to GPU memory is often a cause of stuttering in PC games. This is mostly because DirectX abstracts the resource management (Java / garbage collection syndrome). With abstract resource management the GPU driver has no clue what textures you need in a certain level area. When you bind a resource to the GPU and it is not resident in GPU memory (you can't even query this), the driver notices that it is missing, starts the texture upload and with high likelihood stalls the frame rendering (if a big texture is missing, or multiple small ones). Good manual texture management uses engine knowledge about level design and moves textures to GPU memory just ahead of time, avoiding these stalls.
    Copy queues make data movement cheaper and lower latency. But most importantly, the game engine can tell the GPU what data is needed instead of relying on driver-side black magic to guess it.
    Big game engines have a huge amount of legacy code. Even the first party console studios don't rewrite their whole code base for every project. We are talking about code bases of several million lines here. It would not be commercially viable to rewrite it all during a single project.
    Not if you just generate the small platform-specific command creation part, and if you employ techniques to make debugging easier. Some studios even employ code generation to make runtime code editing faster and easier, improving iteration time. Code generation can be used to make debugging easier instead of harder when used properly.
    SPUs certainly forced people to think about data access patterns and optimize the crap out of the data movement between memory and local store. This of course helps all cache-based architectures, especially ones like Xbox 360 that require manual cache prefetching to perform well. However, Xbox 360 VMX128 code needs a lot of special care to work well. SPUs do not LHS stall, for example, and SPUs have lower instruction latency. The compiler needs lots of parallelism inside each VMX128 loop body to generate good code (and to utilize that huge pool of 128 vector registers). SPU code doesn't require that much unrolling and other tricks to perform as expected.

    The current generation allows you to use exactly the same optimized CPU code on both platforms. This has never been possible before. When you optimize a loop with AVX intrinsics, that code can be used on both consoles. When you optimize a data set to fit the L1 and L2 caches better, it helps both platforms identically, since they both have Jaguar CPUs with the same caches (same size and associativity). When you optimize around CPU bottlenecks and quirks, both consoles can use the same code. This is a big improvement for cross platform developers.

    On the GPU side, you can also optimize the shader code once (for GCN) and expect minimal extra modifications per platform. On PS3 you had to be extra careful about the 32-bit ALUs, branching, interpolants, etc. The Xbox 360 GPU allowed more advanced techniques, but only if you had the time to fully rewrite your lighting/etc. code separately for PS3. Some devs did lighting and post processing on SPUs (very different code indeed compared to Xbox 360 shader code).
     
    #1090 sebbbi, Jun 27, 2015
    Last edited: Jun 27, 2015
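
A minimal sketch of the "move textures to GPU memory just ahead of time" idea from the post above, under an entirely made-up engine interface (StreamingCell, CopyQueue and RequestUpload are hypothetical names, not any real API): the engine, which knows the level layout, kicks off uploads for the next area before the renderer ever binds those textures, so no draw discovers a non-resident resource mid-frame.

Code:
#include <cstdint>
#include <cstdio>
#include <vector>

struct TextureId { uint32_t value; };

// The texture set a given level area needs, known offline from level design
// or baked streaming data.
struct StreamingCell {
    std::vector<TextureId> textures;
};

// Stand-in for an async copy queue. In a real engine this would record
// copies into GPU memory and signal a fence; here it just logs the request.
struct CopyQueue {
    void RequestUpload(TextureId t) { std::printf("upload texture %u\n", t.value); }
};

// Called when the streaming system can see which cell the player is heading
// towards. Uploads start frames before first use.
void PrefetchForCell(CopyQueue& queue, const StreamingCell& next)
{
    for (TextureId t : next.textures)
        queue.RequestUpload(t);
}

int main()
{
    CopyQueue queue;
    StreamingCell nextArea { { {7}, {12}, {31} } };
    PrefetchForCell(queue, nextArea);
}
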
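And a trivial example of the "optimize a loop once with AVX intrinsics and use it on both consoles" point. Both consoles' Jaguar cores expose AVX (but not FMA), so a loop like this needs no per-platform variant; it is plain example code, not taken from any shipped engine.

Code:
#include <immintrin.h>
#include <cstdio>

// y[i] += a * x[i], eight floats per iteration. Separate multiply and add
// because Jaguar has AVX but no FMA.
void saxpy_avx(float* y, const float* x, float a, int count)
{
    const __m256 va = _mm256_set1_ps(a);
    int i = 0;
    for (; i + 8 <= count; i += 8) {
        __m256 vx = _mm256_loadu_ps(x + i);
        __m256 vy = _mm256_loadu_ps(y + i);
        vy = _mm256_add_ps(vy, _mm256_mul_ps(va, vx));
        _mm256_storeu_ps(y + i, vy);
    }
    for (; i < count; ++i)    // scalar tail for counts not divisible by 8
        y[i] += a * x[i];
}

int main()
{
    float x[16], y[16];
    for (int i = 0; i < 16; ++i) { x[i] = float(i); y[i] = 1.0f; }
    saxpy_avx(y, x, 2.0f, 16);
    std::printf("y[5] = %.1f\n", y[5]);   // 1 + 2*5 = 11.0
}
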
  11. TheWretched

    Regular

    Joined:
    Oct 7, 2008
    Messages:
    830
    Likes Received:
    23
    I've played around a bit with Batman these days (got it for free with my GPU)... it's an interesting beast, really.
    So, I should easily be getting an average of about 60Hz at 1080P or thereabouts. Or something along the lines of 30Hz at 4K with reduced options (which there aren't many of in Batman).

    Either way... the "PC Performance Test". For comparison's sake, I ran it twice. With VSync on!

    4K
    First run: 19fps lowest, 28fps average
    Second run: 11fps lowest 21fps average

    That's a disparity I can't really believe. I know VSync is partly to blame here, as I might've just managed to go above 30Hz in the first test and was running barely below 30Hz in the second, but... that lowest figure is... impressive. Also, from the get-go it used north of 8GB of RAM, and the second run topped out at north of 9GB. Not to mention the heavy swapping within my GPU (a 970 with its 3.5GB of fast RAM).

    Looking at The Witcher 3, I can manage 4K at 30Hz with medium to ultra details (and all the IQ-destroying post-processing disabled). And that game has such a long view distance... it really boggles my mind that Batman can't reach that, even in static scenes where there's no real texture streaming happening.
     
    pjbliverpool likes this.
  12. psorcerer

    Regular

    Joined:
    Aug 9, 2004
    Messages:
    732
    Likes Received:
    134
    If you employ scene management and other "change management" code in your "engine" then I can see why it is millions of lines long (you do not "draw mesh", you need to "place it in the scene", set up "modes" and other dependencies, etc.). But if you just immediately "draw triangles" (that's what the hardware is optimized for) I don't see why the code should be so complex.
    Legacy code for current consoles is "being bound by D3D-thinking".

    Depends on what you call "code generation". If it's some clever component-based templates I totally see why it can be easier and faster, but if it's real code generation...
    Hmm, on the other hand I do remember using codegen to get some aspect-like behavior useful for debugging, maybe you're right...

    Yeah, "backward compatibility" MSFT initiative will show just how many games used "manual optimizations" for that (I mean there is no way to emulate 3GHz CPU on 1.2GHz one if threads are that optimized).
    But judging from MSFT even coming with this initiative I would say that they believe X360 CPU was heavily underused...

    Then you'd better move it to GPU compute and forget about it.

    Then you're doing it wrong. There is no performance to find there. Just use compute.
    And, to pre-empt "it's GPU bound": to this day I have never seen a GPU-bound game in real life (GPU-bound = uses 100% of all GPU ALUs all the time).
     
  13. Shortbread

    Shortbread Island Hopper
    Legend Veteran

    Joined:
    Jul 1, 2013
    Messages:
    5,393
    Likes Received:
    4,514
    Batman Arkham Face-Off

     
    DSoup likes this.
  14. Allandor

    Regular Newcomer

    Joined:
    Oct 6, 2013
    Messages:
    690
    Likes Received:
    659
    No, GPU bound means that the bottleneck is in at least one part of the GPU. You will never utilize a GPU 100%, if you really mean 100%. You can't even use the ROPs 100%, because of memory bandwidth.
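
A quick worked example of that last point, using the PS4's publicly quoted figures (32 ROPs, an 800MHz GPU clock, 176GB/s of GDDR5 bandwidth) purely as an illustration:

Code:
#include <cstdio>

int main()
{
    constexpr double rops      = 32;     // colour writes per clock
    constexpr double clock_ghz = 0.8;    // GPU clock in GHz
    constexpr double bytes_px  = 4;      // 32-bit colour target
    constexpr double mem_gb_s  = 176;    // total GDDR5 bandwidth

    double write_only = rops * bytes_px * clock_ghz;   // 102.4 GB/s of colour writes
    double blended    = write_only * 2.0;              // read + write: 204.8 GB/s

    std::printf("ROP colour writes at full rate:   %.1f GB/s\n", write_only);
    std::printf("With alpha blending (read+write): %.1f GB/s\n", blended);
    std::printf("Total memory bandwidth:           %.1f GB/s\n", mem_gb_s);
    // Blending alone would already oversubscribe the bus, before texturing,
    // vertex fetch or the CPU touch memory at all.
}
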
     
  15. Globalisateur

    Globalisateur Globby
    Veteran Regular Subscriber

    Joined:
    Nov 6, 2013
    Messages:
    4,304
    Likes Received:
    3,194
    Location:
    France
    Eh, now that we are definitely heading toward a spin-off thread, I wanted to know: while the ROPs are fully using the main memory bandwidth, is it still possible to have the ALUs process something purely within the GPU caches?
     
  16. Billy Idol

    Legend Veteran

    Joined:
    Mar 17, 2009
    Messages:
    6,032
    Likes Received:
    873
    Location:
    Europe
    An open world game with a relatively large amount of physics: where is the CPU advantage of the One?
     
  17. Starx

    Regular Newcomer

    Joined:
    Sep 29, 2013
    Messages:
    294
    Likes Received:
    148
    Batman: Arkham Knight receives a patch on PC
    Changelog:
    - Fixed a crash that was happening for some users when exiting the game
    - Fixed a bug which disabled rain effects and ambient occlusion. We are actively looking into fixing other bugs to improve this further
    - Corrected an issue that was causing Steam to re-download the game when verifying the integrity of the game cache through the Steam client
    - Fixed a bug that caused the game to crash when turning off Motion Blur in BmSystemSettings.ini. A future patch will enable this in the graphics settings menu
     
  18. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    43,576
    Likes Received:
    16,034
    Location:
    Under my bridge
    What advantage would you expect to see? If the CPU requirements are capped at what PS4 is capable of (enforced parity), no advantage would be visible. From the sounds of it, the game is well balanced to not put moments of considerable framerate-trashing stress on the CPU, and it's only the occasional GPU spike that hampers the 30 fps.
     
    BRiT likes this.
  19. chris1515

    Legend Regular

    Joined:
    Jul 24, 2005
    Messages:
    6,737
    Likes Received:
    7,330
    Location:
    Barcelona Spain
    Maybe part of the physics runs on the GPU. Raycasting for visibility or other tasks, and cloth physics like Ubi does it, are good tasks for GPGPU.
     
    #1099 chris1515, Jun 28, 2015
    Last edited: Jun 28, 2015
    psorcerer likes this.
  20. Billy Idol

    Legend Veteran

    Joined:
    Mar 17, 2009
    Messages:
    6,032
    Likes Received:
    873
    Location:
    Europe
    If they choose parity for the CPU, why not parity for the GPU then??

    What advantages would I expect? Well, lots of NPCs (AI) in an open world game, lots of physics... shouldn't this put stress on the CPU?

    Didn't we hear the NPC argument in the case of AC Unity?

    Can't sudden physics interactions spike CPU usage?
     