Killzone 2 technology discussion thread (renamed)

Discussion in 'Console Technology' started by Terarrim, Jun 12, 2007.

Thread Status:
Not open for further replies.
  1. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    btw... hasn't anyone noticed that, according to the presentation, they are actually supersampling in the lighting pass? That's not going to give any particular advantage, as the subsamples will mostly have the same values within a pixel, but if they implement a more clever scheme for the lighting pass they will likely see big speed improvements.
     
  2. HumbleGuy

    Newcomer

    Joined:
    Jul 27, 2007
    Messages:
    9
    Likes Received:
    0
    Maybe they have an edge on almost every pixel, so it was not worth it. That would be in line with those 1 million triangles. Joking of course... :grin: :grin:
     
  3. HumbleGuy

    Newcomer

    Joined:
    Jul 27, 2007
    Messages:
    9
    Likes Received:
    0
    ...hit Submit instead of Preview...

    nAo - how would you do this? Shader branching, or somewhere outside the shader?
    The presentation claims that they optimize by sharing shadow taps between samples, which gets performance comparable to the non-MSAA case.
     
  4. Laa-Yosh

    Laa-Yosh I can has custom title?
    Legend Subscriber

    Joined:
    Feb 12, 2002
    Messages:
    9,568
    Likes Received:
    1,455
    Location:
    Budapest, Hungary
    Also consider that they have to fill the G-buffer every frame. It's quite a large buffer, already two and a half times as big as that of a simple forward renderer with single-pass lighting... and then a forward rendering pass...

    Then again, it seems they've only dropped specular color and reflections, which would only be important for some materials (metals work better with it, for example), and apart from the 2x AA they can also implement an edge blur filter based on the depth/normals pass, which would smooth out the image a bit more (a common trick in offline CG compositing, too). Or they might already do this, considering the low amount of aliasing in the screenshots.
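
    To make the edge blur idea concrete, here is a minimal CPU-side sketch in Python, not anything from the presentation. It assumes a 1D grayscale scanline, a linear depth value per pixel and a made-up depth threshold; the function name `edge_blur` and all the sample values are hypothetical. Pixels sitting on a depth discontinuity are averaged with their neighbours, everything else is left alone.

```python
def edge_blur(colors, depths, threshold=0.1):
    """Blur only the pixels that sit on a depth discontinuity.

    colors: grayscale values, one per pixel (hypothetical test data)
    depths: linear depth values, one per pixel
    threshold: assumed depth difference that counts as an edge
    """
    out = list(colors)
    for i in range(1, len(colors) - 1):
        is_edge = (abs(depths[i] - depths[i - 1]) > threshold
                   or abs(depths[i] - depths[i + 1]) > threshold)
        if is_edge:
            # 3-tap box filter, always reading the unblurred input
            out[i] = (colors[i - 1] + colors[i] + colors[i + 1]) / 3.0
    return out


# A dark near object next to a bright far one
colors = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
depths = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
blurred = edge_blur(colors, depths)
print(blurred)
```

    Only the two pixels on either side of the depth jump change; a real filter would work in 2D and would probably look at normals as well.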

    And also, they've made a very clever artistic decision with the post processing pass, which can greatly enhance the results and give it more of a stylized, offline CG-look. I'd even risk that id added their post stuff to Rage after seeing this - they've mentioned that it was a recent and overnight development ;)
     
  5. morlock

    Regular

    Joined:
    Dec 17, 2006
    Messages:
    275
    Likes Received:
    0
    Location:
    Sweden
    So all physics is on the PPU, and two SPUs are not in use? Do the blanks mean that the SPUs idle a lot?

    [​IMG]
     
  6. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    I'd run a full-screen pass that reads all the subsamples of one particular G-buffer parameter (say albedo) and generates a stencil mask that can be used later to early-reject all the pixels whose subsamples are not the same. That way you could shade every light twice: one pass processes only one subsample, the other works on both subsamples. (The same mask could be reused for deferred shadowing as well, cutting shadowing times.)
    A shader that generates such a mask would be something like this:

    Code:
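    half4 generateStencilEdgesMask( uniform sampler2D multisampledBuffer, float2 uvLeft : TEXCOORD0, float2 uvRight : TEXCOORD1) : COLOR
    {
       // fetch the two subsamples of the same pixel
       half3 leftSample = tex2D( multisampledBuffer, uvLeft).rgb;
       half3 rightSample = tex2D( multisampledBuffer, uvRight).rgb;
    
       // subsamples differ -> edge pixel: kill the fragment so stencil is not written
       if ( any( leftSample != rightSample))
          discard;
    
       // color writes are disabled, so the returned value is irrelevant
       return half4(0.0f,0.0f,0.0f,0.0f);
    }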
    Disable color writes and z writes, and set the stencil op to do something (like writing some constant value to the stencil buffer), as long as it's something that can be mixed later with lighting operations that already use the stencil buffer to reject pixels not affected by lights.
    Obviously this stuff is not meant to work for real, it's purely fictional, but it should give a rough idea of what I'm talking about.
    EDIT: just realized that this stuff only works if current multisampling implementations supersample depth AND stencil, and not just depth. Unfortunately I don't know if they do that, though it would make a lot of sense.
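
    For a rough feel of what such a mask could save, here is a small CPU-side Python sketch. It is only an illustration of the idea, not GPU code and not taken from the presentation: the helper names `build_edge_mask` and `shading_cost` and the 6-pixel scanline are made up, and it assumes 2x multisampling with exact comparison of the two albedo subsamples. Note that this mask flags the edge pixels, whereas the shader above discards them and writes stencil for the interior ones.

```python
def build_edge_mask(samples):
    """samples: one (left, right) pair of subsample values per pixel.
    Returns True where the two subsamples differ, i.e. an edge pixel."""
    return [left != right for left, right in samples]


def shading_cost(edge_mask, samples_per_pixel=2):
    """Lighting evaluations per light: one for interior pixels,
    one per subsample for edge pixels."""
    return sum(samples_per_pixel if is_edge else 1 for is_edge in edge_mask)


# Hypothetical scanline of albedo subsamples (RGB tuples), one edge in the middle
red, blue = (255, 0, 0), (0, 0, 255)
scanline = [(red, red), (red, red), (red, blue),
            (blue, blue), (blue, blue), (blue, blue)]

mask = build_edge_mask(scanline)
brute_force = 2 * len(scanline)   # supersampling: shade every subsample
masked = shading_cost(mask)       # per-pixel shading except on edges
print(mask, masked, brute_force)
```

    With a single edge pixel out of six, the masked version does 7 lighting evaluations per light instead of 12; the fewer edge pixels on screen, the closer the cost gets to the non-MSAA case.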
     
  7. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    I'd say that their post-processing stuff is the most innovative part of their work... too bad they didn't talk much about it :)
     
  8. "Nerve-Damage"

    Regular

    Joined:
    Nov 24, 2005
    Messages:
    809
    Likes Received:
    14

    Same thing that I was thinking. Lots of idle time indeed...

    * SPU 4???
    * SPU 5???
    * SPU 6 (OS)
     
  9. Arnold Beckenbauer

    Veteran Subscriber

    Joined:
    Oct 11, 2006
    Messages:
    1,756
    Likes Received:
    722
    Location:
    Germany
  10. HumbleGuy

    Newcomer

    Joined:
    Jul 27, 2007
    Messages:
    9
    Likes Received:
    0
    Thanks, this is what I was missing completely in my bandwidth calculations!

    As far as I remember the talk, he said that game, AI and physics all run a number of their own SPU tasks, and the gray boxes are supposedly these tasks. He did not focus on this part at all (as it is not part of rendering), so I suppose it's there just to give a rough idea that there is SPU activity before the PPU hits the rendering part. I would say physics of that scale has to be on the SPUs.

    The flow of that part of presentation was roughly:
    "PPU orchestrates game logic, AI and physics (with their own SPU tasks), then there is time to prepare the draw data that changed (again with some SPU tasks). The display list generator is launched as soon as possible on an SPU, and it launches its own sub-tasks that do skinning, edge geom... then there is shadow map rendering in parallel. And in the meantime the PPU moves on to update logic for the next frame (the red lock thing + the next bars on the PPU side)..."
    As I remember it, all of the "colored" SPU bars belong to the rendering of one frame (so not to the PPU stuff after, and including, the "data lock" bar). So rendering leaves the PPU at a single point during the "Prepare Draw" bar, and the PPU can do general stuff for the next frame.

    He did not say anything about the picture showing the real load, and to me it just looks like a very rough high-level overview for the masses.

    Fran was there as well, so he might fill in what I missed.

    Thanks nAo. That sounds like a very practical and logical solution. I wonder - maybe they were doing it already but traded it for the early stencil culling used for light volumes and the "sun" (considering how much they rely on early stencil culling in the light pass).
    I would say both are properly per-sample on edges (depth and stencil are stored in 32 bits in the end).
     
  11. Laa-Yosh

    Laa-Yosh I can has custom title?
    Legend Subscriber

    Joined:
    Feb 12, 2002
    Messages:
    9,568
    Likes Received:
    1,455
    Location:
    Budapest, Hungary
    Obviously they've only talked about rendering related stuff, the rest of the SPU processing time is dedicated to game code, animation blending, AI and such...
     
  12. morlock

    Regular

    Joined:
    Dec 17, 2006
    Messages:
    275
    Likes Received:
    0
    Location:
    Sweden
    HumbleGuy: Thanks for clearing some things up. I would guess that either they have lots of room for improvement or that slide doesn't cover the entire engine, and the latter seems the most logical. :)
    Are there any games out now that have "contact shadows"? I'd reckon it should improve how objects seem to belong to the world, with characters really touching the surfaces and such.

    The to-do list sounds like some pretty work-heavy stuff for the engine.
     
  13. Laa-Yosh

    Laa-Yosh I can has custom title?
    Legend Subscriber

    Joined:
    Feb 12, 2002
    Messages:
    9,568
    Likes Received:
    1,455
    Location:
    Budapest, Hungary
    To me it seems that it's the result of Guerrilla taking a long, thorough look at the compositing pipeline at Axis Animation, at what they did to their raw renders for the E3 2005 trailer. Color correction can go a long way in the hands of some good artists, and it helps to make the look of separately textured assets more consistent. You can also use it to simulate fog and atmosphere.
    I can check out what our compositors are doing in more detail on Monday if you're interested ;) but unfortunately no before/after images...
     
  14. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
    There are games out there that do 'contact shadows'.
     
  15. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    Yep, one 32-bit z-buffer and four RGBA8 color buffers, times 2, as they have 2x multisampling.
    That's 40 bytes per pixel -> roughly 36 megs at 720p resolution
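
    The arithmetic behind those numbers, spelled out as a small Python sketch. The G-buffer layout is the one described above; the forward-renderer baseline (a single RGBA8 color buffer plus a 32-bit depth buffer, also at 2x multisampling) is only an assumption added for comparison.

```python
WIDTH, HEIGHT = 1280, 720    # 720p
MSAA_SAMPLES = 2             # 2x multisampling
BYTES_DEPTH = 4              # one 32-bit z-buffer
BYTES_RGBA8 = 4              # one RGBA8 color buffer
GBUFFER_COLOR_TARGETS = 4    # four RGBA8 color buffers

gbuffer_bpp = (BYTES_DEPTH + GBUFFER_COLOR_TARGETS * BYTES_RGBA8) * MSAA_SAMPLES
gbuffer_bytes = gbuffer_bpp * WIDTH * HEIGHT

# Assumed baseline: forward renderer with one color buffer + depth
forward_bpp = (BYTES_DEPTH + BYTES_RGBA8) * MSAA_SAMPLES

print(gbuffer_bpp)                 # 40 bytes per pixel
print(gbuffer_bytes)               # 36864000 bytes
print(gbuffer_bytes / 2**20)       # 35.15625 MiB
print(gbuffer_bpp / forward_bpp)   # 2.5
```

    So "36 megs" is 36.9 million bytes, or about 35.2 MiB, and under the assumed baseline the G-buffer comes out 2.5 times the size of a forward renderer's buffers.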
     
  16. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    Note from the SPUs usage slide that they use EDGE to cut the number of triangles pushed to RSX
     
  17. Diesel2

    Newcomer

    Joined:
    Jun 14, 2005
    Messages:
    88
    Likes Received:
    2
    The PDF specifically mentions "Game, AI, Physics" running on the PPU and 4 SPEs.
     
  18. dantruon

    Regular

    Joined:
    Apr 5, 2004
    Messages:
    487
    Likes Received:
    2
    HOLY COW !!! what does it all mean though? :lol:
     
  19. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    I think they just went for the most straightforward solution. The game will be out next year, so they still have time to implement more clever AA schemes. The nice thing is how flexible last year's console can be compared to even more modern DX10 hw.
     
  20. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    It means that a deferred renderer consumes more frame buffer memory than a common forward renderer
     