far cry ps3 and stuff

Discussion in 'Architecture and Products' started by pocketmoon66, Jun 29, 2004.

  1. jb

    jb
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,636
    Likes Received:
    7
    Why, where does LeGreg work? Come on, tell us :)
     
  2. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Mature? In what way?
     
  3. poly-gone

    Newcomer

    Joined:
    May 22, 2004
    Messages:
    93
    Likes Received:
    0
    HDR is not about simply accumulating colors which finally amount to more than 1.0f. By doing that, all you'll get is a washed out scene. HDR also includes "tone mapping", which scales the over-bright scene into the viewable range. You will need shaders to do this.

    Apart from tone mapping, there's also the bloom effect, which simulates blooming and color bleeding of extremely bright areas of the scene. You'll again need shaders to do this efficiently. So, at some point you WILL NEED shaders to "do" HDR. FP blending is NOT the replacement "technique" for shaders. It only simplifies things a little.
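As a rough illustration of the tone mapping point above, the following minimal sketch compares naive clamping with a tone mapping curve. The post does not name an operator, so the Reinhard-style curve x / (1 + x) is an assumption chosen for brevity, and the function names and sample colors are hypothetical.

```python
def clamp(color):
    """Naive approach: clip each channel to [0, 1]. Over-bright detail is lost."""
    return tuple(min(max(x, 0.0), 1.0) for x in color)


def reinhard(color, exposure=1.0):
    """Reinhard-style tone mapping (an assumed operator, not one named in the thread).

    Each channel is scaled by exposure and mapped through x / (1 + x),
    which compresses [0, inf) into [0, 1).
    """
    return tuple((x * exposure) / (1.0 + x * exposure) for x in color)


# Two over-bright HDR colors with different intensities (hypothetical values).
bright = (4.0, 2.0, 1.0)
very_bright = (8.0, 4.0, 2.0)

print(clamp(bright), clamp(very_bright))        # both collapse to the same white
print(reinhard(bright), reinhard(very_bright))  # both stay distinct and below 1.0
```

With clamping, both colors end up as identical white, which is the washed-out look described above. The tone-mapped values keep their relative differences inside the displayable range.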
     
  4. Mr. Travis

    Newcomer

    Joined:
    May 28, 2004
    Messages:
    25
    Likes Received:
    0
    darn crytek... keep showing off more and more features and doing less and less to get the sdk out :x
     
  5. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    I'm saying that both solve the problem, but if you're using one there's no point in using the other. If you've already culled unlit fragments with the stencil test, there's no need anymore to do an early out from a ps3.0 shader.

    This technique is really quite general. It can be used in any situation where you're optimizing by an if-statement in ps3.0. Just render "if" in a separate shader in the first pass, and in the second pass you render "then", and "else" in a third pass if you need it. Since the hardware does early culling of fragments based on the stencil test, the cost of this is hardly any higher than if it had been a single pass.
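The if/then/else split described above can be sketched as a CPU-side simulation. This is not shader code and not the demo's source: the list `stencil` stands in for the stencil buffer, the loops stand in for render passes, and all function names are hypothetical.

```python
def branch_shader(pixels, cond, then_fn, else_fn):
    """Single pass with a per-pixel dynamic branch (the ps3.0-style approach)."""
    return [then_fn(p) if cond(p) else else_fn(p) for p in pixels]


def stencil_passes(pixels, cond, then_fn, else_fn):
    """Three passes: "if" writes the stencil, "then" and "else" are stencil-tested."""
    # Pass 1: evaluate only the condition and record the result per pixel.
    stencil = [1 if cond(p) else 0 for p in pixels]
    framebuffer = [None] * len(pixels)
    # Pass 2: run "then" only where stencil == 1; other pixels are culled.
    for i, p in enumerate(pixels):
        if stencil[i] == 1:
            framebuffer[i] = then_fn(p)
    # Pass 3: run "else" only where stencil == 0.
    for i, p in enumerate(pixels):
        if stencil[i] == 0:
            framebuffer[i] = else_fn(p)
    return framebuffer


# Hypothetical stand-ins for the condition and the two shader bodies.
pixels = list(range(8))
is_even = lambda p: p % 2 == 0
double = lambda p: p * 2
negate = lambda p: -p

print(branch_shader(pixels, is_even, double, negate))
print(stencil_passes(pixels, is_even, double, negate))
```

Both functions produce the same image. The difference on real hardware would be where the rejection happens: inside the shader for the branch, or in the stencil test before the shader runs.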

    The situations where this can't be applied are atypical scenarios. I have a hard time coming up with one. I was going to say Mandelbrot rendering, where you drop out as soon as length(z) > 2, but now that I think of it you can apply this technique even in this situation. It will be a bit trickier, with rendering back and forth between render targets, and you'll probably have to use a granularity of several loop iterations per pass to see any speed-up, but it should be doable. In this case ps3.0 may win. In the common situations, however, such as speeding up lighting, I'm confident this technique will come pretty darn close to ps3.0 or even beat it, depending on how costly dynamic branches are on nVidia's hardware.
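One way to picture the Mandelbrot variant is the sketch below, again a CPU-side simulation under stated assumptions rather than the actual demo. The list `z` plays the role of the render target carrying state between passes, `stencil` marks pixels that are still iterating, and `chunk` is the loop granularity per pass. The names, sample points, and chunk size are all hypothetical.

```python
def escapes_reference(c, max_iter):
    """Per-pixel early-out, as a dynamic branch would do it."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return True
    return False


def escapes_chunked(points, max_iter, chunk):
    """Multi-pass version: each pass runs `chunk` iterations with no branching,
    then pixels that have escaped are masked out via the stencil."""
    z = [0j] * len(points)        # state carried between passes (render target)
    stencil = [1] * len(points)   # 1 = still iterating, 0 = culled
    work = 0                      # total iterations actually executed
    for _ in range(0, max_iter, chunk):
        for i, c in enumerate(points):
            if stencil[i] == 0:
                continue          # culled before any shader work is done
            for _ in range(chunk):
                z[i] = z[i] * z[i] + c
                work += 1
            # For |c| <= 2, |z| keeps growing once it exceeds 2,
            # so testing only at the end of a chunk is sufficient.
            if abs(z[i]) > 2.0:
                stencil[i] = 0
    return [s == 0 for s in stencil], work


points = [0j, -1 + 0j, 1 + 0j, 0.5 + 0j, 2 + 0j]
escaped, work = escapes_chunked(points, max_iter=32, chunk=4)
print(escaped, work, "of", 32 * len(points), "iterations without any early-out")
```

A larger `chunk` means fewer passes but more wasted iterations after a pixel has escaped, which is the granularity trade-off mentioned above.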
     
  6. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Don't fixate on the number of passes, because that's the point of this technique. The overhead of multipass is near zero. Even if ps3.0 can do it in a single pass, it's not doing much less actual work. (Well, two passes actually, because even with ps3.0 you'd still want to do that depth-only pass.)

    Imagine a scene with a light. The light has a limited radius, so only half the scene is lit by this light. Any part of the scene that's beyond, for instance, 100 units is completely in the dark, and thus need not be shaded.

    So in ps3.0 you simply check if length(lightVec) < 100, and if so you go through the usual lighting code, otherwise immediately return zero. This means that the total workload is that the if-statement is run for the whole scene, and lighting for half the scene.

    With my technique, you first render the same if-statement for the whole scene. Then in the next pass you draw the lighting where stencil = 1. Total workload is if-statement for the whole scene plus lighting for half the scene, which is the same as in the ps3.0 case.

    So it boils down to the question of what's more costly, cycles spent on dynamic branching, or cycles spent on early culling with stencil. And I'm not so sure dynamic branching will turn out as the winner of that battle, cause stencil culling is really fast. I'm not sure if our hardware does something similar to Hierarchical-Z with stencil too and culls full tiles (maybe someone who knows can fill in), but my guess is that it does this, cause the cost is really very low.
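The workload comparison in this post can be checked with a small counting sketch. It assumes a hypothetical scene of 200 pixels whose distance to the light runs from 0 to 199, so that half of them fall inside the 100-unit radius, and it uses a made-up linear falloff as a stand-in for the "usual lighting code". It counts shader evaluations only and says nothing about real hardware cost.

```python
LIGHT_RADIUS = 100.0


def lighting(dist):
    """Hypothetical stand-in for the lighting code: simple linear falloff."""
    return 1.0 - dist / LIGHT_RADIUS


def branch_single_pass(distances):
    """ps3.0-style: the if-statement and the lighting live in one shader."""
    counts = {"if": 0, "lighting": 0}
    out = []
    for d in distances:
        counts["if"] += 1
        if d < LIGHT_RADIUS:
            counts["lighting"] += 1
            out.append(lighting(d))
        else:
            out.append(0.0)
    return out, counts


def stencil_two_pass(distances):
    """Stencil-style: pass 1 runs the if-statement, pass 2 lights where stencil = 1."""
    counts = {"if": 0, "lighting": 0}
    stencil = []
    for d in distances:
        counts["if"] += 1
        stencil.append(1 if d < LIGHT_RADIUS else 0)
    out = [0.0] * len(distances)
    for i, d in enumerate(distances):
        if stencil[i] == 1:  # stencil test, not counted as shader work
            counts["lighting"] += 1
            out[i] = lighting(d)
    return out, counts


distances = [float(d) for d in range(200)]
branch_out, branch_counts = branch_single_pass(distances)
stencil_out, stencil_counts = stencil_two_pass(distances)
print(branch_counts, stencil_counts)
```

Both variants evaluate the condition for every pixel and the lighting for half of them, so the remaining difference is the per-pixel cost of a dynamic branch versus a stencil rejection, which this sketch does not model.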
     
  7. pat777

    Newcomer

    Joined:
    May 19, 2004
    Messages:
    230
    Likes Received:
    0
    I've heard someone say that your technique isn't really new. I think nVIDIA can switch back and forth between your technique and dynamic branching, depending on which is more suitable for the given situation.

    I still think there are a lot of advantages of dynamic branching (other than this situation) to be discovered. We haven't used anywhere near half the effects PS2.0 is capable of. I'm sure there are a lot of useful effects that can be dramatically sped up by dynamic branching.
     
  8. DSN2K

    Newcomer

    Joined:
    Oct 4, 2003
    Messages:
    146
    Likes Received:
    3
    I knew waiting to buy this would work out better.

    I'm going to play Far Cry in its fall glory.... 8)
     
  9. FUDie

    Regular

    Joined:
    Sep 25, 2002
    Messages:
    581
    Likes Received:
    34
    If you had a Radeon 9500 or better, you could have been enjoying Far Cry in its "fall glory" (sic) all along. ;)

    -FUDie
     
  10. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    I seriously doubt a 9500 can run Far Cry at its best.
     
  11. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Well, I'm sure someone has thought of it before. It's pretty simple so it wouldn't surprise me. If nothing else I have to credit my colleague Guennadi who initially brought up the idea.

    Sure this technique would work on nVidia cards too (assuming that they do top of the pipe stencil culling), but there's no way the driver can just decide to turn a dynamic branching shader into this technique.
     
  12. 991060

    Regular

    Joined:
    Jul 29, 2003
    Messages:
    640
    Likes Received:
    2
    Location:
    Beijing
    Humus, wouldn't your technique place more burden on the vertex shader and memory bandwidth? I remember R3xx/R4xx's H-Z can only be efficient when both the depth and stencil buffers are cleared, so here's another question. And what if you want to simulate nested branches? Does the technique still apply?

    edit: just realized the H-Z wouldn't be a problem, you have a depth-fill pass. :wink:
     
  13. cho

    cho
    Regular

    Joined:
    Feb 9, 2002
    Messages:
    422
    Likes Received:
    16
    Humus, will Crytek adopt your method to speed up shadow rendering in Far Cry? :wink:
     
  14. Evildeus

    Veteran

    Joined:
    May 24, 2002
    Messages:
    2,657
    Likes Received:
    2
    FYI:
    http://www.farcry.ubi.com/
     
  15. FUDie

    Regular

    Joined:
    Sep 25, 2002
    Messages:
    581
    Likes Received:
    34
    My point was that you don't have to wait for a patch to access all the features of the engine. NVIDIA cards are running with PS 1.1 in place of many (all?) PS 2.0 effects.

    -FUDie
     
  16. Evildeus

    Veteran

    Joined:
    May 24, 2002
    Messages:
    2,657
    Likes Received:
    2
    You can change to the R3** path (at a cost of performance on Nv3*)
     
  17. _arsil

    Newcomer

    Joined:
    Mar 7, 2003
    Messages:
    11
    Likes Received:
    0

    This technique isn't new. See:

    PEERCY, M. S., OLANO, M., AIREY, J., AND UNGAR, P. J. 2000.
    Interactive multi-pass programmable shading. Proceedings of ACM SIGGRAPH 2000

    It is also mentioned in:

    Timothy J. Purcell Ian Buck William R. Mark Pat Hanrahan
    Ray Tracing on Programmable Graphics Hardware

    and dozens of other papers...


    It also isn't a replacement for flow control in pixel shaders. Simulating flow control using early tests (like stencil or Z-buffer) requires multi-pass rendering, which means you have to submit all geometry multiple times. If your vertex shaders are complex or you have a lot of geometry, it isn't a solution.

    Also simulating nested loops, nested ifs and jumps isn't an easy problem and will certainly break your shader into 10 or 15 passes.
     
  18. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    Sorry, but I fail to see what's new in Humus's technique. Developers do these kinds of 'tricks' all the time. It's a pretty basic and well-known technique :oops:
     
  19. 991060

    Regular

    Joined:
    Jul 29, 2003
    Messages:
    640
    Likes Received:
    2
    Location:
    Beijing
    Humus, if possible, I'd like to take a look at your demo's source code, maybe I can convert it to SM3.0, then we can see which method is better. :lol:
     
  20. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    3,244
    Likes Received:
    3,408
    All? :D Stop that, it's not even funny anymore. Everybody in the world knows how you love ATI by now...
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.