forced FP16 on HL2 fixes FX performance problem???

Discussion in '3D Hardware, Software & Output Devices' started by ^eMpTy^, Nov 30, 2004.

Thread Status:
Not open for further replies.
  1. radar1200gs

    Regular

    Joined:
    Nov 30, 2002
    Messages:
    900
    Likes Received:
    0
    If the shaders don't use nrm, then that is even more proof (as if it were required...) that HL2 is coded for one gpu vendor and one gpu vendor only.
     
  2. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,090
    Likes Received:
    694
    Location:
    O Canada!
    The shader might not require the instruction. As for being "coded for one GPU", remember, these are the guys that enabled the Z Reject path by default for NVIDIA but not for ATI, despite larger performance increases in graphics limited situations.
     
  3. Beren_PCE

    Newcomer

    Joined:
    Dec 2, 2004
    Messages:
    8
    Likes Received:
    0
    Location:
    Zagreb, Croatia
    What makes you say that? ;)
     
  4. radar1200gs

    Regular

    Joined:
    Nov 30, 2002
    Messages:
    900
    Likes Received:
    0
    My point is simply this:

    Valve, as far as I can tell, has deliberately set out to write DX9 shaders that favor ATi GPUs over everyone else's.

    This can be done in a variety of ways: the ordering of the instructions, omitting instructions and options, etc.

    On top of that, they haven't provided people with the best DX8 shaders that they could have. Witness the work of a 16-year-old compared to Valve's DX8 shaders.

    The Z issue affects a different part of the graphics pipeline. It's possible to selectively optimize parts of your code, while deliberately ignoring the parts you know will make the most visual difference to end-users.
     
  5. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    18,992
    Likes Received:
    3,533
    Location:
    Winfield, IN USA
    Uhm, could you name another GPU at the time that was even half-way decent at running dx9 shaders? :|
     
  6. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,090
    Likes Received:
    694
    Location:
    O Canada!
    Compiler optimisers should take care of instruction ordering these days.

    It's an automatic speed-up that can increase all performance in graphics-limited situations, especially pixel-shader-limited situations.
     
  7. radar1200gs

    Regular

    Joined:
    Nov 30, 2002
    Messages:
    900
    Likes Received:
    0
    Compiler optimizers can't do a lot if the shader they are handed is stacked against them. I believe compilation of shaders should happen at run-time, as the first thing a game does. This prevents shader stacking from occurring in the first place.
     
  8. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,090
    Likes Received:
    694
    Location:
    O Canada!
    [Good] compiler optimisers can reorder the instructions in the most convenient way according to the architecture's requirements, regardless of when a game hands the shader to them.
     
  9. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Oh, I have pretty much zero detailed knowledge about HL2's shader usage. I'm just assuming that it's mostly shader-limited, and that the results will be very roughly in line with synthetics we've seen elsewhere.
     
  10. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Sure, but there's more to it than this.

    For example, if the shaders don't use nrm, but do use a sequence of instructions that is commonly used instead of nrm, then this sequence could easily be replaced with the instruction nrm by the driver.
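
    As a rough illustration of that kind of driver-side substitution, here is a minimal peephole sketch in Python. It assumes a toy instruction form `(opcode, dst, srcs)` and ignores swizzles, write masks and modifiers, so it is not how any real driver represents shaders. The function names are made up for the example. The pattern it looks for is the usual hand-written normalize: `dp3`, then `rsq`, then `mul`.

```python
def reads_later(program, start, reg):
    """Conservatively report whether `reg` is read at or after index `start`."""
    return any(reg in srcs for _, _, srcs in program[start:])


def fold_normalize(program):
    """Replace dp3/rsq/mul normalize sequences with a single nrm.

    Toy model only: registers are plain names, no swizzles or write masks.
    """
    out = []
    i = 0
    while i < len(program):
        window = program[i:i + 3]
        if len(window) == 3:
            (op1, d1, s1), (op2, d2, s2), (op3, d3, s3) = window
            is_normalize = (
                op1 == "dp3" and len(s1) == 2 and s1[0] == s1[1]  # dot(v, v)
                and d1 != s1[0]                                   # v must survive
                and op2 == "rsq" and d2 == d1 and s2 == (d1,)     # 1/sqrt in place
                and op3 == "mul" and sorted(s3) == sorted((s1[0], d1))
                # The scalar temp disappears, so nothing later may read it.
                and (d3 == d1 or not reads_later(program, i + 3, d1))
            )
            if is_normalize:
                out.append(("nrm", d3, (s1[0],)))
                i += 3
                continue
        out.append(program[i])
        i += 1
    return out


shader = [
    ("dp3", "r0", ("r1", "r1")),
    ("rsq", "r0", ("r0",)),
    ("mul", "r2", ("r1", "r0")),
    ("mov", "oC0", ("r2",)),
]
folded = fold_normalize(shader)
for instruction in folded:
    print(instruction)
```

    The check that the scalar temporary is not read afterwards is the part that matters: without it the rewrite could change the shader's result.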

    Additionally, there are performance differences in just normal, complex shaders that we have seen, differences that, as far as I can tell, cannot be attributed to the use of faster partial-precision instructions, but instead must be due to register pressure.

    From what I understand, nVidia successfully reduced the register pressure on the NV40 to the point where it will never fail to execute at least one instruction per clock per pipeline due to register pressure. From what I understand, register pressure, in addition to compiler finesse, determines how many instructions can be executed in a pipeline each clock. So, if this is true, it means that complex shaders (ones that frequently use many different instructions in a short space) will tend to have higher throughput with FP16.
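
    To make "register pressure" a bit more concrete, here is a small Python sketch that counts the peak number of temporaries live at once in a straight-line shader. The `(dst, srcs)` instruction form, the `r` prefix for temporaries and the function name are assumptions for the example. It says nothing about how many registers any particular chip can hold before it slows down.

```python
def peak_live_temps(program, outputs):
    """Return the largest number of temp registers live at any point.

    `program` is a list of (dst, srcs) pairs in execution order and
    `outputs` names the registers that must be live at the end.
    Only names starting with "r" are counted as temporaries.
    """
    live = set(outputs)
    peak = len(live)
    # Walk backwards: a write ends a live range, a read starts one.
    for dst, srcs in reversed(program):
        live.discard(dst)
        live.update(s for s in srcs if s.startswith("r"))
        peak = max(peak, len(live))
    return peak


# Hypothetical four-instruction shader: two fetches feeding two ALU ops.
program = [
    ("r0", ["t0"]),        # fetch using interpolant t0
    ("r1", ["t1"]),        # fetch using interpolant t1
    ("r2", ["r0", "r1"]),  # combine the two fetches
    ("r3", ["r2", "r0"]),  # r0 is still needed here
]
peak = peak_live_temps(program, ["r3"])
print("peak live temporaries:", peak)
```

    An FP16 temporary takes half the storage of an FP32 one, which is why partial precision is usually described as easing this kind of pressure.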
     
  11. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    And if the shader doesn't do this, then it can't.
    And do you know if HL2's shaders fall into this category?
    Aren't people seeing higher performance? You can't reasonably expect every app to be the same.
     
  12. radar1200gs

    Regular

    Joined:
    Nov 30, 2002
    Messages:
    900
    Likes Received:
    0
    The compiler can't add normalisation instructions that don't exist in the source, and it can't enable partial precision without local or global hinting.

    None of these measures cost ATi anything. OpenGL guy (or one of the others) once replied to me in these forums stating that ATi ignores _PP.

    nVidia relies to a great extent on shader code being written efficiently to get the most from the shaders. ATi can afford far more sloppiness by comparison. Coding efficiently does not adversely affect ATi's performance (in fact it helps to improve it even more).

    The only way for nVidia to fix Valve's sloppiness, short of forcing global _PP on all shaders, is to rewrite the shaders themselves to be more efficient. Of course, the instant they attempt that, the entire internet will jump down their necks, accusing them of evil, IQ-degrading optimizations.
     
  13. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    And the app can't add math just for your sake.
    24 bit precision at all times is a good thing.
    What are you talking about? Of course efficient coding makes a difference; if it didn't, we wouldn't use an optimizer.
    If image quality were degraded, you wouldn't be upset?
     
  14. radar1200gs

    Regular

    Joined:
    Nov 30, 2002
    Messages:
    900
    Likes Received:
    0
    Because ATi can execute shaders faster than nVidia, they can afford more sloppiness in a shader before they are slowed down, and can use this sloppiness as a weapon against the competition.

    I never said there would be an IQ reduction. I said the accusation would be that there is an IQ reduction.
     
  15. jvd

    jvd
    Banned

    Joined:
    Feb 13, 2002
    Messages:
    12,724
    Likes Received:
    9
    Location:
    new jersey
    This hack reduces image quality; pictures of it have been posted.

    This should not be included as a driver hack from nVidia or as an option from Valve. The game looks fugly on my 5800 Ultra with this hack, and the upwards of 4 fps gain I'm seeing just isn't worth it.
     
  16. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    18,992
    Likes Received:
    3,533
    Location:
    Winfield, IN USA
    Methinks you're reaching a bit. :lol:
     
  17. radar1200gs

    Regular

    Joined:
    Nov 30, 2002
    Messages:
    900
    Likes Received:
    0
    Of course, forcing total _PP on everything isn't always desirable.

    That can be seen in the HL2 glass and the 3dmark03 sky.

    That's what full precision is there for, and why _PP hints should be in place in the shaders, so that things that require full precision actually get it.
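
    A quick way to see why some values need full precision is to round the same number to the significand widths of the formats under discussion. The Python sketch below assumes the commonly quoted layouts (FP16 as s10e5, FP24 as s16e7, FP32 as s23e8) and models only the rounding of the significand, not exponent range or denormals. The helper name and the sample coordinate are made up for the example.

```python
import math


def quantize(x, significand_bits):
    """Round x to the nearest value with the given significand width.

    `significand_bits` includes the implicit leading bit. Exponent range,
    denormals and infinities are deliberately not modelled.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    scale = 1 << significand_bits
    return math.ldexp(round(m * scale) / scale, e)


# Significand widths (stored bits + 1 implicit bit) for each format.
FORMATS = {
    "FP16 (s10e5)": 11,
    "FP24 (s16e7)": 17,
    "FP32 (s23e8)": 24,
}

coord = 1024.3  # e.g. a coordinate well away from zero
results = {name: quantize(coord, bits) for name, bits in FORMATS.items()}
for name, value in results.items():
    print(f"{name}: {value!r} (error {abs(value - coord):.6f})")
```

    At this magnitude FP16 can only step in whole units, so the fractional part is lost entirely. Small colour values in the 0 to 1 range are much less affected.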
     
  18. radar1200gs

    Regular

    Joined:
    Nov 30, 2002
    Messages:
    900
    Likes Received:
    0
    Really, I don't think many people would call the DX9 shaders found in TR: AOD, 3dmark03 or HL2 particularly well written.
     
  19. jvd

    jvd
    Banned

    Joined:
    Feb 13, 2002
    Messages:
    12,724
    Likes Received:
    9
    Location:
    new jersey
    I'm sure if TR: AOD and 3dmark03 (and to a lesser extent HL2) were tuned directly for ATi hardware, you'd see ATi cards performing even better than they are.

    The R3x0 tech is better in full-precision DX9 than the GeForce FX.

    The R3x0 tech is still faster even when the FX is in PP all the time.

    Crappy code helps no one. It's just that ATi's hardware is such a great piece of hardware, especially compared to the GeForce FX, that it's really hard to make shaders that let the FX line look better than the R3x0 line.
     
  20. radar1200gs

    Regular

    Joined:
    Nov 30, 2002
    Messages:
    900
    Likes Received:
    0
    Your argument is silly.

    It doesn't matter if ATi is faster than nVidia or not. _PP should still be used to help nVidia and others perform as fast as they are capable of.

    It's just like saying that Porsches are faster than family cars, so tyre companies will only produce tyres for Porsches.
     