Official HardOCP HL2 benchmarks

Discussion in '3D Hardware, Software & Output Devices' started by Ruined, Nov 17, 2004.

  1. ballero

    Newcomer

    Joined:
    Sep 29, 2003
    Messages:
    21
    Likes Received:
    0
    "optimize"??
    Where is the mixed-mode?
     
  2. gb25

    Newcomer

    Joined:
    May 27, 2003
    Messages:
    52
    Likes Received:
    2
    I don't know, and I don't really care. Not only do I not even have HL2, I don't have a GeForce card, so I have no reason to give a damn about mixed mode. Although, after seeing this page, more specifically the first chart, I can kind of understand if there is no mixed mode (not saying there is or isn't, because I don't know). Even at DX8, a Ti4600 is able to outpace all NV3x cards except the 59xx. So why bother with mixed mode when the only FX cards capable of playable framerates are the 59xx? Better to just "treat NV3x as DX8 hardware."

    Anyways, I'm just going on what's in that article. I don't know how fast the fx's run the final release of hl2 with current drivers, and I don't really care. Although I'm sure it's probably quite a bit better than what's represented in that dated hardocp article. All I know is that if I did want to play hl2, I couldn't care less whether there was mixed mode or not, because my 9800 is more than fast enough without it. Too bad that it's something that fx owners have to be concerned about though.
     
  3. Pete

    Pete Moderate Nuisance
    Moderator Legend

    Joined:
    Feb 7, 2002
    Messages:
    5,777
    Likes Received:
    1,814
    Well, I'm almost positive Dave has mentioned this in his various reviews, but I believe 3DCenter's recently translated NV40 article covers the current knowledge. IIRC, it also includes comparisons to R3x0/R4x0. This may also prove informative.

    I don't follow. IIRC, Gary McTaggart said in a recent interview that most pixel shaders are partial precision.

    If you're talking about a special NV3x path, Anandtech's HL2 preview noted it. Did Valve drop it as Id did with their NV3x mode, b/c nV transferred their optimizations to their drivers (or simply improved general speed)? I guess so.

    Valve's ATI "Shader Day" comments that they spent (IIRC) 5x as much time optimizing for NV3x, combined with the DX"8.5" mode, show Valve did at least some compromising for the FX series.

    But I don't think mixed mode changes the fact that a four-pipe 5900 acts closer to a four-pipe 9600 than an eight-pipe 9800 in most DX9 games. It doesn't seem to be as much a matter of precision as of shader ALUs.
     
  4. jvd

    jvd
    Banned

    Joined:
    Feb 13, 2002
    Messages:
    12,724
    Likes Received:
    9
    Location:
    new jersey
    What I find interesting is the speed difference between the 9800XT and the 6600GT in the tests trinibwoy posted.

    The 9800XTs are still a lot faster in it. If the NV40 series were indeed faster than the R420 series, then surely it would also be faster than the R3x0 series. Not to mention the fillrate difference in favor of the 6600GT.

    Anyone know what's going on here?
     
  5. hovz

    Regular

    Joined:
    May 10, 2004
    Messages:
    920
    Likes Received:
    0
    could be bandwidth, or vertex limits
     
  6. ChrisRay

    ChrisRay R.I.P. 1983-
    Veteran

    Joined:
    Nov 25, 2002
    Messages:
    2,234
    Likes Received:
    26
    The difference in vertex capability between the 6600GT and 9800XT is pretty small.
     
  7. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,090
    Likes Received:
    694
    Location:
    O Canada!
    R3x0 architectures have a main ALU that contains the DX9 instructions and a secondary ALU that contains PS1.4 modifiers and "some other instructions that we won't say"; they also have a separate texture address processor, so texture instructions/lookups can be interleaved with no performance loss (when running numerous other shader instructions). NV40 has "two ALUs"; however, these do not both contain a full instruction set. It's more of a distribution of instructions between the two, and there is, IIRC, just one instruction that is in both (MAD or MUL, I forget which); the first ALU also deals with the texture address instructions.

    However, R3x0 performance drop offs are much more linear when you start getting more into the capabilities - NV40 does appear to have some similarities with NV3x in its performance characteristics, although not to the same extent.
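
    To put rough numbers on that distinction, here is a toy issue-slot count in Python. It is only a sketch, not a model of either real chip: it assumes one instruction per slot, ignores texture latency and R3x0's secondary ALU, and assumes arithmetic always pairs perfectly across two ALUs. The function names are made up for illustration.

```python
import math

def cycles_separate_texture_unit(alu_ops, tex_ops):
    # Assumption: texture addressing runs on its own unit and overlaps
    # with arithmetic, so the longer of the two streams sets the cost.
    return max(alu_ops, tex_ops)

def cycles_shared_first_alu(alu_ops, tex_ops):
    # Assumption: two issue slots per cycle, but texture ops can only
    # occupy the first slot, so they take that slot away from arithmetic.
    return max(tex_ops, math.ceil((alu_ops + tex_ops) / 2))

for alu_ops, tex_ops in [(8, 0), (8, 4), (8, 8)]:
    print(alu_ops, tex_ops,
          cycles_separate_texture_unit(alu_ops, tex_ops),
          cycles_shared_first_alu(alu_ops, tex_ops))
```

    Under these assumptions the two-ALU design looks best on a pure arithmetic shader, and its lead shrinks as the share of texture instructions grows.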
     
  8. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    A dedicated texture addressing processor sounds like a good idea - why didn't Nvidia mimic ATI's approach for the NV40? Or are both approaches balanced with performance being significantly dependent on which one is coded for?

    There are several people that claim ATI goes for a straightforward DX9 implementation while developers need to follow specific paths to get the most out of Nvidia hardware.
     
  9. dan2097

    Regular

    Joined:
    May 23, 2003
    Messages:
    323
    Likes Received:
    0
    Dave is it incorrect to say that pixel shading benchmarks would tend to show the nv40 in a better light than real game benchmarks due to the second alu not having to act as a texture address processor much in a synthetic pixel shader test?
     
  10. Skinner

    Regular

    Joined:
    Sep 13, 2003
    Messages:
    878
    Likes Received:
    12
    Location:
    Zwijndrecht/Rotterdam, Netherlands and Phobos
    First 5 benchmarks are timedemos from anandtech's HL2 performance review

    http://www.anandtech.com/video/showdoc.aspx?i=2278&p=2

    and the last 2 from HardOCP's.

    http://www.hardocp.com/article.html?art=NjkyLDI=

    Cat. beta 4.12 for the X800XT PE. One note: my XFX 6800U runs at 425/600 MHz by default, and I forgot to use the +r_fastzreject 1 command for the X800. I don't know how much of a difference this would make.
     
  11. CJ

    CJ
    Regular

    Joined:
    Apr 28, 2004
    Messages:
    816
    Likes Received:
    40
    Location:
    MSI Europe HQ
    The lack of AA in his benchmarks might also explain the difference... at least for a small part.
     
  12. hovz

    Regular

    Joined:
    May 10, 2004
    Messages:
    920
    Likes Received:
    0
    Don't think that command makes much of a difference, maybe a few fps increase on average.

    Dave, is there any way you can elaborate on the instructions that ATI won't say?
     
  13. Demirug

    Veteran

    Joined:
    Dec 8, 2002
    Messages:
    1,326
    Likes Received:
    69
    First ALU is MUL. Second is MAD

    Yes, they share the problem that texture latency can have a bad impact on calculation power usage. This is a result of designing a chip with no dependent read limit. Maybe too much 3dfx Rampage influence. There is a better solution for that problem, but you have to break with traditional GPU design.
     
  14. Demirug

    Veteran

    Joined:
    Dec 8, 2002
    Messages:
    1,326
    Likes Received:
    69
    "Mixed Mode" is removed in the final version.

    Tests with the VST show that NV35 can run up to 50% faster with a custom FP16 path than with the default FP32 path.

    Have to run this test with some real game demos again.
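
    For anyone wondering what an FP16 path gives up in exchange for speed, here is a small Python sketch of the precision difference. It assumes the FP16 in question behaves like IEEE 754 half precision (1 sign, 5 exponent, 10 mantissa bits) and says nothing about performance. The helper names are made up for illustration.

```python
import struct

def round_through(fmt, value):
    # Store the value in the given binary format and read it back,
    # which rounds it to that format's precision.
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

def to_fp16(value):
    return round_through("<e", value)  # "e" is IEEE 754 half precision

def to_fp32(value):
    return round_through("<f", value)  # "f" is IEEE 754 single precision

for value in (0.1, 2049.0):
    print(value, to_fp16(value), to_fp32(value))
```

    With only 11 significant bits, FP16 cannot hold every integer above 2048, which is one reason large texture coordinates and long calculations are usually kept at full precision while simple color math can drop to partial precision.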
     
  15. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    18,992
    Likes Received:
    3,532
    Location:
    Winfield, IN USA
    Why is it mad? Because it got picked second? :|
     
  16. Demirug

    Veteran

    Joined:
    Dec 8, 2002
    Messages:
    1,326
    Likes Received:
    69
    The idea behind the Nvidia approach is to have one processor that can be used for both jobs. This makes it less sensitive to instruction balancing.

    DX9 SM2 was primarily written to match R300 shader abilities. Therefore it is no miracle that it fits well.

    Still, the current shader compiler in the Nvidia driver does a good job translating SM2 shaders, though it works better with some hints in the shader code.
     
  17. Demirug

    Veteran

    Joined:
    Dec 8, 2002
    Messages:
    1,326
    Likes Received:
    69
    MAD is Multiply Add. (A*B+C)

    A MAD ALU can do Mul, Add, Dot-products.

    Because the first is only MUL, they need a MAD unit to do the adds and dot products.
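
    As a quick illustration of why one MAD unit covers all three, here is a scalar sketch in Python. Real hardware works on vectors and can do a dot product as a single instruction; the chained loop below is only meant to show the arithmetic, and the function names are made up for illustration.

```python
def mad(a, b, c):
    # Multiply-add: a * b + c
    return a * b + c

def mul(a, b):
    # A multiply is a MAD with zero added.
    return mad(a, b, 0.0)

def add(a, b):
    # An add is a MAD with a multiplier of one.
    return mad(a, 1.0, b)

def dot3(u, v):
    # A dot product is a chain of MADs feeding an accumulator.
    acc = 0.0
    for x, y in zip(u, v):
        acc = mad(x, y, acc)
    return acc

print(mul(3.0, 4.0), add(3.0, 4.0), dot3((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)))
```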
     
  18. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    18,992
    Likes Received:
    3,532
    Location:
    Winfield, IN USA
    Thanks Demirug. I was just going for a cheap laugh, but I ended up actually learning something. :)
     
  19. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    But do both implementations still rely on specific coding methods (at the application level) for optimal performance or is one more generic/balanced? Or is it up to the driver/compiler?

    In light of that, should we expect even more similarity between the IHVs' offerings when ATI is forced to go SM3.0? Or will ATI forever be the shader gurus and leave Nvidia to the driver meddling and application optimizations?
     
  20. Demirug

    Veteran

    Joined:
    Dec 8, 2002
    Messages:
    1,326
    Likes Received:
    69
    If you aim for 100% efficiency you have to write specific code for each chip, because every chip likes or dislikes different things. But in the normal case you try to find one solution for the desired effect that works well on all chips. Using a high-level shader language and a compiler can still give you the possibility of some chip-specific coding. One example is using different code for different kinds of normal map compression (3Dc, DXT, ...).
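
    To make the normal map example concrete, here is a rough Python sketch of what the per-format code might look like. It assumes a two-channel format that stores only X and Y (as 3Dc does) and, as a second assumed convention, a DXT5 texture with X moved into alpha and Y kept in green. Z is rebuilt from the unit-length constraint. The function names and the texel layouts are illustrative, not any engine's actual code.

```python
import math

def unpack_channel(byte_value):
    # Map an 8-bit channel (0..255) to the range [-1, 1].
    return byte_value / 255.0 * 2.0 - 1.0

def rebuild_normal(x, y):
    # Z follows from x^2 + y^2 + z^2 = 1; clamp to guard against
    # compression error pushing the sum slightly above one.
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)

def decode_two_channel(texel):
    # Assumed layout: (x_byte, y_byte)
    x_byte, y_byte = texel
    return rebuild_normal(unpack_channel(x_byte), unpack_channel(y_byte))

def decode_swizzled_dxt5(texel):
    # Assumed layout: (r, g, b, a) with X in alpha and Y in green
    _, g, _, a = texel
    return rebuild_normal(unpack_channel(a), unpack_channel(g))

print(decode_two_channel((128, 128)))
print(decode_swizzled_dxt5((0, 128, 0, 128)))
```

    Both paths end in the same reconstruction step; only the channel fetch differs, which is the kind of small per-chip variation a high-level shader language can hide behind one function.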

    In the case of SM3, the spec was written to a large extent before anybody built hardware for it. This gives us a different starting situation.

    On the other hand, even ATI needs a shader compiler in the driver to get good performance. IMHO this will not change with an SM3 solution from ATI. Next-gen chips (WGF) will depend much more on a good driver.
     