Futuremarks technical response

Discussion in 'Architecture and Products' started by Harlequin, Feb 15, 2003.

  1. just me

    Newcomer

    Joined:
    Feb 15, 2003
    Messages:
    135
    Likes Received:
    0
  2. Ichneumon

    Regular

    Joined:
    Feb 3, 2002
    Messages:
    414
    Likes Received:
    1
    It's all good. :)
     
  3. Dave H

    Regular

    Joined:
    Jan 21, 2003
    Messages:
    564
    Likes Received:
    0
    Actually...no. :) But it's easy to come to that conclusion if you don't understand the difference between PS 1.4 and PS 1.1.

    What is the difference between them? It's not that PS 1.4 is "faster" or "more powerful" in the sense of getting more done per instruction and therefore requiring fewer cycles in the pixel shader to achieve the same effect. Rather it's that PS 1.4 programs can be significantly longer, so that a certain effect which takes 2 (or sometimes 3) PS 1.1 programs can all be done in one PS 1.4 program.

    Now, this doesn't reduce the workload for the pixel shader: the 1 PS 1.4 program that does the work of 2 PS 1.1 programs also takes (roughly) twice as long as each PS 1.1 program. Instead, it reduces the workload on the geometry engine (including vertex shaders), and reduces bandwidth utilization.

To see why, take a look at what happens when you perform the effect using 2 PS 1.1 programs instead; let's call them program A and program B. When it comes time for the GPU to render a poly to which the effect is applied, it will fetch the vertex coordinates, transform them, run any vertex shader programs to adjust those vertices, light them, and then, for each pixel in the interior of the polygon, run program A and write the result to the framebuffer. Then it will go on rendering all the other polys in the scene until it is done. Then it will start a second pass, and for any polys that are not finished rendering--like this one, which still needs program B applied--it will have to repeat the process: fetch the vertices, run vertex shaders, and render again, this time running program B on the results from program A, and finally writing these final values out to the framebuffer.

    With PS 1.4, you can do the effect with a single program. So you save the task of reading in the geometry again, running any vertex programs (including T&L) again, reading the temp values from the framebuffer and writing the new values to the framebuffer. Same amount of work in the pixel shaders, much less work in the rest of the GPU.
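    The accounting above can be sketched as a toy cost model. All the numbers here (vertex/pixel counts, per-vertex and per-pixel cycle costs) are made-up illustrations, not real GPU figures; the only point is that the single-pass path pays the pixel-shading cost once in a longer program while processing the geometry only once:

```python
# Toy cost model (illustrative numbers only) comparing a two-pass PS 1.1
# effect against the equivalent one-pass PS 1.4 effect.

def pass_cost(num_verts, num_pixels, vs_cycles_per_vert, ps_cycles_per_pixel):
    """Cost of one rendering pass: full vertex work plus pixel work."""
    return num_verts * vs_cycles_per_vert + num_pixels * ps_cycles_per_pixel

# Hypothetical workload: 100k vertices covering 1M pixels.
VERTS, PIXELS = 100_000, 1_000_000
VS_CYCLES = 20          # assumed vertex shader cycles per vertex
PS_CYCLES_SHORT = 4     # assumed cycles for each short PS 1.1 program

# PS 1.1 path: two passes, each re-running T&L / vertex shading on the geometry.
ps11 = 2 * pass_cost(VERTS, PIXELS, VS_CYCLES, PS_CYCLES_SHORT)

# PS 1.4 path: one pass; the single long program costs roughly as much per
# pixel as the two short ones combined, but geometry is processed only once.
ps14 = pass_cost(VERTS, PIXELS, VS_CYCLES, 2 * PS_CYCLES_SHORT)

print(ps11 - ps14)  # savings = exactly one extra round of vertex work
```

    The pixel-shader terms cancel; what remains is the duplicated geometry work (plus, in reality, the framebuffer read/write traffic between passes, which this sketch ignores).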

    So in some sense, Nvidia is right to complain about vertex shaders in that GF4's inability to use the one-pass PS 1.4 rendering path does indeed increase its vertex shading workload. Of course, they're completely wrong and very disingenuous to imply that Futuremark could have used a different PS level instead: it's impossible to implement bump-mapped per-pixel specular and diffuse lighting in a single pass in PS 1.1, 1.2 or 1.3. About the best you can do is what Carmack has done in the Doom3 engine: 1 pass in PS 1.4, 2 passes in PS 1.1, and 5 passes for a DX7 GPU that only has fixed-function pixel pipelines. The rendering style used in GT2 and GT3 is a bit more complex: it takes 1 pass in PS 1.4, 3 in PS 1.1, and cannot be done at all on a DX7 card (presumably; or perhaps FM just didn't bother coding a fallback because the performance would be so absurdly bad).

    So is Nvidia right when they assert "This approach creates such a serious bottleneck in the vertex portion of the graphics pipeline that the remainder of the graphics engine (texturing, pixel programs, raster operations, etc.) never gets an opportunity to stretch its legs"? No, absolutely not.

    As Futuremark suggests, this fact is easily seen by looking at the scaling factors on the various cards as the resolution is increased. Luckily the Tech Report review has the data we need. Now, if the only bottleneck on rendering this scene was pixel throughput (in this case, the pixel shaders), then you would expect the scores to scale linearly with the number of pixels onscreen; i.e., you would expect the score @1280*1024 to be exactly .6x the score @1024*768, and the score @1600*1200 to be exactly .4096x the score @1024*768. If, on the other hand, the only bottleneck, even at 1600*1200, was the vertex shader workload, then lowering the resolution wouldn't change the fps one bit. Similarly, if the vertex shaders are the bottleneck at 1024*768, increasing the resolution won't lower the fps until pixel shading becomes enough of a burden to shift the bottleneck away from vertex shading, and indeed the drop in performance at higher resolutions will be very slight.

So we can use those results from TR to tell us how far these GPUs deviate from a perfect "100% pixel throughput bottleneck" on GT2. Each percentage represents the amount by which the actual score is faster than the theoretical score assuming linear scaling with resolution, using the 1024*768 scores as the base:

    9700 Pro:
    1280*1024 - 13.46%
    1600*1200 - 23.31%

    9500 Pro:
    1280*1024 - 9.55%
    1600*1200 - 14.63%

    GF4 Ti4600:
    1280*1024 - 13.76%
    1600*1200 - 22.07%

The GF4 is just as pixel-limited as the 9700 Pro! It is slightly less so than the 9500 Pro, but nothing to complain about. And let me remind you that the entire effect we're measuring is going to be much more significant than the portion of it due only to the extra vertex skinning. Moreover, if you look at the graphs in any reviews here at B3D--in which the x-axis represents pixel throughput rather than fps, in order to facilitate exactly this sort of comparison--you'll notice that GT2 is much more pixel-limited than many 3d games on the market (which should be expected, because a game is more likely to be CPU-limited than a synthetic 3d benchmark, 3DMark01 notwithstanding).
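    The scaling arithmetic above can be written out explicitly. The resolutions and the 0.6 / 0.4096 factors come from the text; the fps figures in the last line are made up purely to show how the deviation percentages would be computed from a review's raw scores:

```python
# Sketch of the "deviation from perfect pixel-throughput scaling" arithmetic.

def expected_scale(base_res, res):
    """If rendering were purely pixel-throughput-limited, fps would scale
    inversely with the onscreen pixel count."""
    return (base_res[0] * base_res[1]) / (res[0] * res[1])

def deviation(base_fps, fps, base_res, res):
    """Percent by which the actual score beats the linear-scaling prediction."""
    predicted = base_fps * expected_scale(base_res, res)
    return 100.0 * (fps / predicted - 1.0)

base = (1024, 768)
print(expected_scale(base, (1280, 1024)))   # 0.6
print(expected_scale(base, (1600, 1200)))   # 0.4096

# Hypothetical card scoring 100 fps at 1024x768 and 68 fps at 1280x1024:
print(round(deviation(100.0, 68.0, base, (1280, 1024)), 2))  # 13.33
```

    A deviation of 0% would mean a pure pixel-throughput bottleneck; larger positive deviations mean the card is increasingly limited by something resolution-independent, such as vertex shading.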

    Another demonstration of the same result comes from this comparison of a 9700 Pro with and without PS 1.4 support disabled in the drivers. The GT2 result increases by 22.5% when PS 1.4 is enabled; again, only a portion of that is due to any drop in vertex shader workload (probably more is due to the drop in required bandwidth), and only a portion of that portion is due to vertex skinning (as opposed to T&L).
     
  4. just me

    Newcomer

    Joined:
    Feb 15, 2003
    Messages:
    135
    Likes Received:
    0
    Dave H,

    Thank you so very much for explaining that. I even understood 90% in the 1st read thru. :D I'll be reading it many more times too.

    Hmmmm. <Trying to put all that has happened into perspective again>

    Thanks again Dave H. & I'll definitely keep an eye open for your posts. 8) 8)

    just me
     
  5. Barnes

    Newcomer

    Joined:
    Jun 29, 2002
    Messages:
    4
    Likes Received:
    0
    Dave H,

    To those of us who know very little about 3d technology, your post was very illuminating. Some of us lurkers appreciate clear explanations…
     
  6. Katsa

    Newcomer

    Joined:
    Feb 12, 2003
    Messages:
    11
    Likes Received:
    0
    Someone more familiar with GameCube can probably verify this, but isn't GameCube also close to "having PS1.4 support"?

    If true, wouldn't that imply that many GameCube developers would have PS 1.4 support in the PC versions of their games? (Though I don't know how many games come across that port line; it may not be many.)

    As someone brought the 1.1/1.3 & XBOX argument up in this fight...
     
  7. Evildeus

    Veteran

    Joined:
    May 24, 2002
    Messages:
    2,657
    Likes Received:
    2
    What did you say?

    [H]
     
  8. Joe DeFuria

    Legend

    Joined:
    Feb 6, 2002
    Messages:
    5,994
    Likes Received:
    71
    He was referring to my statement about hoping that Tom's Hardware and [H] would do an article on the responses from FutureMark. I said:

    I don't see any indication at all that [H] is going to consider FutureMark's (or ATI's) responses, and use them to test the validity of its current opinion. I'm not saying they won't, but there's no indication that they will. And that's what we want to see.

    Between FutureMark, and ATI, they address pretty much every issue that [H] and nVidia raised.

    Is it to [H]'s satisfaction? Why or why not?
     
  9. Evildeus

    Veteran

    Joined:
    May 24, 2002
    Messages:
    2,657
    Likes Received:
    2
     
  10. Zeross

    Regular

    Joined:
    Jun 3, 2002
    Messages:
    289
    Likes Received:
    26
    Location:
    France
    No, Flipper (GameCube's GPU) has something known as TEV, which is a sort of programmable fragment pipeline (allowing dependent texture reads and ALU ops on fragments) but has nothing to do with DirectX shaders.
     
  11. tb

    tb
    Newcomer

    Joined:
    Feb 7, 2002
    Messages:
    241
    Likes Received:
    0
    Location:
    Germany / Thuringia
    No.

    All the game tests are heavy vertex shader tests. Test 1 is limited only by the vertex shader. Tests 2, 3 and 4 are VS limited at low resolutions, but at higher resolutions they are limited by fillrate. Pixel shader performance has very little impact on these tests (on the Radeon 9700 Pro).

    http://www.tommti-systems.com/main-Dateien/reviews/3dmark03/3dmark03.html

    Thomas
     
  12. Ichneumon

    Regular

    Joined:
    Feb 3, 2002
    Messages:
    414
    Likes Received:
    1
    I'm not sure that's what your tests show, but then again, I have a hard time understanding quite a bit on that page (not only because I don't understand German that well).
     
  13. THe_KELRaTH

    Regular

    Joined:
    Dec 9, 2002
    Messages:
    471
    Likes Received:
    0
    Location:
    Surrey Heath UK
    Heh, me neither but I'll take 2 of the 31355 ones!!! :wink:
     
  14. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    "Only"? So the test results don't change as resolution increases? That is counter to the results I've seen. Of course if you reduce vertex shader performance enough, changing resolution won't have an effect, but how is that saying anything new or useful for the comparison?

    Really? The data doesn't look like that to me (though, perhaps I missed something in the translation?). If you lower the resolution enough, of course they won't be as fillrate or pixel shader limited, and vertex shading limitations (which haven't changed) have a greater impact, but the very idea that changing resolution changes performance drastically points out that vertex shading is not the most significant limitation of the workload as that situation is changed.
    When Ichy is talking about moving the vertex skinning to the CPU as per nvidia's complaint, he is, AFAICS, rightly pointing out that this reduces the workload on the vertex shading performance, so I assume your "No." isn't to that part of his response?

    BTW, I may have misread, but some of your reasoning based on low resolution testing looks like you are assuming changing resolution doesn't have a corresponding impact on pixel shading workload...?

    Whatever you mean, some test data for a 9500 (non pro) and a 9000 Pro for comparison might yield some information on relative shader performance gains from enhanced functionality in those tests (maybe with Hyper Z features turned off).
     
  15. Evildeus

    Veteran

    Joined:
    May 24, 2002
    Messages:
    2,657
    Likes Received:
    2
  16. tb

    tb
    Newcomer

    Joined:
    Feb 7, 2002
    Messages:
    241
    Likes Received:
    0
    Location:
    Germany / Thuringia
    My "No" was to this part of the message: "Game Test 2 and 3 don't change much between CPU vs GPU skinning because primarily they are Pixel Shader limited"

    I think they are more vertex shader and fillrate limited than pixel shader calculation speed limited. The limitation moves from the vertex shader to fillrate (with a little bit of pixel shader) as you increase the resolution. Test 1 is vertex shader limited most of the time, but fillrate comes into play at some very high resolutions. Tests 2, 3 and 4 are not that heavily limited by the vertex shader; fillrate is the main limitation in these tests, and the pixel shader has very little impact.

    Sorry, don't have a radeon 9500 / 9500 pro :(

    Thomas
     
  17. Bjorn

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,775
    Likes Received:
    1
    Location:
    Luleå, Sweden
    The whole thing:

    I must say that I don't really buy this argument, not fully at least.
    I mean, ATI, like other IHVs, must have access to most developers' "coming tech" engines: Doom3, the new Unreal tech and stuff like that. Surely they get reports back from them and rely more on those kinds of things than on 3DMark.
     
  18. Joe DeFuria

    Legend

    Joined:
    Feb 6, 2002
    Messages:
    5,994
    Likes Received:
    71
    LIKE I SAID,

    They have not made any mention of whether they actually plan to comment on ATI's / FutureMark's response.

    Again, the point is, [H] made several comments (parroting nVidia's complaints) about 3D Mark. FutureMark and ATI basically ADDRESSED all of those complaints in their rebuttals.

    So the question is...is [h] satisfied with the explanations? And more importantly, if not....WHY NOT.

    Right...and no indication about why they aren't satisfied with that response...assuming they aren't.

    Yeah, here's a quote from [H] (my emphasis added):

    Well, excuse me, but DUH. :roll: Is that supposed to be some sort of revelation by [H]? Everyone has always agreed that by using ANY single benchmark or tool, you can't get the whole story. This goes for 3DMark, Quake, Doom, Serious Sam, Code Creatures, et al.

    And again, this is no admission from ATI or Futuremark that the score is useless, which is what [H] is claiming.
     
  19. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    "ATI on 3DMark03 : ATI comes out to be the first graphics card company with an official statement on the benchmark that was released this week."

    So, has HardOCP already admitted publicly that they received commentary from nvidia? If so, the above can maybe be taken in context. If not, the above is a pretty gross distortion and abuse of the term "official" to give the impression that nvidia hasn't had a chance to "tell their side" yet.
    That is assuming, to give them the benefit of the doubt, your definition of "official <vendor> response" excludes the nvidia whitepaper.

    Personally, I don't recall having seen any mention that the reasoning presented in the 3dmark03 article reflected an outside source, and it looked to me like it was passed off as HardOCP's independently achieved conclusions. In that light, the above looks like a pretty strong indication of a deep bias, where "HardOCP = nvidia", and "Outside opinion that bears the burden of proof = ATI". The first equation is where I see a very large problem. :-?
     
  20. Brent

    Regular

    Joined:
    Apr 11, 2002
    Messages:
    584
    Likes Received:
    4
    Location:
    Irving, TX
    well, we don't = nvidia

    nvidia made some good points, so did futuremark and ati

    i agree with points from all of them

    i think its important to take in and evaluate what everyone involved has said and draw input from that...
     