5200 vertex shader(s)?

Discussion in 'Architecture and Products' started by horvendile, Mar 11, 2003.

  1. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Um, I think it's pretty clear by now that all of the NV3x chips have the exact same core programming-side functionality.

    The only question remaining is "fringe" support. That is, do they support the same number of instructions as the NV30? Do they have support for the high-precision log/exp/sin/cos functions? But the NV31 and NV34 certainly are fully-DX9.
     
  2. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    Only if you consider the NV30 fully DX9. Where is MRT support?
     
  3. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    No, this shows us that if there is a CPU/system limited portion of the vertex shading, it is not the bottleneck on the system in question (Pentium 4 2.8 GHz system), for the benchmark in question. Which, considering it is the 3dmark vertex shading test (presumably a complex workload), is still a good thing, assuming nothing funky is going on in the drivers that is specific to that benchmark program. Wouldn't it be nice not to have to wonder with the other things that have gone on, including nvidia's request for a delay of benchmark results? :-?

    BTW, the drivers used were 42.72, which AFAIK is the latest "benchmark driver" release. I'm wondering how other driver releases would compare, and how other VS benchmarks would compare.

    Note that if you replace "all the information we need" with "a very good indication", I don't disagree at this point.

    EDIT: they weren't VS 2.0 tests
     
  4. Dave H

    Regular

    Joined:
    Jan 21, 2003
    Messages:
    564
    Likes Received:
    0
    I dunno about this. The results scale exactly with clock speed (to the precision given by the benchmark, i.e. .1 fps). It's pretty darn clear.

    Correct, VS 1.1. My bad.
     
  5. overclocked

    Veteran

    Joined:
    Oct 25, 2002
    Messages:
    1,317
    Likes Received:
    6
    Location:
    Sweden
    Well, if it's official from Nvidia it must be true. :wink:

    From the benches I've seen, the NV34 lags so far behind the NV31 in shader tests that it could well be a 2x2 design when not using shaders that can only work as 2x1 with shaders.
    That would still be an "effective" 4 shader pipes.
     
  6. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    The packed 128-bit framebuffer offers similar functionality.
     
  7. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    Well, your point of scaling exactly and 0.1 being precise enough raises another issue.

    Let me illustrate with a counter example that is just as (in)valid, depending on your assumptions. I'll present in a template form since with some of the blanks filled in with enough system variations, it should answer our question conclusively (the key is that AFAIK we don't have that data yet, hence all the "?"). Of course, maybe I just missed the data somewhere, but it isn't in the page you linked to.

    Systems:

    TH: Athlon XP 2700+, nforce2, 512MB PC333
    HF: Pentium 4 2.8 GHz, ASUSTeK P4G8X Deluxe, 512MB PC3200

    Drivers:

    TH: "6307", 42.72
    HF: Cat 3.1, 42.72

    3dmark 2001SE VS results
    Code:
        9(2/0)00  9000P    5200U    5600U
    
    TH:  81.5     82.1     58.4     73.0
    HF:   ?        ?        ?        ?
    

    3dmark 03 VS results
    Code:
        9(2/0)00  9000P    5200U    5600U
    
    TH:   3.4      3.7      5.3      6.2
    HF:   3.4      3.7      5.4      6.2
    
    3dmark 03 VS 2.0 results
    Code:
        5200U    5600U
    TH:  7.4     16.4  
    HF:  ?        ?
    
    The 5.3 and 5.4 figures are the ones that address our question, and the difference is just as meaningful (AFAICS) as the lack of deviation from what you expected in what you are proposing is the final word on the issue. Namely, the significance of the numbers needs further context for comparison (and we currently lack the information for that context). Note that it is your insistence that what you link to is "all the info we need" that I am disputing, not the conclusion you are reaching, which seems reasonable barring something unexpected. The problem is we don't (yet) have enough info to rule out something unexpected being a factor in the data you are focusing on.
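    To put a number on that point: the gap between the two systems' 5200U scores is around 2%, which is well above what the 0.1-unit reporting precision alone would explain. A quick back-of-envelope check (the helper function is mine, written in Python purely for illustration):

```python
def rel_diff(a, b):
    """Relative difference between two benchmark scores."""
    return abs(a - b) / max(a, b)

# 5200U on the two systems, from the tables above: 5.3 vs 5.4
gap_pct = rel_diff(5.3, 5.4) * 100
assert 1.8 < gap_pct < 1.9  # roughly a 2% gap between the two systems
```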

    This isn't a big deal one way or the other, we'll get that info eventually, but your comment is of the nature "we don't need to investigate this anymore" which I don't agree with.
     
  8. overclocked

    Veteran

    Joined:
    Oct 25, 2002
    Messages:
    1,317
    Likes Received:
    6
    Location:
    Sweden
    Well, I'm just airing my thoughts as a layman.
    Not going too deep into it now, but I should say that I "mean" the pixel shader pipes' performance/output being half of the NV31's, and that's what was slowing it down, IMO.

    I draw the conclusion that the VS is much the same in NV31 and NV34, but I still want to know more.
    As to vertex output, I still think it's weird that Nvidia went out with 350M vertices/sec; this is what many look at when judging performance from the back of the box.
    Now the Ultra has 250M vertices/sec and the regular 200 million.
    I don't care what the spec tells you; it's about performance and quality.
    But I still want to know, like all the 3D geeks here...
     
  9. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    It's not enough to allow creation of interesting vertex arrays via the pixel shader. x,y,z alone would be 96-bits. See the displacement mapping presentation from GDC using uberbuffers.
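    The bit budget behind that "96 bits" remark is easy to check (my arithmetic, not from the post, sketched in Python for illustration):

```python
FP32 = 32                # bits per full-precision float component
position = 3 * FP32      # x, y, z alone, at full precision
assert position == 96
# a single 128-bit packed framebuffer leaves only 32 bits over
# for everything else (normals, texcoords, ...), which is why
# multiple high-precision render targets matter here
assert 128 - position == 32
```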
     
  10. incandescent

    Newcomer

    Joined:
    Nov 19, 2002
    Messages:
    15
    Likes Received:
    0
    is it possible they removed the integer combiners? :?:
     
  11. rwolf

    rwolf Rock Star
    Regular

    Joined:
    Oct 25, 2002
    Messages:
    968
    Likes Received:
    54
    Location:
    Canada
    Maybe they only support 64-bit color?

    Edit: Scratch that, their website says 128-bit color.
     
  12. DeanoC

    DeanoC Trust me, I'm a renderer person!
    Veteran Subscriber

    Joined:
    Feb 6, 2003
    Messages:
    1,469
    Likes Received:
    185
    Location:
    Viking lands
    Under DX9, NV3x doesn't support ANY high-precision render targets yet, let alone multiple ones! Don't assume that "fully DX9" means anything BUT PS_2_0 or higher.

    We will get D3DFMT_R32F and D3DFMT_R16G16F when the drivers catch up, and MET support should appear in DX9.1. But from what I've been told, no 4-channel >8-bit formats (so you can't even store position easily!).

    For now I'm having to split 16-bit integers into two 8-bit integers per pass and combine them in a pixel shader when I use them. My 1-pass R300 shader (MRT with 4 D3DFMT_A16B16G16R16 surfaces) is looking like 5 passes on NV3x (5 D3DFMT_A8R8G8B8 surfaces, as I only need 1 surface to be effectively D3DFMT_A16B16G16R16).

    What NV3x can do under OpenGL bears no relation to what it does under DX9!
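    The 16-bit split/recombine DeanoC describes can be sketched as follows. This is illustrative Python only (the real thing would live in a DX9 pixel shader, and the helper names here are made up):

```python
def split16(v):
    """Split a 16-bit integer into two 8-bit channels, as you would
    to store it in an A8R8G8B8 render target (hypothetical helper)."""
    return (v >> 8) & 0xFF, v & 0xFF

def combine8(hi, lo):
    """Recombine the two 8-bit channels, as the pixel shader would:
    hi * 256 + lo restores the original 16-bit value."""
    return hi * 256 + lo

hi, lo = split16(0xBEEF)
assert combine8(hi, lo) == 0xBEEF
```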
     
  13. Matt

    Newcomer

    Joined:
    Feb 12, 2002
    Messages:
    129
    Likes Received:
    0
    I talked to an engineer, not a PR person. I'm not here to make you believe, I'm just passing on the answer. You guys asked if it was done in hardware, so I went to the source and asked.
     
  14. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,516
    Likes Received:
    24,424
    But did you explicitly ask if it was done in hardware on the GPU/VPU? Even if the CPU does the work, technically it's still done in hardware. ;)
     
  15. Ostsol

    Veteran

    Joined:
    Nov 19, 2002
    Messages:
    1,765
    Likes Received:
    0
    Location:
    Edmonton, Alberta, Canada
    Sounds like TruForm on the R300. . . :roll:
     
  16. UncleSam DL iXBT

    Newcomer

    Joined:
    Feb 27, 2003
    Messages:
    36
    Likes Received:
    0
    NV31/34 have the same vertex CALCULATION performance at the same clock speed; they're 2.5 times slower than the NV30.

    But the NV34 probably has caches/FIFOs about half the size (or around so) and a different memory controller, so sometimes its VS speed is lower.
     
  17. Matt

    Newcomer

    Joined:
    Feb 12, 2002
    Messages:
    129
    Likes Received:
    0
    Yes, I specifically asked if it was done on the GPU (I used GPU, not VPU), and he said yes.
     
  18. Dave H

    Regular

    Joined:
    Jan 21, 2003
    Messages:
    564
    Likes Received:
    0
    demalion-

    I'm not sure I understand exactly what you're getting at... :oops:

    But if your point is that 5200U and 5600U show large clock-normalized performance differences on some VS tasks and not on others, yes, I'm quite aware of that but that doesn't really concern me because we know there are plenty of other differences between NV31 and NV34 that could explain this.

    What I am sure of--and AFAIK the review at hardware.fr is the only one to address this--is that comparisons of 5200 to 5200U, and 5600 to 5600U (i.e. differently clocked versions of the same chip) demonstrably show that VS performance scales linearly with clock rate i.e. is done completely in hardware.
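    The clock-scaling argument can be put in arithmetic form: if two differently clocked bins of the same chip deliver the same score per MHz, the work is bounded by GPU clock rather than being offloaded to the CPU. A minimal sketch (the function, tolerance, and the clock/score numbers below are mine for illustration, not taken from the review):

```python
def scales_with_clock(fps_a, mhz_a, fps_b, mhz_b, tol=0.02):
    """True if two bins of the same chip have matching clock-normalized
    throughput (fps per MHz), i.e. the work scales with GPU clock."""
    a, b = fps_a / mhz_a, fps_b / mhz_b
    return abs(a - b) / a <= tol

# illustrative numbers: same chip at two clocks, identical fps/MHz
assert scales_with_clock(50.0, 250, 65.0, 325)
# if fps barely moves with clock, something other than the GPU limits it
assert not scales_with_clock(50.0, 250, 52.0, 325)
```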
     
  19. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
    Well, I thought I made it clear that my issue was with the bolded words (not one or the other set alone, but both together).

    You quoted VS 1.1 benchmarks (in fact, my comment that the benchmarks were a good indication and complex workload was based on it being VS 2.0 before I corrected that in an edit), and one that doesn't seem likely to use all of even VS 1.1 functionality. But we aren't talking about a vertex shader 1.1 benchmark accelerator, we are talking about a vertex shader 2.0 (and not just benchmark) accelerator, which to me leaves two issues:

    1) All we have indication of (in what you propose is the complete picture) in the first place, is that on the CPU in question (a very fast one), the CPU workload for the specific (VS 1.1) benchmark in question is not limiting.

    2) We have no idea how the CPU workload changes for implementing other VS functionality as far as I've seen (as I tried to illustrate, among other things, with my charts). This is other VS 1.1 instructions, register counts, macros, VS 2.0....some pretty significant items (in fact, it would be interesting to see this tested for a whole host of cards at the same time, but reviewers seem to have things like lives and stuff that get in the way of them doing some of the testing I'm curious about :roll: :p ).

    It is your insistence that this is a complete picture that continues to puzzle me, that's all. What if the behavior isn't the same with a 1 GHz Athlon or P III? Is it conceivable that the limitations might be different than for a 2.8 GHz P4? And that is not even the bottom end of the range the bargain cards will address.
    An example that comes to mind: watching Quake III scale perfectly with GPU clock speed on a Rage Pro and concluding that the game's graphics speed is determined solely by the GPU, in the absence of any benchmarks involving changes in CPU performance. Or, as in my counter example, observing a change based solely on CPU performance, and ignoring the graphics card and resolution at which this is observed, to conclude it is accelerated completely by the CPU.
     
