Performance hit for 64bit and 128bit rendering?

Discussion in 'General 3D Technology' started by BRiT, Sep 26, 2002.

  1. Hyp-X

    Hyp-X Irregular
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,170
    Likes Received:
    5
    No, it doesn't.
    But it does have 32bit and 16bit float support, with the clear hint that, while texture address calculations require 32bit, 16bit is sufficient for color.
It's also clear that 16bit is faster on the nv30, but by how much remains to be seen.

    I'm certain that they'll use the 16bit path for executing pre2.0 pixel shaders, for performance.
     
  2. Hyp-X

    Hyp-X Irregular
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,170
    Likes Received:
    5
No, it's not a typo.

They define a new type of rendertarget: GL_TEXTURE_RECTANGLE_NV
It's the only one that's supported.
And it is not compatible with GL_TEXTURE_2D.

    IIRC, texture rectangle was defined on GF3 for its (rather crippled) non-pow2 texture support.
Microsoft crippled the DX8 definition by not allowing mip-mapping, even though there are other cards that support it.
    In DX8.1 they further specified that the feature doesn't work with DXTC after discovering that the GF3 doesn't support it...

    IMHO cubemap support would be the most important thing.
Ok, it could be done with texture address calculation in the PS, but I doubt it would have comparable speed to a hardware implementation.
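For a sense of what "texture address calculation in the PS" means here, below is a minimal Python sketch of the standard major-axis cube-map face selection that the hardware normally does for free (function name and 0..5 face numbering are my own illustrative choices, not from the post):

```python
def cube_lookup_coords(rx, ry, rz):
    """Map a 3D direction to (face, s, t) the way cube-map hardware does.

    Faces are numbered 0..5 = +X, -X, +Y, -Y, +Z, -Z.  This is the kind
    of address math a pixel shader has to reproduce by hand when the
    chip can't sample (or render to) a cube map directly.
    """
    ax, ay, az = abs(rx), abs(ry), abs(rz)
    if ax >= ay and ax >= az:                       # X is the major axis
        face, ma = (0, rx) if rx > 0 else (1, rx)
        sc, tc = (-rz, -ry) if rx > 0 else (rz, -ry)
    elif ay >= az:                                  # Y is the major axis
        face, ma = (2, ry) if ry > 0 else (3, ry)
        sc, tc = (rx, rz) if ry > 0 else (rx, -rz)
    else:                                           # Z is the major axis
        face, ma = (4, rz) if rz > 0 else (5, rz)
        sc, tc = (rx, -ry) if rz > 0 else (-rx, -ry)
    # Project onto the face and remap from [-1, 1] to [0, 1].
    s = 0.5 * (sc / abs(ma) + 1.0)
    t = 0.5 * (tc / abs(ma) + 1.0)
    return face, s, t
```

Even this nearest-face version costs several compares and a divide per lookup, which is why a software fallback is unlikely to match a hardwired unit.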
     
  3. Basic

    Regular

    Joined:
    Feb 8, 2002
    Messages:
    846
    Likes Received:
    13
    Location:
    Linköping, Sweden
    While GL_TEXTURE_RECTANGLE_NV is a two-dimensional texture, it's still considered a different texture target than the usual 2D textures.

    Edit:
    Hyp-X beat me to it.
     
  4. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    When he said "No... 2d...texture target" I thought he was saying none at all.

    As for 1D/3D, you could always render a 2D texture and use pixel shaders to treat it as 1D or 3D.
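The 3D-via-2D workaround DemoCoder describes is usually done by tiling the 3D texture's depth slices into one big 2D atlas and computing the 2D address in the shader. A hedged sketch of that coordinate remap (layout convention and names are mine, purely illustrative; a real shader would also blend two slices to fake trilinear filtering):

```python
def atlas_coords(u, v, w, slices, tiles_per_row):
    """Convert a 3D texture coordinate (u, v, w in [0, 1)) into the 2D
    coordinate of the matching slice inside a tiled 2D atlas.

    Assumes the depth slices are laid out left-to-right, top-to-bottom,
    'tiles_per_row' per row.  Nearest-slice only, no inter-slice blend.
    """
    slice_idx = min(int(w * slices), slices - 1)      # pick nearest slice
    tile_x = slice_idx % tiles_per_row                # tile column
    tile_y = slice_idx // tiles_per_row               # tile row
    rows = (slices + tiles_per_row - 1) // tiles_per_row
    s = (tile_x + u) / tiles_per_row                  # remap into the tile
    t = (tile_y + v) / rows
    return s, t
```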
     
  5. fresh

    Newcomer

    Joined:
    Mar 5, 2002
    Messages:
    141
    Likes Received:
    0
    That's gonna be such a pain in the ass. Especially the lack of cubemap support. Thank God the 9700 supports it.
     
  6. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    Going from R1 -> R2 or R3 -> R2 is trivial, not really a pain. I don't see why people are fretting over this. Besides pre-computed lookups, how many games are going to render-to-texture into a floating point cube map? Just doing that once would kill your fillrate and bandwidth.


    It's not really a good thing that the R300 supports it if the NV30 doesn't, since if people can code around it, they will, rather than supporting an obscure R300 feature directly. Developers are going to support the lowest common DX9 denominator.

    You've got to come up with a really compelling use case before you can start worrying about it.
     
  7. Grall

    Grall Invisible Member
    Legend

    Joined:
    Apr 14, 2002
    Messages:
    10,801
    Likes Received:
    2,176
    Location:
    La-la land
    This is so weird... Suddenly it's a GOOD thing that NV30 lacks a feature. :eek: You'd expect it would be better to HAVE a feature than NOT have it, no matter how unlikely you think it is people will actually use it.

    I'm pretty sure people wouldn't mount such a completely compelling case if the roles were reversed. Not accusing Democoder of Nvidia bias, but it strikes me as a fairly twisted line of reasoning...

    *G*
     
  8. Nagorak

    Regular

    Joined:
    Jun 20, 2002
    Messages:
    854
    Likes Received:
    0
    Hmmm...last time I checked the R9700 was first to market. Although any feature may not receive widespread support unless it works across all DX9 hardware, I think you worded the above in reverse.

    It's not a good thing that NV30 doesn't support this feature (rather than it being bad that ATI supported it??? :eek:)

    Anyway it probably won't make too much difference.
     
  9. Kristof

    Regular Alpha

    Joined:
    Jan 30, 2002
    Messages:
    733
    Likes Received:
    1
    Location:
    Abbots Langley
The API does make a distinction between texture addressing and arithmetic operations (don't call them color ops, since you might not be working with colors). The API contains pure maths instructions like add, mul, dp3, pow, etc... these are arithmetic and have no relation to textures. You have specific texture sampling instructions that say: go and take this texture coordinate, this LOD and this texture, and deliver me the filtered result in register X. Such an instruction is a texture addressing instruction. I assume what R300 has is an Arithmetic Unit and a Texture Sampling Unit; both work in parallel, and as such you can do a pure maths instruction and a pure texture sampling command in the same clock. So you can issue a fetch-texture-sample command and a maths command at the same time. Examples of texture addressing instructions would be tex, texld, texbem, texcrd, etc...

The issue with NVIDIA's 16 and 32 bit modes is that 16bit might not be enough and 32bit might be too much. Question: can we compare benchmarks run at 16 or 32 bit mode with the results of the R300, which runs everything at 24 bit? If we compare 16 with 24 then NV lacks accuracy (but probably has plenty of speed); if we compare 32 with 24 then ATI lacks accuracy and NV probably lacks speed. I can see another 16 versus 32 war and both parties will partially be right... it will be apples versus oranges from the start. Those who think 16 bit or even 24 bit is good enough will probably be disappointed... it's just so easy to blow up accuracy :(

    K-
     
  10. Nagorak

    Regular

    Joined:
    Jun 20, 2002
    Messages:
    854
    Likes Received:
    0
     
  11. Kristof

    Regular Alpha

    Joined:
    Jan 30, 2002
    Messages:
    733
    Likes Received:
    1
    Location:
    Abbots Langley
The difference will be 8 bits both ways. Say B3D benchmarks NV30's 16bit mode against the R9700 using 24 bits and publishes the results. ATI will be unhappy, pointing out that their results have 8 bits more accuracy and that the comparison is invalid. If B3D benchmarks NV30's 32bit mode against the R9700 using 24 bits and publishes the results, then NVIDIA will be unhappy, calling the results unfair since they have 8 bits more accuracy... Do you see the huge problem that's going to appear? Actually this will only be an issue if the R9700 is faster than NV30 in 32 bit mode but slower when NV30 is in 16 bit mode, so ATI beats NV30 in 32 bit mode but loses in 16 bit mode. If the R9700 matches NV30 in 16 bit then ATI has won this battle; if the R9700 matches NV30 in 32 bit mode then NVIDIA has won, since 16 bit is expected to be "faster". The problem is if they are roughly the same speed...

Remember the old days when it was 22bit equivalent versus 24 bit? :) I can see the discussion: "I can't see the difference between 24 bit and 16 bit floats in game X...".

    K-
     
  12. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    Yep, it's coming. Put on your flame suits.
     
  13. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    Kristof,

    Thanks for the info. Since this is the case, I would be extremely surprised if you couldn't co-issue arithmetic and texture ops on NV30.

    A texture unit should probably be able to support a fairly large number of "in-flight" loads to hide latency, so not being able to run arithmetic ops while a texture op is executing seems like it would lead to a serious performance handicap...

    Regards,
    Serge
     
  14. LeStoffer

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,262
    Likes Received:
    22
    Location:
    Land of the 25% VAT
    Okay then, I need some help here:

If the R9700 is built to do one arithmetic op per cycle, I would assume that this op would be done in one cycle regardless of whether we're rendering in 16 or 24 bit, if textures etc. are the same and we only need to write the final pixel to the framebuffer. Right?

    So unless NV30 can do two arithmetic ops per cycle in 16 bit rendering, I don't understand the difference in performance.

Likewise: if NV30 can do one arithmetic op per cycle in 32 bit (with the R9700 doing the same in 24 bit) the performance should be the same - everything else being equal of course.

Sorry for being confused, but I don't understand how a spec of one arithmetic (color) op per cycle would change internally in the rendering pixel pipeline just because you shift rendering depth (but keep texture/framebuffer bandwidth requirements the same).
     
  15. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    It's not rendering depth you're shifting, it's pipeline precision.

    e.g.

    float x = 1.0f + 2.0f;

    vs

    double x = 1.0 + 2.0;

    Like a CPU, the NV30 can supposedly run the lower precision ops faster.
     
  16. fresh

    Newcomer

    Joined:
    Mar 5, 2002
    Messages:
    141
    Likes Received:
    0
    Sure I'm not going to render into an fp cube map, but I'd like to be able to use a precomputed HDR cube map without jumping through hoops. It's annoying, that's all.

Just like how the PS2 doesn't have backface culling. Writing the code for it is easy, but it's a hell of a lot more convenient to have it in hardware.
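The "easy code" being referred to is essentially a signed-area test per triangle. A minimal sketch in Python (names and the counter-clockwise front-face convention are illustrative assumptions):

```python
def is_backfacing(v0, v1, v2):
    """Screen-space backface test via the sign of the triangle's area.

    v0..v2 are (x, y) vertices after projection.  With a counter-
    clockwise front-face convention, a negative signed area means the
    triangle faces away and can be skipped.  This is the small routine
    an engine must run itself when the hardware won't cull for it.
    """
    (x0, y0), (x1, y1), (x2, y2) = v0, v1, v2
    signed_area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    return signed_area < 0.0

# A CCW triangle is front-facing; swapping two vertices flips the result.
```

Easy indeed, but doing it on the CPU (or VU) for every triangle is exactly the kind of fixed-function work that is nicer to get for free.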
     
  17. multigl2

    Newcomer

    Joined:
    May 23, 2002
    Messages:
    64
    Likes Received:
    0
    not if DX9 games are as fast coming as DX8 games have been :lol: :lol: :lol:
     
  18. Hyp-X

    Hyp-X Irregular
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,170
    Likes Received:
    5
Well, don't expect DX9 games until cheap (<=$100) DX9 cards are available from all major manufacturers...
I leave it up to you to guess when that will be.
But it won't be next year.

    On the other hand I'm pretty sure nv30 will use the 16bit path to run <=DX8.1 programs (or <PS2.0 shaders to be more specific).

I think the only game where it'll be an advantage is Doom3.
Other games don't run shader programs long enough to make a difference.
     
  19. demalion

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,024
    Likes Received:
    1
    Location:
    CT
I thought it was pretty clear that by this time next year there would be sub-$100 DX9 parts...?

nVidia has made claims hinting at this, and ATi has stated it pretty strongly with their aggressive roadmap. I'm even pretty sure that exact aim has been put forth by ATi as a goal to accomplish... I mean, a 9500 non-pro is likely to be selling for < $180 within the year, isn't it?
     
  20. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,511
    Likes Received:
    24,411
    Just wanted to revisit this topic and make note that in under 3 months, there are indeed 9500 non-pro cards selling for under $160 with 9500 pro cards for under $180. Hats off to ATI for this achievement!
     