NVIDIA Fermi: Architecture discussion

Discussion in 'Architecture and Products' started by Rys, Sep 30, 2009.

  1. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    I've just finished reading RWT's preview. I previously thought that they had fully coherent L1 caches per SM. Wishful thinking...
     
  2. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,428
    Likes Received:
    425
    Location:
    New York
    Can higher precision cause rendering errors? :huh:
     
  3. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    6,744
    Likes Received:
    469
    What would it really get you anyway? Fine-grained communication should be done with messages; coarse-grained communication can be done through L2 (latency is not an issue for coarse-grained communication, and it has plenty of bandwidth). Complete coherency is a waste of transistors :p
     
  4. babcat

    Regular

    Joined:
    Sep 24, 2006
    Messages:
    656
    Likes Received:
    45
    Has NVIDIA posted any information on GRAPHICS related improvements?
     
  5. digitalwanderer

    digitalwanderer Dangerously Mirthful
    Legend

    Joined:
    Feb 19, 2002
    Messages:
    17,217
    Likes Received:
    1,736
    Location:
    Winfield, IN USA
  6. Arty

    Arty KEPLER
    Veteran

    Joined:
    Jun 16, 2005
    Messages:
    1,906
    Likes Received:
    55
    In layman's terms:

    R600 - 320SP - 720 million - 2.25 MT per SP (Added D3D10)
    RV670 - 320SP - 666 million - 2.08 MT per SP (Added D3D10.1 & lowered memory bus width)
    RV770 - 800SP - 956 million - 1.19 MT per SP
    Cypress - 1600SP - 2.15 billion - 1.34 MT per SP (Added D3D11)

    G80 - 128SP - 686 million - 5.36 MT per SP (Added D3D10)
    G200 - 240SP - 1.4 billion - 5.83 MT per SP (Added CUDA)
    Fermi - 512SP - 3 billion - 5.85 MT per SP (Added more CUDA + D3D11 & lowered memory bus width)

    Since we don't have any performance numbers, purely from this perspective it doesn't look like Fermi is Nvidia's RV770, but their jump still looks more efficient.
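
    The "MT per SP" column above is just transistor count divided by SP count. A minimal sketch of that arithmetic follows, using only the figures quoted in this post; the names `chips` and `mt_per_sp` are made up for illustration.

```python
# Transistor counts (in millions) and SP counts, as quoted in the post above.
chips = {
    "R600": (720, 320),
    "RV670": (666, 320),
    "RV770": (956, 800),
    "Cypress": (2150, 1600),
    "G80": (686, 128),
    "G200": (1400, 240),
    "Fermi": (3000, 512),
}


def mt_per_sp(transistors_millions, sp_count):
    """Millions of transistors per stream processor."""
    return transistors_millions / sp_count


ratios = {name: mt_per_sp(t, sp) for name, (t, sp) in chips.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} {ratio:.3f} MT per SP")
```

    Some of the quoted figures look truncated rather than rounded: 3000 / 512 is 5.859375, which the list gives as 5.85. Note also that the ratio counts every transistor on the die (caches, memory controllers, fixed-function units), not only the shader ALUs.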
     
  7. babcat

    Regular

    Joined:
    Sep 24, 2006
    Messages:
    656
    Likes Received:
    45
    That is really disappointing. I can't wait until they share more information about what features it has that will benefit graphics.
     
  8. INKster

    Veteran

    Joined:
    Apr 30, 2006
    Messages:
    2,110
    Likes Received:
    30
    Location:
    Io, lava pit number 12
    DX11 is an improvement, no? :wink:

    And this is a "graphics" card, it has a DVI port (unlike their Tesla GPGPU counterparts):

    [IMG]

    Since this was never meant to be a product "launch", not even a "paper-launch", talking about that three months before the actual unveiling would sort of kill the "buzz" in the media.
    I'm surprised even an official die photo got in there, since not even AMD treated us to one of the "Cypress"/RV870/HD5870 core.
     
    #68 INKster, Oct 1, 2009
    Last edited by a moderator: Oct 1, 2009
  9. babcat

    Regular

    Joined:
    Sep 24, 2006
    Messages:
    656
    Likes Received:
    45
    DX11 is an improvement, but one that everyone has expected.

    BTW, I'm not suggesting that they reveal frame rates or even images/videos from games. I'm just wondering if the hardware has been modified or any hardware has been added that will benefit graphics. For example, there is the rumor that it does not have a hardware tessellation unit. That is something that could impact graphics.
     
  10. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    At this point, I think fixed-function graphics features are essentially maxed out. Yes, there are marginal improvements to be made to texture filtering or antialiasing, but it's high cost for small gain. As things have become more general, the future of GPU graphics improvements is essentially just upping performance and eliminating pathological use cases that might choke performance.

    They might improve AF or AA slightly, but those just eliminate annoying artifacts from fewer and fewer edge cases, they won't improve lighting.

    We've kind of reached a 'boring' stage in GPU evolution where introduction of new fixed-function blocks may recede. I'm not sure what the long term survival will be for fixed-function tessellation for example (despite Charlie's aggressive shilling for it). Geometry amplification is a wide-open field that encompasses compression and procedural synthesis.

    From a developer standpoint, this card looks sweet, from exceptions, to vtable-dispatch support, to debugging in a real IDE stepping through C++ on the GPU, but it comes at a cost, which is density. NVidia may be following SGI's eventual decline, by losing the consumer/workstation market based on cost and targeting HPC, which is a niche.
     
  11. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    Whether tessellation is done in hardware or software won't affect graphics quality, only performance, and that's if you're bottlenecked by tessellation and if your desired tessellation fits the hardware.
     
  12. babcat

    Regular

    Joined:
    Sep 24, 2006
    Messages:
    656
    Likes Received:
    45
    How could exceptions, vtable dispatch support, debugging in a real IDE, and so forth impact graphics?

    Also, I read somewhere that ray tracing could be improved in this GPU. Do you see anything that could indicate this?
     
  13. INKster

    Veteran

    Joined:
    Apr 30, 2006
    Messages:
    2,110
    Likes Received:
    30
    Location:
    Io, lava pit number 12
    Pure software emulation is certainly not on their minds if it significantly impacts actual 3D performance, but it is conceivable that it's one area where there'll be compromises as to the level of hardware implementation, much like the ones Nvidia made with G80, when Geometry Shader performance (part of the DX10.x spec) took a back seat to the rest.
    This time, with this much programmability on-chip (assuming for a moment the scenario described above), the quality of the OS driver and compiler is paramount, and could change the tide for a particular game pretty dramatically.
     
  14. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    14,848
    Likes Received:
    2,264
    So, to those who understand: is this new chip
    Yay, Meh or Wtf?
     
  15. MarkoIt

    Regular

    Joined:
    Mar 1, 2007
    Messages:
    392
    Likes Received:
    0
    I don't understand, but I think it's :shock: for GPGPU computing... and :sad: for gaming (in terms of performance/mm^2 against Cypress).
    But we will see...
    For sure, it's a nice step forward from GT200's architecture... but where is Nvidia stepping? :D
     
  16. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    This seems to be a new FUD meme, that this card won't run games fast. With the exception of special hardware for ECC and double precision, practically everything else will help games, if you accept the messaging that AMD/ATI was selling since the R600, which was that ALU:TEX ratios are going up and shader power is therefore a big factor in gaming.

    What may not be as important is the stuff for C++, since you can always write non-OO code and avoid indirect dispatch (with a cost). That's extra transistors for development convenience.
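
    A rough sketch of that trade-off follows, in Python rather than GPU C++ and with hypothetical `Circle`/`Rect` shapes that are not from the thread. The same computation is written once with indirect dispatch through objects and once with an explicit type tag and branches.

```python
import math


# Object-oriented version: each call to s.area() is resolved at run time
# through the object's class (the analogue of a vtable lookup in C++).
class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return math.pi * self.r ** 2


class Rect:
    def __init__(self, w, h):
        self.w = w
        self.h = h

    def area(self):
        return self.w * self.h


def total_area_oo(shapes):
    return sum(s.area() for s in shapes)


# Non-OO version: plain (tag, a, b) records and an explicit branch on the tag.
CIRCLE, RECT = 0, 1


def total_area_tagged(records):
    total = 0.0
    for tag, a, b in records:
        if tag == CIRCLE:
            total += math.pi * a ** 2
        elif tag == RECT:
            total += a * b
        else:
            raise ValueError(f"unknown shape tag: {tag}")
    return total
```

    The tagged version needs no indirect calls, but every new shape means editing the branch, which is one reading of the "cost" mentioned above.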

    The lack of information about TMUs/ROPs doesn't mean they suddenly gimped the card. They wouldn't have included a 384-bit memory bus if they were going to gimp on texturing or fillrate. I would wait and see. Then again, AMD had a massive 512-bit bus on the R520, but totally skimped out on TMUs.
     
  17. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,699
    Likes Received:
    117
    Yeah, based on the information available, I'd estimate it will be at least ~45% faster than HD5870, which seems reasonable for a 40% greater die size.
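
    For what those two numbers imply about performance per unit of die area, here is a quick sketch. Both inputs are the estimates from this post, not measurements, and the function name is made up for illustration.

```python
def relative_perf_per_area(perf_ratio, area_ratio):
    """Performance per unit die area, relative to the baseline chip."""
    return perf_ratio / area_ratio


# Estimates from the post: ~45% faster on a ~40% larger die than HD5870.
fermi_vs_hd5870 = relative_perf_per_area(1.45, 1.40)
print(f"perf per area relative to HD5870: {fermi_vs_hd5870:.3f}")
```

    On these assumptions the two chips would be within a few percent of each other in performance per mm^2.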
     
  18. FUDie

    Regular

    Joined:
    Sep 25, 2002
    Messages:
    581
    Likes Received:
    34
    R520 had a 256-bit bus. R600 is the chip you're looking for.

    -FUDie
     
  19. seahawk

    Regular

    Joined:
    May 18, 2004
    Messages:
    511
    Likes Received:
    141
    That was tech day, not aimed at the gaming consumer. For those with an interest in what they showed it was an awesome day, for the others ... well wait till the gamer launch comes.
     
  20. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    Yeah, my memory is starting to get faulty over these years.
     