NVIDIA Fermi: Architecture discussion

Discussion in 'Architecture and Products' started by Rys, Sep 30, 2009.

  1. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    A silicon respin could certainly improve upon many performance and efficiency metrics. However, a bigger question, at least for me, is whether there is a reasonable probability of this happening, assuming Fermi 2/Fermi's shrink is due this winter.
     
  2. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,491
    Likes Received:
    909
    There can't be any shrink this winter, since TSMC's 40nm process is the smallest available.
     
  3. NathansFortune

    Regular

    Joined:
    Mar 3, 2009
    Messages:
    559
    Likes Received:
    0
    TSMC and GlobalFoundries have stated 28nm isn't going to be ready until H2 2011. That means Nvidia need to get more mileage out of their current designs.
     
  4. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    1,319
    Likes Received:
    23
    Location:
    msk.ru/spb.ru
    GF100B (or whatever they'll call it in the end) is what's coming in the Fall. There may be a more or less new (still Fermi-based at its core) 40G GF100B replacement down the road, but its fate will depend on a lot of factors, and I won't be surprised if they wait for 28HP for their next top-end GPU.
     
  5. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    Must have missed it. Do you have a recent link?

    Also, it means at best, we can expect a hybrid part this year from AMD.
     
  6. NathansFortune

    Regular

    Joined:
    Mar 3, 2009
    Messages:
    559
    Likes Received:
    0
    I can't find the link, but it was from the TSMC Fab 15 article somewhere. The CEO said 40nm was their concern right now and 28nm is delayed until later in 2011, probably H2. GlobalFoundries have a similar outlook, as 32nm for AMD and ARM is their primary concern.
     
  7. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    How interesting, as the People's Republic of China's domestic CPU industry (Loongson processors) has stated that it aims for 32nm at the end of 2011 :).
     
  8. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,491
    Likes Received:
    909
    As far as I'm aware, this is GlobalFoundries' latest public roadmap:

    [IMG]

    And I haven't heard of any changes to the 28nm schedule since then.
     
  9. aaronspink

    Veteran

    Joined:
    Jun 20, 2003
    Messages:
    2,641
    Likes Received:
    64
    They've stated a lot of things over time, and they contract out the fabrication to non-Chinese companies.
     
  10. Man from Atlantis

    Regular

    Joined:
    Jul 31, 2010
    Messages:
    732
    Likes Received:
    6
    GF104 vs. GF100 core architecture comparison
    http://news.mydrivers.com/Img/20100730/02501268.jpg
    GF104 SM architecture (partly speculative)
    http://news.mydrivers.com/Img/20100730/02503995.jpg
    GF100 SM architecture
    http://news.mydrivers.com/Img/20100730/02504021.jpg
    Evolution of NVIDIA graphics cores in recent years
    http://news.mydrivers.com/Img/20100730/02521912.jpg
    G80, GT200, GF100, GF104: core memory and multithreading comparison
    http://news.mydrivers.com/Img/20100730/02521937.jpg
     
  11. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,429
    Likes Received:
    428
    Location:
    New York
    Did Nvidia beef up GF104's texture units? Was just browsing Damien's English review and it seems FP16 and RGB9E5 are now full speed as opposed to half speed on GF100.

    [IMG]
     
  12. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,491
    Likes Received:
    909
  13. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,429
    Likes Received:
    428
    Location:
    New York
    Of course, thanks. Saw it on my second read through :) Wonder why they bothered.
     
  14. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,886
    Likes Received:
    211
    Location:
    Seattle, WA
    My first guess would be that it was something that was intended for the GF100 all along, but there was a bug in the hardware implementation that forced them to implement these modes with reduced performance.

    As for why they would have wanted to go this route in the first place, well, that would make sense if they feel that these modes will become more and more common as time goes forward, and if the added hardware cost was minimal.
     
  15. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,012
    Likes Received:
    112
    Maybe the full-speed fp16 was just a later addition which didn't make it for GF100.
    That said, it would imho make more sense for GF100 than GF104, since GF100 has a lower tex:alu ratio (and also higher memory bandwidth / tex). Unless you think it doesn't matter for GF100 since it looks more useful for non-gaming uses anyway.
     
  16. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    3,984
    Likes Received:
    34
    Interesting that the fp formats have seen performance increases from GF100->GF104, but the int formats have seen performance decreases. Also, there appears to be a hard cap @ 33.3 GTexels/s for 3 of the formats. Any thoughts as to what might be causing this? Is it a lack of cache or cache bandwidth? Some other architectural limitation? I don't think it's a lack of VRAM or VRAM bandwidth since GF104 out-performs GT200b in 2 of the 3 formats.
     
  17. TKK

    TKK
    Newcomer

    Joined:
    Jan 12, 2010
    Messages:
    148
    Likes Received:
    0
    Also, if that were the case, there should be a difference between the two GTX 460 variants, and there isn't.
     
  18. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    It's the theoretical max throughput of the 56 TMUs * 0.675 GHz = 37.8 GTexel/s. Obviously the efficiency (88%) is slightly lower than on AMD GPUs (~98% or so) for these simple tasks.
     
  19. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,012
    Likes Received:
    112
    I think the more interesting comparison is GTX470/480: 60 TMUs * 0.7 GHz = 42 GTexels/s, and it is achieving 41.4 GTexels/s (for int8 only though), i.e. 99%. So for some odd reason GF104 achieves less of the peak potential of its TMUs.
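
For readers who want to re-run the arithmetic from the last few posts, here is a minimal sketch. It assumes the model the posters use implicitly (one filtered texel per TMU per clock), and the function names are made up for illustration. The TMU counts, clocks, and measured rates are the ones quoted in the thread.

```python
def peak_gtexels(tmus: int, clock_ghz: float) -> float:
    """Theoretical texel rate, assuming one texel per TMU per clock."""
    return tmus * clock_ghz


def efficiency(measured_gtexels: float, peak: float) -> float:
    """Fraction of the theoretical peak that was actually measured."""
    return measured_gtexels / peak


# Figures quoted in the thread (GTexels/s).
gf104_peak = peak_gtexels(56, 0.675)  # GTX 460
gf100_peak = peak_gtexels(60, 0.700)  # GTX 480

gf104_eff = efficiency(33.3, gf104_peak)  # observed "hard cap"
gf100_eff = efficiency(41.4, gf100_peak)  # int8 result

print(f"GF104: peak {gf104_peak:.1f} GTexels/s, achieved {gf104_eff:.0%}")
print(f"GF100: peak {gf100_peak:.1f} GTexels/s, achieved {gf100_eff:.0%}")
```

This only restates the numbers already given; it says nothing about why GF104 falls short of its peak.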
     
  20. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,797
    Likes Received:
    2,056
    Location:
    Germany
    I'm showing (almost) the same here. 33.8 GTex/s is the maximum I can get out of a stock GF104 with bilinear filtering. With trilinear it's a more expected 18.9 GTex/s. Together with the point sampling result of (again) 33.8 GTex/s, I'm guessing it's maybe interpolation or address bound.

    An HD5830 is literally miles away at 43.6 and 22.4 GTex/s.
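
A rough sketch of why the trilinear figure looks "more expected" than the bilinear one. It assumes trilinear filtering runs at half the bilinear rate (two bilinear lookups per sample), which is a common rule of thumb and not something stated in the thread. The GF104 peak is the 56 TMUs * 0.675 GHz figure from earlier, and the names below are hypothetical.

```python
GF104_PEAK = 56 * 0.675  # GTexels/s, theoretical bilinear peak from the thread

# Measured (bilinear, trilinear) rates in GTexels/s, as posted.
measurements = {
    "GF104 (stock)": (33.8, 18.9),
    "HD5830": (43.6, 22.4),
}


def trilinear_ratio(bilinear: float, trilinear: float) -> float:
    """Trilinear rate as a fraction of the bilinear rate."""
    return trilinear / bilinear


for name, (bi, tri) in measurements.items():
    print(f"{name}: trilinear runs at {trilinear_ratio(bi, tri):.1%} of bilinear")

# Assumption: trilinear peak is half the bilinear peak.
gf104_bi_vs_peak = 33.8 / GF104_PEAK
gf104_tri_vs_half_peak = 18.9 / (GF104_PEAK / 2)
print(f"GF104 bilinear: {gf104_bi_vs_peak:.1%} of peak")
print(f"GF104 trilinear: {gf104_tri_vs_half_peak:.1%} of assumed half-rate peak")
```

Under that assumption, the trilinear result sits right at its expected ceiling while bilinear and point sampling stall below theirs, which is consistent with the guess that something other than filtering throughput is the limit.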
     
