NVIDIA Maxwell Speculation Thread

Discussion in 'Architecture and Products' started by Arun, Feb 9, 2011.

  1. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    The thing about virtualization is that it's implemented in hardware that is non-negotiable as far as x86 CPU cores go and it's not a significant increase in hardware cost. It can be fused on or off irrespective of elements like the die, IO, and other physical parameters.

    There are use cases for low-end virtualization that some buyers will pay money for, and it's also a prerequisite for many workloads for which buyers will pay massive amounts of money.
    As such, there's some income on a higher-volume segment and income from very lucrative ones as well.

    For a DP-capable GPU, there isn't a good dividing line. A high-throughput DP device will need high bandwidth, but so does a high-performance gaming GPU.
    Their die sizes are going to be large no matter what, and a GPU can hit its TDP with either SP or DP. Various extras don't significantly change the fact that there are going to be two large dies, with the engineering and manufacturing costs that go into each distinct ASIC.

    A more economically established high-end niche might change that, as Nvidia's high-end Tesla chips seem to indicate, but AMD's not holding the high ground there.
    Even then, the compute market is very focused on cost/performance, which is something more measurable than cost/virtualization (this tends to be more binary). On top of that, GPU compute is itself undercut by CPU products that frequently get better utilization in various workloads and which still have a vastly superior software situation.

    Even when CPUs lose, they can fall back to very lucrative markets where they still win.
    A DP-specific GPU that loses can go nowhere if the hardware isn't also similarly at the top of its class in SP. However, if that's the case, you can't charge more for the DP hardware if it's always enabled, which negates the point of having it at all unless you jack up the prices on SP hardware.
     
  2. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,493
    Likes Received:
    474
    No GPU would have high DP rates at this point if there wasn't a market to pay for it. 1/4 or 1/2 rate DP adds significant cost to a GPU.
     
  3. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    IMG doesn't do HSR before tessellation.
     
  4. kalelovil

    Regular

    Joined:
    Sep 8, 2011
    Messages:
    568
    Likes Received:
    104
    As a side-note, I know of one (modern) instance where CPU floating point throughput was 'gimped' to provide market stratification for a single die.
    AMD's 'Caspian' mobile CPU.
    http://techreport.com/news/17567/amd-intros-new-notebook-platform-with-45nm-cpus
    It can and has been done.
     
  5. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,493
    Likes Received:
    474
    I never said they do, though I'm not sure how you would know, considering IMG hasn't discussed their tessellation implementation. I was speaking of a hypothetical tiling architecture and what's possible. This has gotten off topic though, so we should leave this tangent.
     
  6. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,296
    Location:
    Helsinki, Finland
    Just wanted to say that HSR before tessellation is very much doable (and is already used in some rendering engines). However, it would be pretty hard for the GPU to do on its own, since it needs to know the maximum distance the vertices can move. To allow GPUs to do this, a feature similar to DX11's "conservative depth output" could be introduced to give the GPU the guarantees it needs (to use the hi-z / tiling buffer to cull patches).
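    To make the idea concrete, here is a rough sketch of the culling test. It assumes the application supplies a guaranteed upper bound on vertex displacement and that larger depth means farther from the camera. The names `PatchBounds` and `can_cull_patch` are made up for illustration and do not correspond to any real API. It only handles movement along the depth axis; a real test would also have to grow the patch's screen-space footprint by the same bound.

```python
from dataclasses import dataclass


@dataclass
class PatchBounds:
    """Hypothetical per-patch data, known before tessellation runs."""
    nearest_depth: float      # smallest depth among the undisplaced control points
    max_displacement: float   # application-guaranteed bound on vertex movement


def can_cull_patch(patch: PatchBounds, tile_farthest_occluder_depth: float) -> bool:
    """Return True only if the patch is hidden even in the worst case.

    The nearest the displaced surface can possibly get is the undisplaced
    nearest depth minus the displacement bound. If even that point lies
    behind the farthest occluder stored for the tile (a hi-z style value),
    no tessellated vertex can be visible, so the patch can be skipped.
    """
    conservative_nearest = patch.nearest_depth - patch.max_displacement
    return conservative_nearest > tile_farthest_occluder_depth


# A patch at depth 10 that can move by at most 1 stays behind an occluder at 8.
hidden = can_cull_patch(PatchBounds(10.0, 1.0), 8.0)
# With a bound of 3 it could reach depth 7, so it has to be tessellated.
maybe_visible = not can_cull_patch(PatchBounds(10.0, 3.0), 8.0)
```

    The looser the displacement bound, the fewer patches get culled, which is why the guarantee has to come from the application rather than being guessed by the hardware.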
     
  7. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
  8. Xmas

    Xmas Porous
    Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,344
    Likes Received:
    176
    Location:
    On the path to wisdom
    http://www.khronos.org/registry/gles/extensions/EXT/EXT_primitive_bounding_box.txt
     
  9. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    They have filed patents for it though.
     
  10. xDxD

    Regular

    Joined:
    Jun 7, 2010
    Messages:
    412
    Likes Received:
    1
  11. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    That's more likely, but the TMU count is still erroneous.
     
  12. xDxD

    Regular

    Joined:
    Jun 7, 2010
    Messages:
    412
    Likes Received:
    1
    Perhaps GPU-Z is wrong?
     
  13. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    That's the old GPU-Z 0.7.7; it calculates this GPU as Kepler 1xx: 138 = 1664 / 192 * 16.
    He should have used 0.7.9, which has GM204 support.
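    The arithmetic behind the bogus TMU figure can be sketched as follows. The Kepler ratio (192 cores and 16 TMUs per SM) is taken from the formula in this post. The Maxwell ratio (128 cores and 8 TMUs per SMM) is not from the post and is added here as an assumption for comparison. Rounding down to get 138 is also an assumption about how the tool handles the fraction, and `tmu_estimate` is a hypothetical helper, not anything GPU-Z exposes.

```python
def tmu_estimate(shader_cores: int, cores_per_sm: int, tmus_per_sm: int) -> int:
    """Estimate the TMU count from the core count and a per-SM ratio.

    Integer arithmetic is used so the result is rounded down, which
    matches the 138 quoted above (1664 / 192 * 16 is about 138.67).
    """
    return shader_cores * tmus_per_sm // cores_per_sm


# Kepler-style ratio, as applied by the older tool version.
kepler_style = tmu_estimate(1664, 192, 16)

# Assumed Maxwell-style ratio, for comparison only.
maxwell_style = tmu_estimate(1664, 128, 8)

print(kepler_style, maxwell_style)
```

    Under the assumed Maxwell ratio, 1664 cores also divide evenly into 13 SMs, which lines up with the 13-unit figure discussed later in the thread.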
     
  14. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,112
    Location:
    New York
    So 16 CUs total with 3 disabled on the 970? Any guesses on die size? I figure <= 350mm^2.
     
  15. xDxD

    Regular

    Joined:
    Jun 7, 2010
    Messages:
    412
    Likes Received:
    1
    Isn't that too big a difference between the 970 and 980? What if that card is a 960 (or the 980 has 15 CUs)?
     
  16. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    You missed the GM204 PCB leak? http://www.techpowerup.com/202714/is-this-the-first-picture-of-geforce-gtx-880.html
    This is more like ~400mm².

    The driver they run these cards with says "GTX 970", and there are also shop listings for the 970 and 980.
    GM206 will probably be end of 2014 / early 2015. The 35x35mm package chips first shipped at Zauba in August, so probably 3 months to go.

    Here is an N16E-GT (GTX 970M?) which uses only 10 CUs/SMMs: http://compubench.com/device.jsp?benchmark=compu20&os=Windows&api=cl&D=NVIDIA+N16E-GT&testgroup=info
    Maybe there are other yield strategies these days; see Tonga in the R9 285...
     
  17. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,112
    Location:
    New York

    Depends on the clocks I guess. At similar clocks it's about a 20% deficit. The 7950 at launch was ~25% behind the 7970. 670 was ~20% behind the 680. So even at slightly lower clocks a 13 CU 970 falls in that range.
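    For reference, the "about 20%" figure is just the ratio of enabled units, assuming throughput scales linearly with unit count at equal clocks (a simplification, since bandwidth and ROPs are ignored). `unit_deficit` is a hypothetical helper used only for this illustration.

```python
def unit_deficit(enabled_units: int, total_units: int) -> float:
    """Fractional shortfall of a cut-down part versus the full chip.

    Assumes performance is proportional to the number of enabled
    units and that both parts run at the same clock.
    """
    return 1.0 - enabled_units / total_units


# 13 of 16 units enabled, as speculated for the 970 versus the 980.
deficit_970 = unit_deficit(13, 16)
print(f"{deficit_970:.2%}")
```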
     
  18. xDxD

    Regular

    Joined:
    Jun 7, 2010
    Messages:
    412
    Likes Received:
    1
    Interesting, thank you
     
  19. tviceman

    Newcomer

    Joined:
    Mar 6, 2012
    Messages:
    191
    Likes Received:
    0
    Wow, only 15 SMMs? I thought for sure it'd have 20.
     