Nvidia Volta Speculation Thread

Discussion in 'Architecture and Products' started by DSC, Mar 19, 2013.

  1. Anarchist4000

    Veteran

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    It'd be highly situational: ALU-heavy code with limited occupancy.

    INT32 at full rate is more practical for addressing: a shift away from lower precision and hidden addressing units, with unified memory growing the addressing demands. Larger pools and more frequent indexing into them.

    Probably fewer FP32 units as they shrink the die in line with GP102. GV102, I'd have to imagine, is the top gaming chip at nearly half the size of GV100. It would be interesting to see a cost analysis of the double exposure. I could see going smaller than GP102 to ensure four HBM2 stacks fit: lots of bandwidth and capacity for the pro variants. For pro and compute work, bandwidth and capacity would be more practical than raw TFLOPs.
     
  2. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    So you know the bytes*-per-FLOP ratio of GV102 already?

    *both total and per second
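    For the curious, here is how that ratio works out for the one big HBM2 part with published numbers. This is only a sketch using Nvidia's public Tesla P100 (SXM2) figures; GV102's numbers are obviously unknown.

    ```python
    # Bytes-per-FLOP sketch using public Tesla P100 (SXM2) figures.
    # GV102's figures are unknown; this only illustrates the ratio itself.
    bandwidth_gbs = 732.0   # HBM2 bandwidth in GB/s (Nvidia spec sheet)
    fp32_tflops = 10.6      # peak FP32 throughput in TFLOPS

    bytes_per_flop = bandwidth_gbs / (fp32_tflops * 1000.0)
    print(f"{bytes_per_flop:.3f} bytes per FLOP")  # about 0.069
    ```

    The "total" half of the footnote (capacity per FLOPS) would be a separate ratio, e.g. 16 GB over those 10.6 TFLOPS.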
     
  3. pharma

    Veteran

    Joined:
    Mar 29, 2004
    Messages:
    4,891
    Likes Received:
    4,539
  4. Infinisearch

    Veteran

    Joined:
    Jul 22, 2004
    Messages:
    779
    Likes Received:
    146
    Location:
    USA
    Thanks.
     
  5. Anarchist4000

    Veteran

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    That ratio need not be consistent. I'm only stating that some compute-heavy workloads could warrant a higher ratio. A Titan-tier chip could go four stacks just because it can. It could always look like a Vega 10, but I'd guess it ends up more like P100 without FP64. Avoid the double exposure and make it fit within traditional reticle dimensions. Perhaps even relaxed a bit for better yields, but keeping the Tensor cores, where bandwidth is more of a concern. V100 is so large it's essentially a new tier. Those HBM2 stacks wouldn't need to be clocked super high either: 8GB stacks at 75% clocks, or even less, would be very flexible and in line with existing models. Pair Tensor-disabled, cut-down chips with slower HBM2.
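    As a sketch of what "75% clocks" would buy, assuming the nominal 2.0 Gbps/pin HBM2 rate and 1024-bit stacks (the actual speed bins Nvidia would buy are unknown):

    ```python
    # HBM2 bandwidth at reduced clocks -- a sketch assuming the nominal
    # 2.0 Gbps/pin rate and 1024-bit-wide stacks; real bins may differ.
    pin_rate_gbps = 2.0      # transfer rate per pin, Gbit/s
    bus_width_bits = 1024    # per HBM2 stack
    stacks = 4
    clock_fraction = 0.75    # "75% clocks"

    gbs = pin_rate_gbps * bus_width_bits / 8 * stacks * clock_fraction
    print(f"{gbs:.0f} GB/s")  # 768 GB/s from four down-clocked stacks
    ```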
     
  6. MDolenc

    Regular

    Joined:
    May 26, 2002
    Messages:
    696
    Likes Received:
    446
    Location:
    Slovenia
    So HBM2 in a GV102 is a given? :razz: I would not be so sure on that one.
     
    CarstenS and DavidGraham like this.
  7. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    It's GDDR6; Hynix already announced they are incorporating it in a high-end GPU, and they meant NVIDIA.
     
    Grall, pharma and sonen like this.
  8. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Actually, you don't (or didn't), and it would not make any sense either, but anyway.

    Since it's not AMD and I'll be safe from anklebiting, I would say that, for all we know about the current situation, Nvidia would be quite stupid to use HBM2 right now on something we envision as GV102 - both from a margin and a competitive point of view, and especially in light of G(ddr)6 being announced for early next year.
     
    pharma likes this.
  9. ImSpartacus

    Regular

    Joined:
    Jun 30, 2015
    Messages:
    252
    Likes Received:
    199
    Wasn't it generally agreed that the 384-bit GDDR6 graphics card mentioned by SK Hynix was probably GV102?

    https://www.anandtech.com/show/1129...gddr6-memory-for-graphics-cards-in-early-2018

    I mean, all but one Titan have released in Q1 (the outlier being August 2016's Titan X).

    • Titan - 2/2013
    • Titan Black - 2/2014
    • Titan Z - 3/2014
    • Titan X - 3/2015
    • Titan X - 8/2016
    • Titan Xp - 4/2017

    So it seems like Nvidia will probably debut Volta by replacing the Titan Xp in Q1 2018 with a slightly cut down GV102 and its 384-bit bus.

    EDIT: It just dawned on me that April isn't Q1, so the Titan Xp released in early Q2. You get the idea.
     
  10. Anarchist4000

    Veteran

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    Perhaps GV102 isn't the correct code here, but whatever is a step down from V100. As I said above, there is enough of a size gap to fit a whole new SKU (~471 mm² to 815 mm²). So GDDR6 and HBM2 variants may exist between the expected V100 and GV104 parts. P100 was the last chip Nvidia designed around that size, and they chose HBM2. GDDR6 at that scale would lack capacity, power efficiency, and bandwidth.

    In the case of Tensor operations, Google indicated, as expected, that bandwidth was the limiting factor. Even Volta is losing badly to the TPU2 in terms of efficiency and likely price. A cheaper server card that retains much of the bandwidth and capacity, without the oversized die, would make sense. The alternative leaves a really large gap in the lineup. GDDR6 is just too limited at serious performance levels. Nvidia would be left with a 12GB card facing 16/32GB Vega variants with better power efficiency, memory capacity/bandwidth, and a unified memory implementation on x86. It still seems likely a licensing issue exists there, and I doubt Intel is feeling generous.
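    To put rough numbers on the bandwidth claim (a sketch using SK Hynix's announced GDDR6 target of 16 Gbps/pin on a 384-bit bus, against Nvidia's quoted V100 figure):

    ```python
    # Bandwidth sketch: announced GDDR6 target vs. V100's four HBM2 stacks.
    gddr6_gbs = 16 * 384 / 8   # 16 Gbps/pin on a 384-bit bus
    hbm2_gbs = 900             # Nvidia's quoted V100 bandwidth, GB/s
    print(f"GDDR6 {gddr6_gbs:.0f} GB/s vs HBM2 {hbm2_gbs} GB/s")
    ```

    On this sketch GDDR6 lands within about 15% on raw bandwidth; the capacity and power arguments are separate.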
     
  11. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    GP100 was top-of-the-line, as is GV100; no difference there. Of course there could be another SKU at 600 mm², but nothing from the earlier Pascal strategy indicates that.
    Google was limited while using 256-bit DDR3 (edit: no G, plain DDR3) with a transfer rate of 30 GB/s, which is hardly comparable.
     
    #631 CarstenS, Sep 26, 2017
    Last edited: Sep 27, 2017
  12. xpea

    Regular

    Joined:
    Jun 4, 2013
    Messages:
    551
    Likes Received:
    783
    Location:
    EU-China
    Not in my reality. V100 is 120 TFlops at 300W vs 45 TFlops at 250W for the TPU2. In fact, Google is investing more and more in V100 HGX racks.
    But anyway, it doesn't matter anymore; the TPU team left Google to start Groq. So for now, the TPU project is dead and has no future at Google...
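    Taking those numbers at face value (peak ratings, not measured throughput), the perf-per-watt gap works out as:

    ```python
    # Perf-per-watt from the figures quoted above; these are peak
    # ratings from marketing material, not measured throughput.
    v100_tflops_per_w = 120 / 300   # Tensor peak over board power
    tpu2_tflops_per_w = 45 / 250    # claimed TPU2 figures
    print(f"V100 {v100_tflops_per_w:.2f} vs TPU2 {tpu2_tflops_per_w:.2f} TFLOPS/W")
    ```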

    Vega and power efficiency in the same sentence :runaway: (please, no undervolting nonsense; Nvidia cards undervolt too)
    I don't know in which parallel world you are living, but here on Earth, Vega with HBM2 barely competes with GDDR5 Pascal, and you already make it a winner against an unannounced GDDR6 Volta part. Seriously?
    [snipped insult]
     
    #632 xpea, Sep 27, 2017
    Last edited by a moderator: Sep 28, 2017
    CSI PC, Picao84 and pharma like this.
  13. xpea

    Regular

    Joined:
    Jun 4, 2013
    Messages:
    551
    Likes Received:
    783
    Location:
    EU-China
    On a more factual note, Nvidia announced at GTC 2017 Beijing that all big three Chinese IT giants - Alibaba, Baidu and Tencent - are adopting Volta HGX in their datacenters.
    Hardware suppliers are Huawei, Inspur and Lenovo:
    http://www.marketwatch.com/story/nv...sed-chips-with-chinese-tech-giants-2017-09-25
     
    Grall and pharma like this.
  14. gamervivek

    Regular

    Joined:
    Sep 13, 2008
    Messages:
    805
    Likes Received:
    320
    Location:
    india
    Hopefully they are better at it than they were with the 1GHz HBM2.

    No need for a Titan V if the GTX 2080 does the job. GV104 being the Titan card would be something, though.

    Nvidia cards can undervolt, but to what degree? My custom GTX 1070 can use around 30% more power than the 1070 FE while only being 10% faster; Vega 56 uses another 10% over the former at similar performance. If I push the power limit slider on V56 to -10%, it loses 30 MHz on the GPU, which is around 2.3%. It's quite obvious that undervolting the GTX 1070 FE wouldn't lead to similar efficiency gains as on the custom 1070, and I think that holds for Vega 56 as well.
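    Putting those percentages together (treating the 1070 FE as the 1.0 baseline and taking the figures above at face value):

    ```python
    # Relative perf/W sketch from the rough percentages quoted above;
    # the 1070 FE is the (perf, power) = (1.0, 1.0) baseline.
    cards = {
        "1070 FE":     (1.00, 1.00),
        "custom 1070": (1.10, 1.30),         # 10% faster, ~30% more power
        "Vega 56":     (1.10, 1.30 * 1.10),  # similar perf, +10% power again
    }
    for name, (perf, power) in cards.items():
        print(f"{name}: {perf / power:.2f}x perf/W")
    ```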

    There is the die-size factor to account for, but the efficiency story would be much different if AMD had been first to market with Vega and hadn't had to push the chip, and Nvidia hadn't had GP102 to easily give them the performance crown.
     
    Lightman likes this.
  15. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    The story would be different if AMD didn't launch Vega (a uarch that heavily relies on software) with half-baked software surrounding it.
    On a scale of 1 to Fermi, this is a Fermi in terms of bad launches.
     
    xpea likes this.
  16. Picao84

    Veteran

    Joined:
    Feb 15, 2010
    Messages:
    2,109
    Likes Received:
    1,195
    I guess we can finish this silly "IF" argument with "If nVIDIA did not exist". :roll:
     
    xpea and DavidGraham like this.
  17. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Actually, we should finish this silly "IF" argument with "if shareholders never existed".
     
    Picao84 likes this.
  18. seahawk

    Regular

    Joined:
    May 18, 2004
    Messages:
    511
    Likes Received:
    141
    The highest power consumption measured by Tom's Hardware for a 1070 was 200W. RX Vega 56 with BIOS 1, Balanced, measured in at 220W.
     
  19. xpea

    Regular

    Joined:
    Jun 4, 2013
    Messages:
    551
    Likes Received:
    783
    Location:
    EU-China
    Google "Nvidia Max-Q laptop reviews" and you will see that Pascal can be much more power-efficient than what we see on desktop cards.
     
    pharma and DavidGraham like this.
  20. Anarchist4000

    Veteran

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    Why would it need to be undervolted? That would surely help, but nobody is undervolting in the server market. If anything, they simply run cards slower to be more power efficient. The savings with different memory types are well established. [snipped insult]

    So your reality broke down and you substituted figures to make it accurate? Take four chips, disable three of them, then presume the one remaining chip still consumes the same amount of power? I suppose that's one way to make 180 TFLOPs @ 250W less than 120 TFLOPs @ 300W.

    Google is investing in racks because there will always be some customers using different software. They're investing in AMD racks as well.

    [snip sniping]. Compute and graphics are entirely separate areas. Vega is already beating P100 in some tests, as expected, yet you feel Nvidia's GDDR5 offerings are superior to even their largest chip? [snipped some more]
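    For what it's worth, the 45-vs-180 discrepancy in this exchange is chip versus board: Google quoted 45 TFLOPS per TPU2 chip with four chips per board, and per-chip power was never published, which is exactly the number being argued over here.

    ```python
    # TPU2 chip-vs-board arithmetic from Google's quoted figures.
    # Per-chip power is not public, so no TFLOPS/W is computed here.
    tflops_per_chip = 45
    chips_per_board = 4
    print(tflops_per_chip * chips_per_board)  # 180 TFLOPS per board
    ```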
     
    #640 Anarchist4000, Sep 28, 2017
    Last edited by a moderator: Sep 28, 2017
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.