Nvidia Volta Speculation Thread

Discussion in 'Architecture and Products' started by DSC, Mar 19, 2013.

  1. Anarchist4000

    Anarchist4000 Veteran

    It'd be highly situational, ALU heavy code with limited occupancy.

    INT32 at full rate is more practical for addressing. It's a shift away from lower precision and hidden addressing units, toward unified memory, as addressing demands grow: larger pools and more frequent indexing into them.

    Probably fewer FP32 units as they shrink the die in line with GP102. GV102, I'd have to imagine, is the top gaming chip at nearly half the size of GV100. Would be interesting to see a cost analysis of the double exposure. I could see going smaller than GP102 to ensure four HBM2 stacks fit. A lot of bandwidth and capacity for pro variants. For pro and compute work, bandwidth and capacity would be more practical than raw TFLOPS.
     
  2. CarstenS

    CarstenS Legend Subscriber

    So you know the Bytes*-per-FLOPS-ratio of GV102 already?

    *both total and per second
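    For concreteness, here's how such a ratio works out (a hypothetical sketch; the GB/s and TFLOPS figures are placeholders roughly in GP102's ballpark, since GV102's real specs were unknown at the time):

    ```python
    def bytes_per_flop(bandwidth_gb_s: float, tflops: float) -> float:
        """Memory bytes per second available per FLOP per second."""
        return (bandwidth_gb_s * 1e9) / (tflops * 1e12)

    # Placeholder GP102-class figures for illustration: ~480 GB/s, ~11 TFLOPS FP32.
    ratio = bytes_per_flop(480.0, 11.0)
    print(f"{ratio:.4f} bytes/FLOP")  # about 0.04 bytes of bandwidth per FLOP
    ```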
     
  3. pharma

    pharma Veteran

  4. Infinisearch

    Infinisearch Veteran

    Thanks.
     
  5. Anarchist4000

    Anarchist4000 Veteran

    That ratio need not be consistent. I'm only stating that some compute-heavy workloads could warrant a higher ratio. A Titan-tier chip could go four stacks just because it can. It could always look like a Vega 10, but I'd guess it ends up more like P100 without FP64. Avoid the double exposure and make it fit within traditional reticle dimensions. Perhaps even relaxed a bit for better yields, but keeping the Tensor cores, where bandwidth is more of a concern. V100 is so large it's essentially a new tier. Those HBM2 stacks wouldn't need to be clocked super high either: 8GB stacks at 75% or even lower clocks would be very flexible and in line with existing models. Pair Tensor-disabled, cut chips with slower HBM2.
     
  6. MDolenc

    MDolenc Regular

    So HBM2 in a GV102 is a given? :razz: I would not be so sure on that one.
     
    CarstenS and DavidGraham like this.
  7. DavidGraham

    DavidGraham Veteran

    It's GDDR6; Hynix already announced they are incorporating it in a high-end GPU, and they meant NVIDIA.
     
    Grall, pharma and sonen like this.
  8. CarstenS

    CarstenS Legend Subscriber

    Actually, you don't (or didn't), and it wouldn't make any sense either, but anyway.

    Since it's not AMD and I'll be safe from ankle-biting, I would say that, for all we know about the current situation, Nvidia would be quite stupid to use HBM2 right now on something we envision as GV102 - both from a margin and a competitive point of view, and especially in light of G(DDR)6 being announced for early next year.
     
    pharma likes this.
  9. ImSpartacus

    ImSpartacus Regular

    Wasn't it generally agreed that the 384-bit GDDR6 graphics card mentioned by SK Hynix was probably GV102?

    https://www.anandtech.com/show/1129...gddr6-memory-for-graphics-cards-in-early-2018

    I mean, all but one Titan have been released in Q1 (the outlier being August 2016's Titan X).

    • Titan - 2/2013
    • Titan Black - 2/2014
    • Titan Z - 3/2014
    • Titan X - 3/2015
    • Titan X - 8/2016
    • Titan Xp - 4/2017

    So it seems like Nvidia will probably debut Volta by replacing the Titan Xp in Q1 2018 with a slightly cut down GV102 and its 384-bit bus.

    EDIT It just dawned on me that April isn't Q1, so the Titan Xp released in early Q2. You get the idea.
     
  10. Anarchist4000

    Anarchist4000 Veteran

    Perhaps GV102 isn't the correct codename here, but whatever is a step down from V100. As I said above, there is enough of a size gap to fit a whole new SKU (~471mm2 to 815mm2). So GDDR6 and HBM2 variants may exist between the expected V100 and GV104 parts. P100 was the last chip Nvidia designed around that size, and they chose HBM2. GDDR6 at that scale would lack capacity, power efficiency, and bandwidth.

    In the case of Tensor operations, Google indicated, as expected, that bandwidth was the limiting factor. Even Volta is losing badly to the TPU2 in terms of efficiency and likely price. A cheaper server card that retains much of the bandwidth and capacity, without the oversized die, would make sense. The alternative leaves a really large gap in the lineup. GDDR6 is just too limited at serious performance levels. Nvidia would be left with a 12GB card facing 16/32GB Vega variants with better power efficiency, memory capacity/bandwidth, and a unified memory implementation on x86. It still seems likely a license issue exists there, and I doubt Intel is feeling generous.
     
  11. CarstenS

    CarstenS Legend Subscriber

    GP100 was top-of-the-line, as is GV100 - no difference there. Of course, there could be another SKU at 600 mm², but nothing in the earlier Pascal strategy indicates that.
    Google was limited while using 256-bit DDR3 (edit: not GDDR3) with a transfer rate of 30 GB/s, which is hardly comparable.
     
    Last edited: Sep 27, 2017
  12. xpea

    xpea Regular

    Not in my reality. V100 is 120 TFLOPS at 300W vs 45 TFLOPS at 250W for the TPU2. In fact, Google is investing more and more in V100 HGX racks.
    But anyway, it doesn't matter anymore; the TPU team left Google to start Groq. So for now, the TPU project is dead and has no future at Google...
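    Taking the quoted peak numbers at face value (these are vendor figures, not measured throughput), the perf-per-watt comparison works out like this:

    ```python
    # Vendor peak figures quoted above; real-world throughput will differ.
    specs = {
        "V100": {"tflops": 120, "watts": 300},
        "TPU2": {"tflops": 45, "watts": 250},
    }
    for name, s in specs.items():
        print(f"{name}: {s['tflops'] / s['watts']:.2f} TFLOPS/W")
    # V100: 0.40 TFLOPS/W, TPU2: 0.18 TFLOPS/W on these figures
    ```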

    Vega and power efficiency in the same sentence :runaway: (please, no undervolting nonsense - Nvidia cards undervolt too).
    I don't know which parallel world you are living in, but here on Earth, Vega with HBM2 barely competes with GDDR5 Pascal, and you already make it a winner against an unannounced GDDR6 Volta part. Seriously?
    [snipped insult]
     
    Last edited by a moderator: Sep 28, 2017
    CSI PC, Picao84 and pharma like this.
  13. xpea

    xpea Regular

    On a more factual note, Nvidia announced at GTC 2017 Beijing that all three big Chinese IT giants - Alibaba, Baidu, and Tencent - are adopting Volta HGX in their datacenters.
    Hardware suppliers are Huawei, Inspur, and Lenovo:
    http://www.marketwatch.com/story/nv...sed-chips-with-chinese-tech-giants-2017-09-25
     
    Grall and pharma like this.
  14. gamervivek

    gamervivek Regular

    Hopefully they are better at it than they were with the 1GHz HBM2.

    No need for a Titan V if the GTX 2080 does the job. GV104 being the Titan card would be something, though.

    Nvidia cards can undervolt, but to what degree? My custom GTX 1070 can use around 30% more power than the 1070 FE while only being 10% faster; Vega 56 uses another 10% over the former at around similar performance. If I push the power limit slider on V56 to -10%, it loses 30 MHz on the GPU, which is around 2.3%. It's quite obvious that undervolting the GTX 1070 FE wouldn't lead to similar gains in efficiency as the custom 1070, and I think that holds for Vega 56 as well.
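    Turning those percentages into rough perf-per-watt numbers (a back-of-the-envelope sketch; the 1070 FE is normalized to 1.0 on both axes, and the deltas are the ones quoted above):

    ```python
    # Normalized to the 1070 FE; +30%/+10% figures are from the post above.
    cards = {
        "1070 FE":     {"power": 1.00, "perf": 1.00},
        "custom 1070": {"power": 1.30, "perf": 1.10},         # +30% power, +10% perf
        "Vega 56":     {"power": 1.30 * 1.10, "perf": 1.10},  # another +10% power
    }
    for name, c in cards.items():
        print(f"{name}: {c['perf'] / c['power']:.2f} relative perf/W")
    ```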

    There is the die-size factor to account for, but the story would be much different regarding efficiency if AMD were the first to market with Vega and didn't have to push the chip and nvidia didn't have the GP102 to easily give them the performance crown.
     
    Lightman likes this.
  15. Bondrewd

    Bondrewd Veteran

    The story would be different if AMD didn't launch Vega (a uarch that heavily relies on software) with half-baked software surrounding it.
    On a scale of 1 to Fermi, this is Fermi in terms of bad launches.
     
    xpea likes this.
  16. Picao84

    Picao84 Veteran

    I guess we can finish this silly "IF" argument with "If nVIDIA did not exist". :roll:
     
    xpea and DavidGraham like this.
  17. Bondrewd

    Bondrewd Veteran

    Actually, we should finish this silly "IF" argument with "if shareholders never existed".
     
    Picao84 likes this.
  18. seahawk

    seahawk Regular

    The highest power consumption measured by Tom's Hardware for a 1070 was 200W. Vega 56 (BIOS 1, Balanced) measured in at 220W.
     
  19. xpea

    xpea Regular

    Google Nvidia Max-Q laptop reviews and you will see that Pascal can be much more power efficient than what we see in desktop cards.
     
    pharma and DavidGraham like this.
  20. Anarchist4000

    Anarchist4000 Veteran

    Why would it need to be undervolted? That would surely help, but nobody is undervolting in the server market. If anything, they simply run cards slower to be more power efficient. The savings with memory types are well established, [snipped insult]

    So your reality broke down and you substituted figures to make it accurate? Take four chips, disable three of them, then presume the one remaining chip still consumes the same amount of power? I suppose that's one way to make 180 TFLOPS @ 250W less than 120 TFLOPS @ 300W.
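    The disputed arithmetic, spelled out (this assumes, as the post does, that the 250W figure covers a full 4-chip TPU2 module; that assumption is exactly what's contested here):

    ```python
    # Per this post's assumption: 4 chips x 45 TFLOPS sharing the 250 W budget.
    tpu2_module_eff = (4 * 45) / 250  # 180 TFLOPS over 250 W
    v100_eff = 120 / 300
    print(f"TPU2 module: {tpu2_module_eff:.2f} TFLOPS/W")
    print(f"V100:        {v100_eff:.2f} TFLOPS/W")
    ```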

    Google is investing in racks because there will always be some customers using different software. They're investing in AMD racks as well.

    [snip sniping]. Compute and graphics are entirely separate areas. Vega is already beating P100 in some tests, as expected, yet you feel Nvidia's GDDR5 offerings are superior to even their largest chip? [snipped some more]
     
    Last edited by a moderator: Sep 28, 2017