Nvidia Pascal Speculation Thread

Discussion in 'Architecture and Products' started by DSC, Mar 25, 2014.

Thread Status:
Not open for further replies.
  1. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    I think the 10x is specifically for inter-GPU transfer. So PCIe against NVLink? The HBM is the 4x.
     
  2. spworley

    Newcomer

    Joined:
    Apr 19, 2013
    Messages:
    146
    Likes Received:
    190
    Jen-Hsun prefaced this section of his keynote with multiple caveats, saying even at the time he knew people would mischaracterize it.
    He had just introduced the deep-learning-focused DIGITS workstation with four Titan Xs. Pascal will have about 2X the FP32 flops of Maxwell, and 4X the FP16 flops. For neural net learning, FP16 is OK! Memory bandwidth is boosted by about 6X. So the overall speedup is about 5X (4X FLOPS, 6X bandwidth). He again stressed how rough this rounding was. And then the Pascal version of the DIGITS workstation can practically connect not 4 (via PCIe), but 8 GPUs (with NVLink), giving twice the number of GPUs per workstation. So that's 2X. 5X times 2X is 10X. The next generation of the DIGITS workstation will be very roughly 10X faster at deep learning training. And he followed up again by saying how this is rough CEO round-off handwaving math. The slide itself, shown earlier in this thread, even says in the corner "Very rough estimates!" though it's comically blocked by someone's raised arm in the photo.
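
    The rounding described above can be written out as a short calculation. This is only a sketch of the keynote's handwaving: the function name is made up for illustration, and averaging the FLOPS and bandwidth gains to reach "about 5X" is an assumption, since the post only gives the rounded result.

```python
# Rough reconstruction of the keynote arithmetic summarized above.
# Assumption: the "about 5X" per-GPU figure is the midpoint of the
# 4X FP16 FLOPS gain and the 6X bandwidth gain.

def rough_digits_speedup(flops_gain, bandwidth_gain, gpus_old, gpus_new):
    """Return the very rough workstation-level speedup estimate."""
    per_gpu_gain = (flops_gain + bandwidth_gain) / 2  # ~5X per GPU
    gpu_count_gain = gpus_new / gpus_old              # 8 GPUs vs 4 GPUs
    return per_gpu_gain * gpu_count_gain


estimate = rough_digits_speedup(
    flops_gain=4.0, bandwidth_gain=6.0, gpus_old=4, gpus_new=8
)
print(f"Estimated speedup: {estimate:.0f}X")
```

    With the GPU count held at four, the same function gives the 5X per-GPU figure, which is the number that applies to a single card.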

    IMHO, he still made a mistake with the inflammatory title on the slide. It doesn't matter that he carefully added so many caveats, it was still just overhype. But I did get the feeling he was honestly excited, and to be honest, if you can use FP16 (like neural nets can), Pascal really will be a 4x boost in just a year or two. That's pretty impressive.
     
    homerdog, xpea, pharma and 1 other person like this.
  3. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    9,974
    Likes Received:
    1,491
    I don't think the problem was 16-bit back with the FX. The problem was that Nvidia pushed 32-bit as an advantage vs 24-bit on AMD hardware and then failed to even match the performance with 16-bit enabled (behind the user's back) in the most popular game of that year.
     
  4. Newguy

    Regular Newcomer

    Joined:
    Nov 10, 2014
    Messages:
    256
    Likes Received:
    112
    He didn't make a mistake. He knew exactly what he was doing: he knew it would get misinterpreted and that "10x faster" would spread.
     
    firstminion likes this.
  5. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,413
    Likes Received:
    174
    Location:
    Chania
    You're barking up the wrong tree; I meant something completely different and far more recent, but it's not worth keeping the OT alive.
     
    pharma likes this.
  6. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,791
    Likes Received:
    2,049
    Location:
    Germany
    Now, this is how marketing works:
    - Nvidia showing a slide with an order of magnitude of improvement
    - Alongside, they offer a decent explanation of how they arrive at this number, including a cautionary statement that it's an estimate
    - Devilishly engineered, the slide has a hidden ability to switch off higher brain functions of roughly 98.73% of its audience (I suspect it has to do with the 10, rather than the ×)
    - Press & other people all over the world repeat the headline of the slide just because their brain functions are shut down.
    - Nvidia gets the headlines they want
    - Some of the 1.27% whose brain functions were not affected (through extensive counter-marketing training or inherent lack of said functions in the first place) start a debate on how much Nvidia is an evil company, with the devil as its CEO, who is running a suit company despite wearing a leather jacket (this must be the most devilish suit-in-disguise ever)
    - Nvidia goes into denial, stating (formally correct) that they explained convincingly how they arrived at this estimate
    - The "10×" sticks
    - Nvidia celebrates
    - 1.27-x % of the people reiterate this anecdote for decades to come, proving time and again how deceivingly devilish and abhorrently evil Nvidia is.

    So, what about the GeForce 3 being 7× faster than the GeForce 2? That was a really gross lie, right? ;)
     
    elect, 3dcgi, entity279 and 3 others like this.
  7. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    Funny thing about the GeForce FX: it was good at running Doom 3 but nothing else, able to get 60 fps in that one. But the pixel shading was done at 32-bit in the end. It worked if the shaders were very tight - "falling off a cliff" happened very quickly.
     
  8. A1xLLcqAgt0qc2RyMz0y

    Regular

    Joined:
    Feb 6, 2010
    Messages:
    974
    Likes Received:
    265
    nnunn likes this.
  9. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,699
    Likes Received:
    117
    I suppose it is possible, but it would seem much more likely to be Tegra related IMHO.
     
    homerdog likes this.
  10. nnunn

    Newcomer

    Joined:
    Nov 27, 2014
    Messages:
    28
    Likes Received:
    23
    Thinking about recent discussions between these parties brings to mind Intel simply buying DEC's hardware division rather than arguing about Alpha floating point pipes found in the Pentium Pro. Not suggesting that Samsung would buy Nvidia, but maybe someone offered someone a good deal on 14nm, and everyone shook hands?
     
  11. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    647
    Likes Received:
    92
    Time to revive this thread. My info says big Pascal has taped out, and is on TSMC 16nm (unknown if this is FF or FF+, though I suspect it is FF+). Target release date is Q1'16. This is a change from Kepler and Maxwell, where the smallest chip (GK107 and GM107 respectively) taped out first. Maybe the experience with 20nm was enough for NV to go back to their usual big-die-first strategy. Given the huge gains in performance compared to 28nm, and the fact that the 16nm process is both immature and quite expensive, I suspect the die size may be a bit smaller than what we've seen with GK110/GM200.

    Still wondering what they're doing on Samsung 14nm though, maybe just test chips for now?
     
    iMacmatician and nnunn like this.
  12. Love_In_Rio

    Veteran

    Joined:
    Apr 21, 2004
    Messages:
    1,444
    Likes Received:
    108
    I hope you are right. A new PC with a GP100+Skylake in a mini ATX case for Q1'16 sounds really good and quite future proof.

    Nothing about R490 also taping out? I hope AMD is not late this time. If Fury were Arctic Islands based, they would already be halfway done.
     
    #152 Love_In_Rio, Jun 4, 2015
    Last edited: Jun 4, 2015
  13. McHuj

    Veteran Regular Subscriber

    Joined:
    Jul 1, 2005
    Messages:
    1,410
    Likes Received:
    529
    Location:
    Texas
    I assume all Pascal GPUs will be HBM based, so maybe that's not mature enough to be economical for anything other than the high tier; hence, big Pascal coming first.
     
    Lightman likes this.
  14. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,108
    Likes Received:
    1,802
    Location:
    Finland
    Tegras?
     
  15. Alatar

    Newcomer

    Joined:
    Aug 26, 2014
    Messages:
    26
    Likes Received:
    18
    Big die first might have something to do with NV having to drop meaningful FP64 capability from GM200?

    Also, isn't HBM2 (what Pascal should use) still going to be pretty much unavailable in Q1 2016? I can't remember where I read that it was going to be available in mid 2016, but anyway Q1 sounds incredibly early.
     
    silent_guy likes this.
  16. Love_In_Rio

    Veteran

    Joined:
    Apr 21, 2004
    Messages:
    1,444
    Likes Received:
    108
    Last I read, HBM2 was ready for later this year. What you say makes sense about getting the DP monster processor back after ditching it in GM200.
    Or they could be thinking that, with AMD's efficiency improvements in Arctic Islands (hopefully), a 7970-sized chip on 14/16nm could be faster than a GP104-sized one (if the rumors about Maxwell and Pascal being quite similar are true).
     
    #156 Love_In_Rio, Jun 4, 2015
    Last edited: Jun 4, 2015
  17. spworley

    Newcomer

    Joined:
    Apr 19, 2013
    Messages:
    146
    Likes Received:
    190
    Will HBM be economical for low tier parts in 2016 or even 2017? If manufacturing a low-margin, high quantity HBM GP107 isn't profitable because of the HBM expense, maybe GM206 will have a longer life to fill in the gap. Unless the low end Pascal parts stick with GDDR5.
     
  18. McHuj

    Veteran Regular Subscriber

    Joined:
    Jul 1, 2005
    Messages:
    1,410
    Likes Received:
    529
    Location:
    Texas
    No idea, but I imagine long term HBM will replace all external memory. (Wouldn't surprise me if Intel started stacking memory next to their CPUs.)

    Let's say Pascal does offer 2X the performance of Maxwell: you'll likely need to double the bandwidth, as well as the capacity, even for lower tier parts. There's going to be a point in time when a single 4GB HBM stack that offers 256 GB/sec will be cheaper than the 4-8 GDDR5 chips needed for the same performance. In terms of final assembly, performance, and power savings, HBM seems to have everything going for it (except the interposer stacking costs, which should come down).
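
    The chip-count comparison above can be checked with peak-bandwidth arithmetic. A minimal sketch, assuming a 1024-bit HBM stack at 2 Gbps per pin (which gives the 256 GB/sec quoted above) and 32-bit GDDR5 chips at 7 or 8 Gbps per pin. Those interface widths and data rates are assumptions for illustration, not figures from this thread.

```python
import math

def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8

def gddr5_chips_needed(target_gb_s, chip_width_bits=32, data_rate_gbps=7.0):
    """Whole GDDR5 chips needed to reach a target peak bandwidth."""
    per_chip = peak_bandwidth_gb_s(chip_width_bits, data_rate_gbps)
    return math.ceil(target_gb_s / per_chip)

# Assumed HBM stack: 1024-bit interface at 2 Gbps per pin.
hbm_stack = peak_bandwidth_gb_s(1024, 2.0)

chips_at_7gbps = gddr5_chips_needed(hbm_stack, data_rate_gbps=7.0)
chips_at_8gbps = gddr5_chips_needed(hbm_stack, data_rate_gbps=8.0)

print(f"HBM stack: {hbm_stack:.0f} GB/s")
print(f"GDDR5 chips needed at 7 Gbps: {chips_at_7gbps}")
print(f"GDDR5 chips needed at 8 Gbps: {chips_at_8gbps}")
```

    Under these assumptions, matching one stack takes eight GDDR5 chips at 8 Gbps and ten at 7 Gbps, so the upper end of the 4-8 chip range only holds for the faster parts.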
     
  19. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    I think the economical part is secondary to the question: does it make sense at all? Do we have conclusive evidence that Titan X performance is significantly hampered by memory BW?

    If it's only marginal, then it will probably make sense to use HBM for the Big One on 16/14nm. But then the Titan X performance will move down to the 104 product, where it will be just as marginal as it is for the Titan X, so probably not worth doing.
    And then when 10nm comes along, things will move down one more step, so the 100 and 104 part will be HBM worthy, but the rest still won't.

    And price reasons will push against that trend.

    It does mean that the smaller SKUs will inevitably trend towards larger GDDR5 busses...
     
  20. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    766
    Likes Received:
    200
    Erinyes, does the big Pascal you mentioned actually have fast DP? (After GM200 I would just like to be sure.)
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.