Nvidia Pascal Announcement

Discussion in 'Architecture and Products' started by huebie, Apr 5, 2016.

  1. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    Well they didn't exactly say that they have chips in production at Samsung. This was the SEC filing from March 2015 - "We do not manufacture the silicon wafers used for our GPUs and Tegra processors and do not own or operate a wafer fabrication facility. Instead, we are dependent on industry-leading foundries, such as Taiwan Semiconductor Manufacturing Company Limited and Samsung Electronics Co. Ltd., to manufacture our semiconductor wafers using their fabrication equipment and techniques."

    AFAIK they didn't have any chips in production at Samsung at that time...

    Also..while of course they would play off each foundry against the other..they are heavily competing with Apple and Qualcomm for wafer capacity.
    Yep..pretty much agree with you on all this..the only slight benefit AMD may have is lower overhead.
     
  2. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    I am seriously saying what I said. I don't know where you're from, but here, prices for end users mandatorily include VAT/sales tax and the like. Only shipping fees can come on top. Maybe that makes a difference to what you're used to?

    Here's one of the FEs (typical), starting from May 20th at 789 EUR and recently down to 720-730 EUR.
    http://geizhals.de/?phist=1441847

    And here the first (IIRC) AIB card, 659 EUR since its first listing on May 30th:
    http://geizhals.de/?phist=1449278
    (not in stock currently, but I know for a fact that there were some at different shops in the past.)

    I wouldn't like it either if someone forced me at gunpoint to go and buy these things. Lucky me, that's not the case.

    Same story as with the 1080/1070 here: readily available in GER, after applicable taxes etc., for 260-270 EUR since launch.
    Heck, a few select shops even had limited stock of the 4-GiByte models (of one of which I am the proud owner now) for 219 EUR. Granted, maybe it is even true that AMD seeded these in order to fulfill its promise of a 199 US-$ starting price (+taxes).
     
    #1622 CarstenS, Jul 7, 2016
    Last edited: Jul 7, 2016
    Lightman, Florin and pharma like this.
  3. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,244
    Likes Received:
    4,465
    Location:
    Finland
    Finland here, VAT (24%) always included in prices, no extra fees on top of that.
    Cheapest Founders Editions were 799€ on launch http://hintaseuranta.fi/tuote/msi-g...edition-8-gb-pci-e-naytonohjain/4685625#trend
    Cheapest AIBs (like Asus Strix linked here) same 799€ on their respective launch http://hintaseuranta.fi/tuote/asus-geforce-gtx-1080-gaming-8-gb-pci-e-naytonohjain/4685614#trend

    Some AIB versions that came a week or more later than the first AIBs started under that, but the first round of AIBs were all there. Currently the cheapest Founders is 769€ and the cheapest AIB of all 729€.
     
    CarstenS likes this.
  4. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Maybe that entry is sufficient to scare TSMC! :wink:
     
  5. Ryan Smith

    Regular

    Joined:
    Mar 26, 2010
    Messages:
    629
    Likes Received:
    1,131
    Location:
    PCIe x16_1
    It's all for compatibility purposes. Implement a hardware FP16x2 unit on GP104 exactly as on GP100, so that GP100 CUDA programs can be written and debugged on GP104. But don't make it fast enough that you'd actually want to deploy your application on a GeForce instead of a Tesla.

    A software solution would be faster (relatively speaking) and could behave slightly differently from GP100, two things that NVIDIA does not want. It is very, very well executed market segmentation. And NVIDIA sees that as more beneficial than enabling fast FP16 performance on the desktop for games.
     
    Heinrich04, pharma and spworley like this.
  6. Grall

    Grall Invisible Member
    Legend

    Joined:
    Apr 14, 2002
    Messages:
    10,801
    Likes Received:
    2,176
    Location:
    La-la land
    I like how you use measurements down to the sub-atomic level. I'm sure that will reflect positively in your final estimate! :)
     
    Kej, Alessio1989, Heinrich04 and 4 others like this.
  7. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
    Wafers are expensive and business is business—every quark counts.
     
  8. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    So, GTX 1060 pricing has been announced for Europe including taxes etc.
    Germany: 279 EUR for AIB (AIB: 234 EUR excl. taxes), 319 EUR for SLFE which is only available through the nv-online-shop in UK, GER & FRA.
     
    pharma likes this.
  9. Andrew Lauritzen

    Andrew Lauritzen Moderator
    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,629
    Likes Received:
    1,227
    Location:
    British Columbia, Canada
    Yeah, I get that, it's just that even if you wanted bit-accurate results you could almost certainly still do it faster in software than at 1/64 rate. I guess it's maybe a tradeoff where adding one unit per SM or whatever is cheap enough to avoid the software hassle in the first place, though.

    In any case it still sort of sucks that you get segmentation in an area that would actually benefit lower power stuff more. Doubles aren't a big deal because only HPC folks who don't know how to code really need them (I tease as someone who used to do that stuff so I've earned the right :)).
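    For what it's worth, the bit-accuracy half of this is genuinely easy in software (a sketch of mine, not from the post): a half-precision multiply emulated by computing on FP32 hardware and rounding once back to FP16 matches a correctly rounded native FP16 multiply, because the exact product of two float16 values always fits in a float32. Using only Python's standard-library `struct` codes for binary16/binary32:

```python
import random
import struct

def f16(x: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary16 value."""
    return struct.unpack('e', struct.pack('e', x))[0]

def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    return struct.unpack('f', struct.pack('f', x))[0]

random.seed(1)
pairs = [(f16(random.uniform(-8, 8)), f16(random.uniform(-8, 8)))
         for _ in range(100_000)]

# Emulated path: multiply in FP32, round once to FP16.
# Reference path: exact product, then a single rounding to FP16 --
# what a correctly rounded native FP16 unit would produce.
ok = all(f16(f32(a * b)) == f16(a * b) for a, b in pairs)
print(ok)  # True: no double-rounding error is possible here
```

    (Two float16 significands are at most 11 bits each, so the exact product needs at most 22 bits and is representable exactly in float32; the only rounding happens at the final conversion. Fused multiply-adds are where emulation gets more delicate.)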
     
    Heinrich04 and Lightman like this.
  10. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    The early posts on the FP16x2 functionality indicated that in GP104 the instructions were emitted with barrier flags, as if they were being handled similarly to SFU instructions or other shared hardware.
    That's not going to change the computational results, but if true it's a difference. Aside from the area cost and segmentation, there might be a benefit to putting a generally non-standard instruction type on a port that already handles more varied behaviors than the SIMD issue ports do.
    Given that parts of GP104's configuration retain some similarity to Maxwell, perhaps this saves some implementation effort by providing ISA support while leaving the more important SIMD execution loop of the SM undisturbed?
     
  11. ehart

    Newcomer

    Joined:
    Sep 20, 2003
    Messages:
    68
    Likes Received:
    6
    renderstate likes this.
  12. gongo

    Regular

    Joined:
    Jan 26, 2008
    Messages:
    605
    Likes Received:
    25
    I am interested in upgrading from a 980Ti to a 1080Ti. But I get conflicting views as to what a 1080Ti actually is.

    GP100 vs GP102... is the number difference just to denote an HBM vs non-HBM memory controller?
    Has Nvidia ever produced and maintained two separate lines of 'Big' GPU?

    Because GP100 has low boost clocks, around the same as GM200, and we've seen that Pascal relies heavily on a 2GHz boost. If the 1080Ti is to be ~40% faster than the 1080, then it needs to have GP100's core count and the 1080's boost clocks; if not, the additional cores at 1.4GHz get held back.

    Does this mean that HBM is limiting high boost clocks? Fury did suffer from that.
    A GP102 with a GDDR5X controller could suddenly boost the cores to 2GHz to give us that premium high-end GPU.
     
  13. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    832
    Likes Received:
    505
    Some sites claim there will be a Titan-P based on GP100 with 16 GB HBM2.
    A 1080Ti would be based on GP102 with 12 GB GDDR5X.
     
    Heinrich04 likes this.
  14. HKS

    HKS
    Newcomer

    Joined:
    Apr 26, 2007
    Messages:
    32
    Likes Received:
    17
    Location:
    Norway
    Yes...
    For "Big Kepler" they had two GPUs:
    Titan, 780 Ti, Quadro K6000 and Tesla K20/K40 use different revisions of GK110.
    Tesla K80 uses GK210, which doubles the register file and shared memory.
     
  15. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    They're basically trying to piss off RecessionCone.
     
    Kej, Razor1, spworley and 2 others like this.
  16. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    So, anyway, GTX 1060 looks like the card that RX 480 should have been, as far as day one reviews will play out.

    I do wonder whether just over 4 TFLOPS is a cut too far. I expect most review sites will be picking their games carefully to hide the compute shortfall.

    We shall see.
     
  17. RecessionCone

    Regular Subscriber

    Joined:
    Feb 27, 2010
    Messages:
    505
    Likes Received:
    189
    And succeeding at that...
     
    Kej, Razor1, Ryan Smith and 4 others like this.
  18. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Google's TPU accelerator works with 8-bit integer operations, not FP16, and is only usable for inference, not training. But it's probably also much higher volume.

    Those who can't afford custom silicon can use P100s for training, with double-rate FP16, and order(s) of magnitude more GP104s with quad-rate 8-bit.

    I can't speak for RecessionCone, but that doesn't seem to be such a bad trade-off. The presence of quad-rate 8-bit on cheap silicon may be a bigger plus than the lack of double-rate FP16 is a minus.
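    A toy sketch of the inference side of that trade-off (my illustration, with random data standing in for a trained layer; nothing here is TPU- or GP104-specific): quantize weights and activations to signed 8-bit integers, do the dot product entirely in integer multiply-accumulates, and rescale once at the end. The FP32 answer comes back to within a small error:

```python
import random

random.seed(0)
n = 256
w = [random.gauss(0, 1) for _ in range(n)]  # stand-in for trained weights
x = [random.gauss(0, 1) for _ in range(n)]  # stand-in for activations

def quantize(values):
    """Symmetric linear quantization onto [-127, 127]."""
    scale = max(abs(v) for v in values) / 127
    return [round(v / scale) for v in values], scale

wq, w_scale = quantize(w)
xq, x_scale = quantize(x)

# All the heavy lifting is integer MACs (int32-style accumulation);
# one floating-point rescale at the end recovers the magnitude.
int_acc = sum(a * b for a, b in zip(wq, xq))
approx = int_acc * w_scale * x_scale
exact = sum(a * b for a, b in zip(w, x))

print(abs(approx - exact))  # small absolute error vs. full precision
```

    Real deployments pick the scales per layer from calibration data rather than per tensor at runtime, but the arithmetic shape is the same.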

    Seriously???
     
  19. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
    Stupid question: why does training require more precision?
     
  20. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    My very amateur explanation: training is all about back-propagation, in which an error at the output ripples back towards the input and is used to make minute changes to the coefficients. You're basically doing a gradient descent.

    Coefficients start out with completely random values and are adjusted over millions of iterations.

    8 bits are too coarse to do this. But once the parameters are set, the accuracy actually doesn't matter a whole lot.
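    A tiny illustration of why coarseness bites (my sketch, assuming IEEE half precision; the learning rate and gradient are made-up numbers of typical scale): a small weight update applied in float16 is rounded away to nothing, while the same update in higher precision accumulates as intended:

```python
import struct

def f16(x: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary16 value."""
    return struct.unpack('e', struct.pack('e', x))[0]

step = 1e-4 * 0.5          # learning rate * gradient = 5e-5 per iteration

w_hi = 1.0                 # higher-precision accumulator (Python float)
w_lo = f16(1.0)            # float16 accumulator
for _ in range(1000):
    w_hi -= step
    # float16 spacing just below 1.0 is 2**-11 ~= 4.9e-4, so
    # 1.0 - 5e-5 rounds straight back to 1.0: the update vanishes.
    w_lo = f16(w_lo - step)

print(w_hi)  # ~0.95: a thousand small updates accumulated
print(w_lo)  # 1.0: every single update was rounded away
```

    This is also why later mixed-precision training schemes keep a higher-precision master copy of the weights even when the multiplies run in FP16.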
     
    Kej, pharma, Razor1 and 4 others like this.

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.