Nvidia Pascal Announcement

Discussion in 'Architecture and Products' started by huebie, Apr 5, 2016.

  1. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    NVIDIA partners halt GeForce GTX 970, 980, 980Ti production
    Yep, a mid-range GPU is displacing a high-end one from the previous generation.
     
    Razor1 and pharma like this.
  2. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Not surprising, given almost twice the transistor budget per mm². :) Remember GK104 <- GF110? There also was a process change involved.
     
    fellix likes this.
  3. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Indeed, this is what I thought earlier: that Nvidia would continue the tradition from the GTX680 launch. Well, I personally wouldn't mind a 195W GTX 980 Ti replacement.
     
  4. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Let's hope AMD can pull off the same stunt. Then we're in for a pretty interesting round of products.
     
    nnunn and Grall like this.
  5. DuckThor Evil

    Legend

    Joined:
    Jul 9, 2004
    Messages:
    5,995
    Likes Received:
    1,062
    Location:
    Finland
    It is still a valid point by Adored that GP104, rumoured to be a roughly 300mm2 chip (similar to GK104), will now have to beat a very gaming-optimized 600mm2 GM200. GF110 was 520mm2 and carried some FP64 "baggage".
    So, at least to me, it would be surprising if GP104 were able to beat GM200 by the same margin as Kepler beat Fermi. I also hope this victory doesn't come from clocking the chip much closer to its limits than Maxwell, only to have it struggle against the 3rd party Maxwell models, which have 20% more stable performance than the stock-clocked model.
     
  6. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Just for the record, the GTX680's lead over the former GTX580 at launch was ~25% at resolutions of 1080p and above. Other than that, assuming a transistor density of 25 million/mm2 at 300mm2, you get exactly 7.5b transistors. Given that there have been architectural changes for Pascal (relatively minor ones) and that they've most likely also increased frequency (which, for the record, is also an architectural change and not a garden-variety overclock), there's little to no indication yet that matching GM200 performance, or even exceeding it by a small margin, is impossible.

    Shall we dig through the database here, or in any other forum, to see how MANY called BS when it was claimed that GK104 was somewhat faster than GF110?
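The density arithmetic in this post can be checked with a few lines of Python. This is only a sketch: the 25 million/mm2 density and 300mm2 area are the speculative figures quoted above, and the GM200 reference values (about 8.0b transistors on roughly 601mm2) are an assumption taken from publicly quoted specs, not from this thread.

```python
# Back-of-the-envelope transistor budget from density and die area.
# All GP104 inputs are thread speculation, not confirmed specifications.

def transistor_count_billions(density_millions_per_mm2: float, die_area_mm2: float) -> float:
    """Estimated transistor count in billions."""
    return density_millions_per_mm2 * die_area_mm2 / 1000.0

def density_millions_per_mm2(transistors_billions: float, die_area_mm2: float) -> float:
    """Transistor density in millions per mm2."""
    return transistors_billions * 1000.0 / die_area_mm2

# Speculated GP104: 25 million transistors/mm2 on a 300mm2 die.
gp104_estimate = transistor_count_billions(25.0, 300.0)

# Assumed GM200 reference point (not from the thread): ~8.0b transistors, ~601mm2.
gm200_density = density_millions_per_mm2(8.0, 601.0)
density_ratio = 25.0 / gm200_density

print(f"GP104 estimate: {gp104_estimate:.1f}b transistors")
print(f"GM200 density:  {gm200_density:.1f} million/mm2")
print(f"Density ratio:  {density_ratio:.2f}x")
```

Under those assumptions the density ratio comes out a little under 2x, which is consistent with the "almost twice the transistor budget per mm²" remark earlier in the thread.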
     
  7. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    832
    Likes Received:
    505
    4 GPCs x 10 SMs, i.e. 40 x 64 = 2560 FP32 cores, 160 TMUs, 64 ROPs and a 1.5GHz+ clock would be enough to outrun GM200.
     
    nnunn likes this.
  8. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Depends where they'll draw the line for maximum power levels. But for the sake of speculative math with 20 clusters clocked at 1.4GHz you're already at almost 7.2 TFLOPs or nearly 20% above a 980Ti. The biggest question mark and most interesting IMHO is still how much more a Pascal FLOP "counts" than a Maxwell FLOP in a relative sense.
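A minimal sketch of the speculative math in this post, assuming "20 clusters" means Maxwell-style clusters of 128 FP32 lanes (equivalent to the 40 x 64 configuration quoted above) and counting one FMA as two FLOPs. The 980 Ti reference values (2816 cores, 1075MHz boost) are an assumption taken from the card's published specs, not from this thread.

```python
# Peak FP32 throughput for the speculated GP104 versus a reference GTX 980 Ti.

def fp32_tflops(lanes: int, clock_ghz: float) -> float:
    """Peak FP32 TFLOPs, counting one FMA per lane per clock as two operations."""
    return lanes * 2 * clock_ghz / 1000.0

# 20 clusters of 128 lanes and 40 SMs of 64 lanes describe the same lane count.
lanes_20x128 = 20 * 128
lanes_40x64 = 40 * 64

gp104_tflops = fp32_tflops(lanes_20x128, 1.4)   # speculated clock from the post
gtx980ti_tflops = fp32_tflops(2816, 1.075)      # assumed reference boost clock
advantage = gp104_tflops / gtx980ti_tflops

print(f"Speculated GP104: {gp104_tflops:.2f} TFLOPs")
print(f"GTX 980 Ti:       {gtx980ti_tflops:.2f} TFLOPs")
print(f"Advantage:        {(advantage - 1) * 100:.0f}%")
```

This only compares theoretical peaks. It says nothing about how much a Pascal FLOP "counts" relative to a Maxwell FLOP, which is the open question raised in the post.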
     
  9. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Speaking of clocks: I just wanted to reiterate that AMD said one of the nicer things about 14/16nm FinFET is that it has much less variance than 28nm. So higher clocks should be entirely possible, but OC-ability will be somewhat diminished, since much of the potential is already used up at the factory level.
     
  10. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    832
    Likes Received:
    505
    You must be counting FP16 FLOPS, as P100 has 60 clusters (unless you mean something other than SMs).
     
  11. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    I think Ailuros meant 40, not 20 clusters as per context of the things he quoted.
     
  12. Pixel

    Veteran

    Joined:
    Sep 16, 2013
    Messages:
    1,008
    Likes Received:
    477
    So price per mm2 is about the same, so we'll get about the same # of transistors, which means no big leap in performance at any price point. Probably a 15-30% jump in performance at any particular price point. Nvidia is milking their consumer base and stretching out their roadmap as much as possible.
     
  13. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    I've compiled some plausible numbers here:
    Code:
                        GM200-400       GP104-400
    ----------------------------------------------
    Turbo Clock (MHz)      1075            1600
    FP32 FMA Ops           6144            5120
    Total GPR Size (MB)     6.1            10.2
    Total LDS Size (MB)     2.2             2.5
    L2 Size (MB)              3               2
    FP32 TFLOPs             6.6             8.2
    FP16 TFLOPs             6.6            16.4
    GTexels/s               206             256
    MTris/s                6450            6400
    GPixels/s               103             102
    Total LDS BW (TB/s)     3.3             8.2
    L2 BW (TB/s)           1.65            1.63
    
    Pascal Jr. looks to be a compute champ and just fine for graphics.
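Several rows of the table above appear to follow directly from unit counts multiplied by clock. The sketch below reproduces them under stated assumptions: the GP104 unit counts (2560 lanes, 160 TMUs, 64 ROPs, 4 GPCs) are the speculation from earlier in the thread, the GM200 counts (3072 lanes, 192 TMUs, 96 ROPs, 6 GPCs) are taken from published specs, the double-rate FP16 for GP104 is the table's own assumption, and one triangle per GPC per clock is inferred from the MTris/s row. The `GpuConfig` class is hypothetical and exists only for illustration.

```python
from dataclasses import dataclass

@dataclass
class GpuConfig:
    name: str
    clock_mhz: float
    fp32_lanes: int
    tmus: int
    rops: int
    gpcs: int
    fp16_rate: float  # FP16 throughput relative to FP32 (assumed)

    def fp32_tflops(self) -> float:
        # One FMA per lane per clock, counted as two operations.
        return 2 * self.fp32_lanes * self.clock_mhz / 1e6

    def fp16_tflops(self) -> float:
        return self.fp32_tflops() * self.fp16_rate

    def gtexels_per_s(self) -> float:
        # One texel per TMU per clock.
        return self.tmus * self.clock_mhz / 1e3

    def mtris_per_s(self) -> float:
        # Assumes one triangle per GPC per clock.
        return self.gpcs * self.clock_mhz

    def gpixels_per_s(self) -> float:
        # One pixel per ROP per clock.
        return self.rops * self.clock_mhz / 1e3

gm200 = GpuConfig("GM200-400", 1075, 3072, 192, 96, 6, fp16_rate=1.0)
gp104 = GpuConfig("GP104-400", 1600, 2560, 160, 64, 4, fp16_rate=2.0)  # speculative

for gpu in (gm200, gp104):
    print(f"{gpu.name}: {gpu.fp32_tflops():.1f} FP32 TFLOPs, "
          f"{gpu.fp16_tflops():.1f} FP16 TFLOPs, "
          f"{gpu.gtexels_per_s():.0f} GTexels/s, "
          f"{gpu.mtris_per_s():.0f} MTris/s, "
          f"{gpu.gpixels_per_s():.0f} GPixels/s")
```

The register file, LDS and L2 rows depend on per-SM sizes and bandwidths that are not stated in the post, so they are left out here.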
     
    nnunn and pjbliverpool like this.
  14. Ext3h

    Regular

    Joined:
    Sep 4, 2015
    Messages:
    428
    Likes Received:
    497
    I'm pretty sure that part is plain wrong, both the "on-die" claim (there is still an interposer) and the fixed 16GB limit. If I remember right, one of the presentations even said that there is currently a spacer on top of the 4GB stacks (rather than fitting the heat spreader directly!), so the upcoming 8GB HBM2 stacks will be physically compatible. Which means there is a 32GB variant planned, once the larger HBM2 stacks are shipping.
     
  15. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Why not wait a bit for actual prices before starting the round of complaints (about a product that you probably aren't going to buy anyway?)

    I remember outrageous price predictions being thrown around for GTX970 and we know how that turned out.

    If a 1070 is on par with or faster than a 980 Ti for, say, $450, then we have a pretty nice price reduction for the same performance.
     
    Razor1 and A1xLLcqAgt0qc2RyMz0y like this.
  16. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    You do. HBM gen2 will physically be larger than HBM gen1, but in itself, the stacks are identically sized, ranging from 3mKGSD to 9mKGSD.
     
  17. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Since my search for the string "partitioned register file" yielded no results and the application dates could fit the Pascal generation:
    https://www.google.de/patents/US20150143061
    Seems in line with larger # of registers per ALU block.
     
  18. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    I still have to get used to the fact that it's now 64SPs/SM and not 128 mind you :p
     
  19. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    There is no variant of HBM1 with 9 dies.
     
  20. nnunn

    Newcomer

    Joined:
    Nov 27, 2014
    Messages:
    40
    Likes Received:
    31
    "dark silicon" as part of cooling solution? Cool idea!
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.