Deleted member 2197
Guest
Pascal Secrets: What Makes Nvidia GeForce GTX 1080 so Fast?
http://vrworld.com/2016/05/10/pascal-secrets-nvidia-geforce-gtx-1080/
In our initial talks with Nvidia and their partners, we learned that the GeForce GTX 1080 is coming to market in several shapes:
- GeForce GTX 1080 8GB
- GeForce GTX 1080 Founders Edition
- GeForce GTX 1080 Air Overclocked Edition
- GeForce GTX 1080 Liquid Cooled Edition
The stock GTX 1080 is clocked at 1.66 GHz, with GPU Boost lifting it to 1.73 GHz. The Founders Edition includes an overclocking-friendly BIOS to raise the clocks to at least 2 GHz, and the presentation showed the chip running at 2.1 GHz. The main limiting factor for overclocking beyond 2.2 GHz is the 225 W that the board can officially draw from the power circuitry: 75 W from the motherboard slot and 150 W through the 8-pin PEG connector. However, some power supply manufacturers provide more juice per rail, and we’ve seen a single 8-pin connector delivering 225 W on its own. Still, partners such as ASUS, Colorful, EVGA, Galax, Gigabyte and MSI are preparing custom boards with two or three 8-pin connectors. According to our sources, reaching 2.5 GHz using a liquid cooling setup such as a Corsair H115i or EK Waterblocks should not be too much of a hassle.
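As a rough illustration of the power ceiling described above, the sketch below adds up the official limits quoted in the article (75 W from the slot, 150 W per 8-pin connector). The function name `board_power_budget` is hypothetical, and the calculation assumes every connector is rated at the official 150 W rather than the higher figures some power supplies reportedly deliver.

```python
# Official per-source limits as quoted in the article.
SLOT_WATTS = 75          # drawn through the motherboard slot
EIGHT_PIN_WATTS = 150    # official rating of one 8-pin PEG connector


def board_power_budget(eight_pin_connectors: int) -> int:
    """Return the official power ceiling in watts for a board
    with the given number of 8-pin connectors (hypothetical helper)."""
    if eight_pin_connectors < 0:
        raise ValueError("connector count cannot be negative")
    return SLOT_WATTS + eight_pin_connectors * EIGHT_PIN_WATTS


if __name__ == "__main__":
    # One connector is the reference layout; two or three are the
    # custom partner boards mentioned in the article.
    for count in (1, 2, 3):
        print(f"{count} x 8-pin: {board_power_budget(count)} W")
```

Under these assumptions a custom board with two or three 8-pin connectors has a much higher official ceiling than the 225 W reference design, which would explain why partners are adding them.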
...
The search for performance led the company to remove as many legacy options as possible, and you can no longer connect the GTX 1080 to an analog display. D-SUB15 is now firmly in the past, and you cannot make the connection work even if you use a third-party adapter. The remaining connectors are a 144Hz-capable DVI, three DisplayPort 1.4 outputs and a single HDMI 2.0b connector.
In the search for absolute performance per transistor, Nvidia revised how its Streaming Multiprocessor (SM) works. When we compare GM200 with GP100 clock for clock, Pascal (slightly) lags behind Maxwell. This change to a more granular architecture was made in order to deliver higher clocks and more performance. Splitting the single Maxwell SM into two and doubling the amount of shared memory, warps and registers enabled the FP32 and FP64 cores to operate with previously unseen efficiency. For GP104, Nvidia disabled/removed the FP64 units, reducing the double-precision compute performance to a meaningless number, just like its predecessors.
What is there is single-precision (FP32) performance, which stands at 9 TFLOPS. While the GP100 chip needs to boost to 1.48 GHz in order to deliver 10.6 TFLOPS, GP104 clocks up to 1.73 GHz, and that’s not the end. If you clock the GTX 1080 to 2.1 GHz, which is achievable on air, you will speed past the GP100. We can already see developers and scientists who need single-precision performance placing orders for air and liquid cooled GTX 1080s.
- GP100: 15.3 billion transistors, 3840 cores, 60 SM, 4096-bit memory, 1328 MHz GPU clock
- GP104: 7.2 billion transistors, 2560 cores, 40 SM, 256-bit memory, 1660 MHz GPU clock
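The FP32 figures above can be sanity-checked with the usual peak-throughput estimate: cores × 2 FLOPs per clock (one fused multiply-add) × clock speed. The sketch below is a back-of-the-envelope check, not Nvidia's own methodology; the helper names are hypothetical, the GP104 core count and clocks come from the article, and the 10.6 TFLOPS GP100 figure is used as quoted rather than recomputed.

```python
FLOPS_PER_CORE_PER_CLOCK = 2  # one fused multiply-add counts as two FLOPs


def peak_fp32_tflops(cuda_cores: int, clock_ghz: float) -> float:
    """Theoretical peak single-precision throughput in TFLOPS."""
    return cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_ghz / 1000.0


def clock_needed_ghz(cuda_cores: int, target_tflops: float) -> float:
    """Clock in GHz at which the given core count reaches target_tflops."""
    return target_tflops * 1000.0 / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK)


GP104_CORES = 2560          # from the article
GP100_QUOTED_TFLOPS = 10.6  # from the article, at a 1.48 GHz boost

if __name__ == "__main__":
    print(f"GP104 @ 1.73 GHz: {peak_fp32_tflops(GP104_CORES, 1.73):.2f} TFLOPS")
    print(f"GP104 @ 2.10 GHz: {peak_fp32_tflops(GP104_CORES, 2.10):.2f} TFLOPS")
    breakeven = clock_needed_ghz(GP104_CORES, GP100_QUOTED_TFLOPS)
    print(f"GP104 clock to match 10.6 TFLOPS: {breakeven:.2f} GHz")
```

By this estimate the boost-clock figure rounds to the article's 9 TFLOPS, and GP104 would need to run slightly above 2.07 GHz to pass the quoted GP100 number, which is consistent with the 2.1 GHz claim.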
For DirectX 12 and VR, the term Asynchronous Compute was thrown around, especially since AMD Radeon-based cards were beating Nvidia GeForce cards in DirectX 12 titles such as Ashes of the Singularity and Rise of the Tomb Raider. We were told that the Pascal architecture doesn’t have Asynchronous Compute, but that some aspects of this feature qualified the card for the Direct3D ‘12_1’ feature level.
...
However, DX12 titles face another battle altogether, and that is delivering a great gaming experience. This is where titles such as Gears of War Ultimate Edition and Quantum Break failed entirely, as Microsoft ‘screwed the pooch’ with disastrous conversions and limitations set forth by the Windows Store. Tim Sweeney even wrote an in-depth column in The Guardian stating what’s wrong with Microsoft. These days, game developers work hand in hand with both AMD and Nvidia in order to extract as much performance out of DirectX 12 as possible, which is needed for challenging VR environments.
...
A year and a half ago, after seeing that HBM1 is limited in capacity and that HBM2 memory won’t be available in real volume before 2017, Nvidia started to work with Micron’s team in Germany on building the ultimate performance GDDR5.
Manufactured on a 20nm process, GDDR5X memory proved to be overclocking-friendly even with the initial silicon. As the roadmap shows, the target was to hit 10 Gbps, i.e. 2.5 GHz QDR. Given that the memory actually transfers data four times per clock cycle, it should be called Quad Data Rate, but the name GDDR SGRAM (Graphics Double Data Rate Synchronous Graphics Random Access Memory) was kept for continuity.
The GeForce GTX 1080 has its memory clocked at 2.5 GHz, but we do expect some samples to clock at 2.75-3.5 GHz (11-14 Gbps). That would raise the available bandwidth from 320 GB/s to 352-448 GB/s, and we do expect to see extreme overclockers pushing the memory even further. If Micron adopts a 10nm process for GDDR5X, we’ll get to a 4 GHz clock / 16 Gbps sooner rather than later.
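The bandwidth figures above follow from the bus width and the per-pin data rate: bandwidth in GB/s = bus width in bits × Gbps per pin ÷ 8. The sketch below reproduces them, assuming the 256-bit GP104 bus listed earlier and the article's convention that the quoted memory clock is multiplied by four (QDR); the function names are hypothetical.

```python
TRANSFERS_PER_CLOCK = 4  # quad data rate, per the article's clock convention


def data_rate_gbps(memory_clock_ghz: float) -> float:
    """Per-pin data rate in Gbps for a given memory clock in GHz."""
    return memory_clock_ghz * TRANSFERS_PER_CLOCK


def memory_bandwidth_gbs(bus_width_bits: int, rate_gbps: float) -> float:
    """Total memory bandwidth in GB/s (8 bits per byte)."""
    return bus_width_bits * rate_gbps / 8


GP104_BUS_WIDTH_BITS = 256  # from the article's GP104 spec line

if __name__ == "__main__":
    # Stock clock, the expected overclocking range, and the 4 GHz target.
    for clock_ghz in (2.5, 2.75, 3.5, 4.0):
        rate = data_rate_gbps(clock_ghz)
        bandwidth = memory_bandwidth_gbs(GP104_BUS_WIDTH_BITS, rate)
        print(f"{clock_ghz} GHz -> {rate:g} Gbps -> {bandwidth:g} GB/s")
```

On the same 256-bit bus, the 16 Gbps target mentioned above would work out to 512 GB/s.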