Nvidia Pascal Announcement

With the 290X, AMD promised a clock of 1GHz on the box. With the 1080, Nvidia promises a boost clock of 1733MHz.
Are there cases where this 1733MHz isn't reached?

Edit: never mind, I see in the later graph that it sometimes goes below 1733MHz. But it never goes below the base clock. ;) Wasn't it agreed that, for the 290X, AMD's mistake was not listing a base clock, which confused people into thinking that 1GHz was a guarantee? You can avoid that with a base/boost combo.
http://www.tomshardware.com/reviews/nvidia-geforce-gtx-1080-pascal,4572-11.html

Lowest clock for stock configuration is ~1460MHz.
 
Not sure about HardOCP's conclusion, lol (since AMD dropped them), anyway...

CSI, your examples are good for a card at stock, but the voltage changes too much when you overclock. The TDP limit is the simple explanation here, and in the end voltage feeds directly into the TDP result. If their algorithm for setting the right voltage is wrong, the outcome is the same: they just hit the TDP sooner. But we are not talking about stock GPUs here, we are talking about overclocked ones. If this voltage algorithm has some flaw, it is not that important... once a BIOS that removes the TDP limit is out, the problem will go away by itself; at a 300W limit this will not matter much. It will certainly be the same for the EVGA Superclocked, or the MSI and ASUS DC xxx cards: it will be their BIOS that controls the TDP, if the GreenLight program allows it, of course.

If you can provide a sample that doesn't hit the TDP limit and still behaves the same way when overclocked (which would be worse than our expectations), then OK.
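To illustrate why voltage matters so much for hitting a power limit: dynamic power scales roughly with V²·f, so a small voltage bump eats into the TDP headroom much faster than the frequency bump alone would. This is a generic back-of-the-envelope sketch, not NVIDIA's actual power management, and all the numbers are illustrative:

```python
# Back-of-the-envelope: dynamic power ~ capacitance * V^2 * f.
# All numbers below are illustrative, not real GTX 1080 figures.

def dynamic_power(freq_mhz, volts, base_freq=1733.0, base_volts=1.06, base_power=180.0):
    """Scale a reference board power by (V/V0)^2 * (f/f0)."""
    return base_power * (volts / base_volts) ** 2 * (freq_mhz / base_freq)

stock = dynamic_power(1733, 1.06)          # reference point, ~180 W
oc_same_volts = dynamic_power(2000, 1.06)  # frequency bump only, ~208 W
oc_more_volts = dynamic_power(2000, 1.10)  # frequency + small voltage bump, ~224 W

print(stock, oc_same_volts, oc_more_volts)
# The voltage bump pushes power past a fixed TDP cap sooner, which is why a
# BIOS with a higher power limit changes the overclocking picture.
```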
TBH we have no idea how NVIDIA defined their algorithm for Boost3 when it now dynamically looks at both temperature and voltage to set frequency.
As you noticed, those dips with the fan at 100% and the temperature at 63 degrees were much more civilised than in the thermal-limit chart, but there are similarities in behaviour, which suggests some kind of voltage profile was involved along with the "limiter": hitting the power limits briefly, oversensitive protection, etc.
So the solution is to tweak Boost3 (voltage and temperature against frequency), or be more aggressive with the fans for normal operation, or a combination of both.
I do not understand why they did not go with a higher fan speed, with noise closer to the reference 970 profile, which is roughly 3dB louder according to TPU: http://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_1080/25.html

I think this is a great quick look at it so far by a technical reviewer: http://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_1080/29.html
Anyway, IMO it is something that can be rectified, and easily if you are not messing around with OC, by adjusting the fan to behave closer to the 970's profile.
Cheers
 
That was in a synthetic test, though, and it is strange; maybe it has the same effect as certain synthetic tests on Intel Haswell that make it go crazy from a thermal standpoint, and maybe it also kicks in some Boost3 protections, hence that low frequency.
TPU only managed to get their GPU down to 1607MHz by disabling the fan and letting the temperature hit 85 degrees and higher.

[Attached image: clock_analysis2.jpg — TPU clock/temperature analysis chart]


Their summary analysis:
I did some testing of Boost 3.0 on the GeForce GTX 1080 (not using Furmark). First, the card is in idle, before a game is started and clocks shoot up to 1885 MHz. As GPU temperature climbs, we immediately see Boost 3.0 reducing clocks - with Boost 2.0, clocks stayed at their maximum until a certain temperature was reached.

We can see a linear trend that has clocks go down as the temperature increases, in steps of 13 MHz, which is the clock generator's granularity. Once the card exceeds 82°C (I had to stop the fan manually to do that), the card will drop all the way down to its base clock, but will never go below that guaranteed minimum (until 95°C where thermal trip will kick in).

This means that for the first time in GPU history, lower temperatures directly translate into more performance - at any temperature point and not only in the high 80s. I just hope that this will not tempt custom board manufacturers to go for ultra-low temperatures while ignoring fan noise.
http://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_1080/29.html
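Reading that description, a toy model helps visualise the behaviour. Below is a rough Python sketch of a temperature-driven boost curve of that shape; only the 13 MHz step, the 1885MHz cold-start clock and the 82/95 degree points come from the quote, the 1607MHz base clock comes from elsewhere in the thread, and the slope is a made-up illustrative value, so this is not NVIDIA's actual algorithm.

```python
# Toy model of a temperature-driven boost curve as described in the TPU quote.
# The 13 MHz step, 1885 MHz cold-start clock and 82/95 C points come from the
# quote; the slope below is an illustrative guess, not NVIDIA's algorithm.

BASE_CLOCK = 1607      # MHz, guaranteed minimum under normal operation
PEAK_CLOCK = 1885      # MHz, observed cold-start clock in the TPU test
STEP = 13              # MHz, clock generator granularity per the quote
TEMP_TARGET = 82       # C, above this the card falls back to base clock
THERMAL_TRIP = 95      # C, hard protection point

def boost_clock(temp_c, mhz_per_degree=4.0, start_temp=40):
    """Return an illustrative clock (MHz) for a given GPU temperature (C)."""
    if temp_c >= THERMAL_TRIP:
        return 0                      # thermal trip: card protects itself
    if temp_c > TEMP_TARGET:
        return BASE_CLOCK             # never below base until thermal trip
    # Linear decline with temperature, quantised to the 13 MHz step.
    clock = PEAK_CLOCK - max(0, temp_c - start_temp) * mhz_per_degree
    clock = BASE_CLOCK + ((clock - BASE_CLOCK) // STEP) * STEP
    return int(max(BASE_CLOCK, min(PEAK_CLOCK, clock)))

for t in (40, 55, 63, 75, 83, 96):
    print(t, boost_clock(t))
```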
Cheers
 
His "game-like loads" could need some adjustments though, as you can see from Tom's results on Metro
 
Yes. Though the 1080 runs its fan slower than the Titan X and 980 Ti: 2200 versus 2400 RPM.

http://www.tomshardware.com/reviews/nvidia-geforce-gtx-980-ti,4164-8.html

How many benchmarks run for more than 3.5 minutes? That seems to be the earliest that the card throttles below its boost clock. Which is great for day-one reviews.
Yeah, I do not get what NVIDIA is trying to do with the fan profile for the 1080, unless they were concerned it is already louder than a 980.
But they still have room to play with and match a reference 970, which was OK from a noise perspective (speaking of reference cards rather than custom AIB designs).
Cheers
 
His "game-like loads" could need some adjustments though, as you can see from Tom's results on Metro
The problem with Tom's is that it is not looking at the relationship between voltage, fans, temperature and frequency at the same level of detail as TPU did for Boost3.
He had to use a constant "game-like load" to be able to see a correlation between temperature and frequency and the granularity mechanism involved (although I doubt temperature is the only variable, considering there is also a profile for voltage).
So I am not sure what he can adjust.
Cheers
 
The problem with Tom's is that it is not looking at the relationship between voltage, fans, temperature and frequency at the same level of detail as TPU did for Boost3.
He had to use a constant "game-like load" to be able to see a correlation between temperature and frequency and the granularity mechanism involved (although I doubt temperature is the only variable, considering there is also a profile for voltage).
So I am not sure what he can adjust.
Power consumption varies by game. He chose the wrong game.

No one else "had to stop the fan" to get the card to hit base clock or lower.
 
Power consumption varies by game. He chose the wrong game.

No one else "had to stop the fan" to get the card to hit base clock or lower.
You're missing the context of his article (he did tidy those charts up, because the window sensitivity of a measurement would show he probably hit base clock briefly much earlier than that).
In the same way, one can appear to go over the power-draw spec if looking at it through too fine a window, picking up brief burst-like behaviour that is not a true reflection of sustained behaviour.
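To illustrate that window-sensitivity point with a toy example (made-up numbers, nothing measured), here is a short Python sketch showing how a brief dip to base clock is visible with a one-sample window but largely averaged away with a wider one:

```python
# Illustrative only: how the measurement window changes what a clock log "shows".
# A 1-sample window catches a brief dip to base clock; a wider average hides it.

clocks = [1785] * 50 + [1607] * 2 + [1785] * 50   # fake log with a 2-sample dip

def windowed_min_of_averages(samples, window):
    """Minimum of the moving averages taken over the given window size."""
    averages = [sum(samples[i:i + window]) / window
                for i in range(len(samples) - window + 1)]
    return min(averages)

for w in (1, 5, 20):
    print(w, round(windowed_min_of_averages(clocks, w)))
# window 1  -> 1607 (the dip is visible)
# window 20 -> 1767 (the dip is mostly averaged away)
```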

He is measuring the behaviour of the granularity mechanism for frequency against the variable of temperature, although this still does not take into account voltage profiling in Boost3.

The simplest example showing that it is much more complex than before is to look at Tom's charts again before the card hits its temperature limit: more variables are involved and there is a dynamic relationship going on.
Ask yourself why the clock frequency starts to drop even before the card hits 60 degrees, when the temperature target is 83 degrees, while the OC run is more consistent; there is more than just the power target/limit involved, and it comes back to some of the analysis TPU shows.

[Attached image: 01-Clock-Rate_w_600.png — Tom's Hardware clock rate chart]

[Attached image: 03-Temperatures_w_600.png — Tom's Hardware temperature chart]

Cheers
 
Ask yourself why the clock frequency starts to drop even before the card hits 60 degrees
Rational explanation: the physical properties of the chip degrade as temps go up (s/n ratio, whatnot); the chip simply overclocks more reliably at lower temps.

Conspiratorial explanation: NV is pushing the chip harder than it really handles to show high figures and secure good word of mouth and PR in the press; then reining in the clocks before the GPU crashes as it hits higher temps...

Which of the two is true I don't know. Maybe a combination of both, or something else I'm not thinking of. :p
 
The limiting factor for boost clock downward adjustments is power consumption, not voltage. Maximum clocks can be limited by both of these factors on air, but on water the only limitation is voltage (and there are no downward clock speed adjustments on water). I say this having owned and overclocked the following GPUs on air and water in the past 4 years: 680, 780, 780 SLI, 980 tri-SLI, 980 Ti, and 970. This can be proven through the use of synthetic stress tests like Furmark, Heaven, or 3dmark. One need only monitor power consumption and clock speed during these tests while adjusting clocks and voltage via an overclocking tool to reach this conclusion.
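For anyone wanting to reproduce that kind of check, a minimal sketch of the monitoring side is below, assuming nvidia-smi is installed and on PATH; run your stress test of choice (Furmark, Heaven, 3DMark) alongside it while adjusting clocks and voltage in your overclocking tool.

```python
# Minimal sketch: poll SM clock, power draw and temperature once per second
# while a stress test runs in another window. Assumes nvidia-smi is on PATH;
# the query fields used here are standard nvidia-smi query-gpu fields.

import subprocess
import time

QUERY = "clocks.sm,power.draw,temperature.gpu"

def sample():
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        text=True,
    )
    sm_clock, power, temp = (field.strip() for field in out.splitlines()[0].split(","))
    return float(sm_clock), float(power), float(temp)

if __name__ == "__main__":
    for _ in range(60):                      # one minute at 1 Hz
        clock, power, temp = sample()
        print(f"{clock:.0f} MHz  {power:.1f} W  {temp:.0f} C")
        time.sleep(1)
```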

All that being said, I am rather disappointed in GP104's overall performance, though the sheer clock speeds are impressive, if not unexpected for 16FF. I await GP102 before my next upgrade.
 
Rational explanation: the physical properties of the chip degrade as temps go up (s/n ratio, whatnot); the chip simply overclocks more reliably at lower temps.

Conspiratorial explanation: NV is pushing the chip harder than it really handles to show high figures and secure good word of mouth and PR in the press; then reining in the clocks before the GPU crashes as it hits higher temps...

Which of the two is true I don't know. Maybe a combination of both, or something else I'm not thinking of. :p
Agreed,
but they do not degrade that much early on, before they reach their peak temperature; otherwise we would have seen exactly the same behaviour with Maxwell and Boost2, which is not as granular and dynamic.
It could be as you say, a conspiracy to keep it under control :)
I am going with it being the Boost3 algorithm: the finer-grained relationship between the temperature & voltage profiles and frequency is not currently ideal (as an example, EVGA Precision is still only in an alpha state).
Maybe when we have a fully functional Death Star, err I mean EVGA Precision, we will see some changes hehe.
Would be very interesting to know if others will investigate, or have experienced, that Furmark anomaly.
Cheers
 
You're missing the context of his article
I'm not sure why you introduced TPU (in post 1024) when it contradicts Tom's and other websites that report throttling to base clock or lower. Tom's shows throttling to base clock for extended periods of time in actual gameplay.
 
I'm not sure why you introduced TPU (in post 1024) when it contradicts Tom's and other websites that report throttling to base clock or lower. Tom's shows throttling to base clock for extended periods of time in actual gameplay.
When it is at its target temp of 83 degrees...
It is clear, even in the chart I linked from Tom's in post 1031 (like others did earlier), that it only throttled to its 1607MHz base clock when it hit the temperature ceiling.
Furmark is an anomaly, as it not only breaks the base clock but also the temperature target; if we are lucky, others will investigate this to identify why and its influence on Boost3.

Anyway another good read: http://www.gamersnexus.net/hwreview...nders-edition-review-and-fps-benchmark/page-3
Or PCGameshardware and their frequency after 10mins: http://www.pcgameshardware.de/Nvidi...5598/Specials/Benchmark-Test-Video-1195464/2/
Cheers
Edit:
Just to point out the temperature-frequency part I am talking about from GamersNexus; they write:
To further amplify the thermal torture and create somewhat of a worst-case scenario, we also disabled all three front intake fans. This left the GPU entirely to its own devices – mostly the VRM blower fan and alloy heatsink / vapor chambers – to cool itself.
Anyway, another example IMO highlighting that the issue is more about tweaking the dynamic nature of Boost3 and the multiple variables it monitors and uses, along with simpler solutions such as making the fan profile more aggressive.
 
GP100 does not have the 4x rate Int8 acceleration instructions designed for deep learning inference (these instructions are completely different than byte extract in GCN, to the question earlier in the thread). The 4x rate Int8 instructions are in the smaller Pascal chips (the ones besides GP100). Conversely, the smaller Pascal chips do not have the 2x rate FP16 that GP100 has.

NVIDIA should really explain this better; there is a lot of confusion on the internet.
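For anyone wondering what a 4x-rate Int8 path buys you conceptually: the idea is packing four 8-bit values into a 32-bit register and doing a 4-way dot product with 32-bit accumulation in a single operation. Here is a plain NumPy sketch of the arithmetic only, not NVIDIA's actual instruction or API:

```python
# Conceptual sketch of a 4-way int8 dot-product-with-accumulate, the kind of
# operation a 4x-rate Int8 path accelerates for inference. This is plain NumPy
# illustrating the arithmetic, not NVIDIA's instruction set or API.

import numpy as np

def dot4_accumulate(a_bytes, b_bytes, acc):
    """Multiply four int8 pairs, sum the products, add to an int32 accumulator."""
    a = np.asarray(a_bytes, dtype=np.int8).astype(np.int32)
    b = np.asarray(b_bytes, dtype=np.int8).astype(np.int32)
    return int(acc + np.dot(a, b))

print(dot4_accumulate([1, -2, 3, 4], [5, 6, -7, 8], acc=100))
# 100 + (5 - 12 - 21 + 32) = 104
```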
 
I don't remember NVIDIA ever mentioning these int8 instructions. URL?

It makes sense to support faster inference on SOCs that can be used in the data center or in the automotive market.
 
It's like my ISP: they give you a burst of the advertised speed in the first few seconds, but then it settles down lower.

Nvidia knew about this behaviour and still advertised the 1733MHz boost clock to claim a nearly 9 TFLOPS theoretical peak. I guess it's up to the consumer to read the fine print: boost is not guaranteed, it goes higher or lower.

Gives me hope for Polaris, though. Nvidia chose to be aggressive. They were either mindful of AMD's cards or (probably more likely) needed it to be fast enough above a 980 Ti at stock to justify the price they needed, or wished, to charge for it.
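On the "nearly 9 TFLOPS" figure: assuming the commonly cited 2560 CUDA cores for the GTX 1080 and 2 FLOPs per core per clock (one FMA), the advertised boost clock works out as shown in this quick sanity check, not a measurement:

```python
# Quick sanity check of the "nearly 9 TFLOPS" peak figure, assuming the commonly
# cited 2560 CUDA cores for the GTX 1080 and 2 FLOPs per core per clock (FMA).

cores = 2560
flops_per_core_per_clock = 2      # one fused multiply-add counts as 2 FLOPs
boost_clock_hz = 1733e6           # the advertised 1733 MHz boost clock
base_clock_hz = 1607e6            # the guaranteed 1607 MHz base clock

peak_boost = cores * flops_per_core_per_clock * boost_clock_hz / 1e12
peak_base = cores * flops_per_core_per_clock * base_clock_hz / 1e12

print(f"boost: {peak_boost:.2f} TFLOPS, base: {peak_base:.2f} TFLOPS")
# boost: ~8.87 TFLOPS (the "nearly 9"), base: ~8.23 TFLOPS
```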
 
SAD operations were introduced with Southern Islands.
It wouldn't be until GCN3 that SDWA was added for byte granularity extraction for vector ops in general.

SAD still exists, so would that be the only motivation to expose generalized sub-word addressing?
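For context on what a SAD instruction computes: it takes packed bytes from two registers, sums the absolute differences, and adds the result to an accumulator (handy for video motion estimation). A plain Python sketch of the arithmetic only, not the GCN encoding:

```python
# What a packed sum-of-absolute-differences (SAD) op computes, conceptually.
# Plain Python illustrating the arithmetic, not the actual GCN instruction.

def sad_u8(a_bytes, b_bytes, acc=0):
    """Sum of |a_i - b_i| over packed unsigned bytes, added to an accumulator."""
    return acc + sum(abs(a - b) for a, b in zip(a_bytes, b_bytes))

print(sad_u8([10, 200, 30, 40], [12, 190, 35, 40]))   # 2 + 10 + 5 + 0 = 17
```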
 