Nvidia Ampere Discussion [2020-05-14]

The problem is that to undervolt, you basically run a benchmark at 100% load and then slowly adjust your voltage/frequency curve so it flattens out, moving the peak frequency to lower and lower voltages until you start getting errors. You'll never really know how well it's going to work, or how low you'll be able to go, especially on a new architecture, so you're probably going to be running the GPU at 100% near stock voltage for quite a while as you lower it a bit at a time.
With Nvidia's Power Limit capability, you can go in the opposite direction: you set the maximum board power you want to use, the video card downclocks itself to a stable level, and then you tweak upwards.
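For reference, a minimal sketch of that power-limit workflow, assuming nvidia-smi is available on the PATH (the wattage value is just a placeholder, and setting a limit needs admin rights):
```python
import subprocess

def query(fields, gpu=0):
    """Return the requested nvidia-smi fields for one GPU as strings."""
    out = subprocess.run(
        ["nvidia-smi", "-i", str(gpu),
         "--query-gpu=" + ",".join(fields),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True)
    return [v.strip() for v in out.stdout.split(",")]

# Current draw, current limit, and the min/max limits the board allows.
draw, limit, lo, hi = query(
    ["power.draw", "power.limit", "power.min_limit", "power.max_limit"])
print(f"drawing {draw} W, limit {limit} W (board allows {lo}-{hi} W)")

# Cap the board (needs admin rights); 250 W here is a placeholder, not a
# recommendation. Let the card downclock to a stable level, then nudge it up.
subprocess.run(["nvidia-smi", "-i", "0", "-pl", "250"], check=True)
```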
 

Good point. I haven't seen anyone do it that way, but it might be interesting. I'm assuming power supply recommendations assume people will overclock, so maybe a 15-20% power reduction would work on a 650W.
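Back-of-the-envelope, assuming roughly 320 W of board power (the 3080's advertised TGP) and a placeholder 200 W for the rest of the system:
```python
# Back-of-the-envelope PSU headroom for the 15-20% power-limit idea.
# 320 W is the 3080's advertised board power; 200 W for CPU + rest of the
# system is just a placeholder, not a measurement.
gpu_tgp = 320.0
rest_of_system = 200.0
psu = 650.0

for cut in (0.15, 0.20):
    gpu = gpu_tgp * (1 - cut)
    total = gpu + rest_of_system
    print(f"-{cut:.0%}: GPU ~{gpu:.0f} W, whole system ~{total:.0f} W, "
          f"~{psu - total:.0f} W of headroom on a {psu:.0f} W unit")
```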
 
I wonder if the beta Automatic Tuning feature could accomplish a similar result.
[Image: GeForce Experience performance tuning and monitoring options]

https://www.nvidia.com/en-us/geforc...tform/#automatic-tuning-in-geforce-experience
 
I think AV1 is near, so maybe that isn't really worthwhile now. Actually, why don't they have AV1 encoding at this point? Heh

My understanding is that AV1 encoding is still rather immature at this stage, in the sense that there are still big gains (several factors' worth) in performance and performance/quality coming just from improvements in how the encoding is done. I'd guess real-time AV1 encoding on something like a consumer GPU (which is rather transistor- and power-sensitive) likely doesn't make sense at this point, if it's even possible given practical constraints, due to that immaturity and especially given the lack of usage.

I might be wrong about this with regard to VP9, but the interest is that Twitch (and possibly other streaming platforms?) might be looking to start implementing VP9 relatively soon, while wider AV1 adoption might not come until closer to 2025. Then again, maybe H.264 encode improvements can essentially outrace the benefits of moving to VP9?

What does Nvidia gain from providing reviewers with more accurate power testing equipment?
I assume reviewers will check the variance between the traditional method and the Nvidia-provided tool. It would be interesting to see if there is a significant difference.

Many (if not most?) reviewers, including rather notable ones, still use total system power consumption for power measurements. Also, it's not always clear what they are actually measuring: some kind of average, some peak figure, or whether they even have the capability of capturing data points beyond basically eyeballing a readout.
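For what it's worth, per-GPU sampling isn't hard; a sketch (assuming nvidia-smi on the PATH) that logs board power for a minute and reports both average and peak, which is exactly the distinction that often goes unstated:
```python
import subprocess, time

def sample_power_w(gpu=0):
    """One instantaneous board-power reading, in watts, via nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "-i", str(gpu),
         "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True)
    return float(out.stdout.strip())

samples = []
for _ in range(600):              # ~60 s at 10 Hz while the benchmark runs
    samples.append(sample_power_w())
    time.sleep(0.1)

print(f"average: {sum(samples) / len(samples):.1f} W, peak: {max(samples):.1f} W")
```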
 
Re: RTX 3080 Compubench benchmarks

For instance, in the Vertex Connection and Merging test the RTX 3080 spotted by Apisak delivered a result of 39.042 mPixels/s (another tested sample recorded 39.128 mPixels/s – higher figure is better) compared to 24.621 mPixels/s (3080 first result: +59%) for the RTX 2080 Ti and 18.555 mPixels/s (3080 first result: +110%) for the RTX 2080.

In the more straightforward Ocean Surface Simulation test (simulating waves) the new Ampere unit managed 7768.469 Iterations/s, which was over 38% higher than the RTX 2080 Ti and a noteworthy 88% more than the RTX 2080 (see screenshots below).

In fact, in the Catmull-Clark Subdivision Level 5 benchmark, the Ampere card was even 60.78% faster than the RTX 2080 Ti.
https://www.notebookcheck.net/Nvidi...d-Big-Navi-a-hard-target-to-hit.492276.0.html
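The quoted percentage deltas check out against the raw scores:
```python
# Sanity check of the quoted deltas from the raw VCM scores (mPixels/s).
scores = {
    "RTX 2080 Ti": 24.621,
    "RTX 2080": 18.555,
}
rtx3080 = 39.042                      # first RTX 3080 result from the article

for card, score in scores.items():
    print(f"RTX 3080 vs {card}: +{(rtx3080 / score - 1) * 100:.0f}%")
# RTX 3080 vs RTX 2080 Ti: +59%
# RTX 3080 vs RTX 2080: +110%
```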
 
Good point. I haven't seen anyone do it that way, but it might be interesting. I'm assuming power supply recommendations assume people will overclock, so maybe a 15-20% power reduction would work on a 650W.
While doing F@H with a borrowed 2080 Ti, I did this all the time: power limit to 90%, or my 450W PSU would shut off when the card hit high load. With my similarly underrated (and now deceased) old PSU, I had to cap my Vega 56 at 95% as well. edit: Yeah, with my next build, I won't cheap out on PSU wattage any longer.
 

(And I love a single big 12V rail, simpler to manage than multiple rails imo)
 
At this rate of Tensor logic investment, what are the chances that at some point in the future Nvidia will just fold all the arithmetic ALUs into just more Tensor arrays?
The MMA programming model is already compliant with the standard grid/warp ordering of the conventional SIMT scheduling.
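For context, the operation that path exposes is a per-warp D = A·B + C on small tiles (commonly 16×16×16, FP16 inputs with FP32 accumulation); a conceptual numpy sketch of just the arithmetic, not the actual CUDA API:
```python
# Conceptual numpy sketch (not the CUDA API) of the tile-level
# matrix-multiply-accumulate that tensor cores perform: D = A @ B + C on a
# 16x16x16 tile, FP16 inputs accumulated in FP32, as in the common WMMA shape.
import numpy as np

M = N = K = 16
A = np.random.rand(M, K).astype(np.float16)
B = np.random.rand(K, N).astype(np.float16)
C = np.random.rand(M, N).astype(np.float32)

D = A.astype(np.float32) @ B.astype(np.float32) + C   # FP32 accumulation
print(D.shape, D.dtype)                               # (16, 16) float32
```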
 
Or why not just sell separate cards that only have Tensors, so now gamers need to buy a GPU and a TPU in order to game?
 
Not even the AI line of GPUs would do this. Standard compute is still very necessary; not all machine learning uses the same types of computation.
Even the Vega cards are very good at certain types of algorithms.
Throwing out compute in favour of tensor cores is unlikely to ever happen. You need compute flexibility.

Or to put it another way: tensor cores accelerate one type of machine learning, but we are always developing new methods and algorithms. The need for flexible compute is the enabler for that.
 

Guess this cements it as average outside of raytracing titles.

Looking at the numbers, if we take transistor count as a measure, the chip is 50% bigger than a 2080 Ti, performs 40% better, and a bit better than that in raytracing. The die sizes versus the process shrink roughly match up, so despite all the changes to parallelization, performance per relative die size hasn't improved at all; it's just a bigger chip. Performance per watt is better, but nowhere near as huge a jump as claimed, with around a 20% improvement over a Ti, if the TDP rating for the 3080 is accurate. The exception here is raytracing performance, which apparently does better.

So other than raytracing performance, Ampere doesn't seem like a huge jump over Turing in sheer engineering terms. Thankfully there's competition from AMD now to drive a jump in benefits to consumers, and there it does well, even against the Turing Super series.
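Rough numbers behind that, taking the commonly cited transistor counts (TU102 ≈ 18.6B, GA102 ≈ 28.3B, treated as assumptions here) and the ~40% figure above:
```python
# Rough perf-per-transistor math behind the post above. The transistor counts
# are the commonly cited figures and the +40% performance number is the
# post's own estimate, not a measurement.
tu102 = 18.6e9       # RTX 2080 Ti die
ga102 = 28.3e9       # RTX 3080 die
perf_gain = 1.40     # assumed 3080 vs 2080 Ti outside of raytracing

size_ratio = ga102 / tu102
print(f"transistor budget: +{(size_ratio - 1) * 100:.0f}%")   # ~+52%
print(f"perf per transistor: {perf_gain / size_ratio:.2f}x")  # ~0.92x
```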
 
Looking at the numbers, if we take transistor count as a measure, the chip is 50% bigger than a 2080 Ti, performs 40% better, and a bit better than that in raytracing. The die sizes versus the process shrink roughly match up, so despite all the changes to parallelization, performance per relative die size hasn't improved at all; it's just a bigger chip.
Memory bandwidth has only improved by 24%, though.
And the RTX 3080 is more cut down from its full die than the RTX 2080 Ti was: two memory channels disabled vs. one, and roughly 20% of SMs disabled vs. 5%. RTX 3090 vs. Titan RTX is probably a better comparison.
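Spelling those figures out (the bus widths, memory speeds and SM counts below are the publicly listed specs, used here as assumptions):
```python
# The bandwidth and cut-down figures, spelled out. Bus widths, memory speeds
# and SM counts are the publicly listed specs, used as assumptions here.
bw_2080ti = 616.0    # GB/s: 352-bit GDDR6 @ 14 Gbps
bw_3080 = 760.0      # GB/s: 320-bit GDDR6X @ 19 Gbps
print(f"bandwidth: +{(bw_3080 / bw_2080ti - 1) * 100:.0f}%")           # ~+23%

# How cut down each card is relative to its full die:
print(f"2080 Ti: {(1 - 68 / 72) * 100:.0f}% of TU102's SMs disabled")  # ~6%
print(f"3080:    {(1 - 68 / 84) * 100:.0f}% of GA102's SMs disabled")  # ~19%
```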
 
GA104 is 392.5 mm² and has 61% more transistors than TU106. The RTX 3070 will be around 70% faster than a 2060 Super in games while having the same bandwidth. Every transistor spent has resulted in a matching performance increase, which is actually really good after the transition from Pascal to Turing.
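Same exercise for GA104 vs TU106, using the published transistor counts (assumed here) and the ~70% estimate above:
```python
# Same exercise for GA104 vs TU106. Transistor counts are the published
# figures (assumed here); the +70% uplift is the post's estimate.
tu106 = 10.8e9
ga104 = 17.4e9
perf_gain = 1.70     # assumed RTX 3070 vs RTX 2060 Super

ratio = ga104 / tu106
print(f"transistors: +{(ratio - 1) * 100:.0f}%")          # ~+61%
print(f"perf per transistor: {perf_gain / ratio:.2f}x")   # ~1.06x
```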
 
Memory bandwidth has only improved by 24%, though.
And the RTX 3080 is more cut down from its full die than the RTX 2080 Ti was: two memory channels disabled vs. one, and roughly 20% of SMs disabled vs. 5%. RTX 3090 vs. Titan RTX is probably a better comparison.
+50% according to Nvidia, so maybe a bit less when looking at independent review summaries.
 