Nvidia Ampere Discussion [2020-05-14]

Maybe a stupid question, but is the driver a key component in "fully" utilizing the double FP32 units (when they're not being used for INT32), or is it mostly a hardware thing?
 
Maybe a stupid question, but is the driver a key component in "fully" utilizing the double FP32 units (when they're not being used for INT32), or is it mostly a hardware thing?
The driver contains the shader compiler. So it can make a difference. We could expect that NVidia will improve the shader compiler. On the other hand shader compilation is something you can refine years ahead of the silicon arriving.
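To make the hardware side of this concrete, here is a toy throughput model (my own simplification for illustration, not Nvidia's documented scheduling): treat each Ampere SM partition as having two issue slots per clock, one FP32-only and one shared between FP32 and INT32, and see how the INT32 fraction of the instruction stream caps FP32 throughput.

```python
# Toy model of an Ampere-style datapath (a simplification, not Nvidia's
# documented scheduler): two issue slots per clock, one FP32-only and
# one that handles either FP32 or INT32.
def fp32_throughput(int_fraction: float) -> float:
    """Achievable FP32 ops/clock (peak = 2.0) for an instruction
    stream with the given fraction of INT32 instructions."""
    assert 0.0 <= int_fraction <= 1.0
    if int_fraction == 0.0:
        return 2.0  # pure FP32 fills both datapaths
    # INT32 ops can only use the shared slot (capacity 1 op/clock),
    # so the total issue rate is capped at min(2, 1/int_fraction).
    issue_rate = min(2.0, 1.0 / int_fraction)
    return (1.0 - int_fraction) * issue_rate

print(fp32_throughput(0.0))   # 2.0 -- the advertised doubling
print(fp32_throughput(0.25))  # 1.5 -- a quarter INT32 already costs 25%
print(fp32_throughput(0.5))   # 1.0 -- half INT32 is back to one FP32/clock
```

The point the toy model makes is that the "double FP32" peak only appears for nearly pure FP32 streams, which is why the compiler's instruction selection and scheduling can matter.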

I found an analysis I did 12 years ago on the compilation of Perlin Noise. Interestingly, on AMD's VLIW-5 GPUs, the utilisation was about 89%. G80 was about 5% more efficient. This means that instruction dependency was not that significant.
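To illustrate how dependency chains cap slot utilisation on a VLIW machine, here is a toy greedy list scheduler for a hypothetical 5-wide bundle; the dataflow graph is made up for the example and has nothing to do with actual Perlin Noise code.

```python
# Toy list scheduler for a 5-slot VLIW bundle (illustrative only):
# each op names the ops it depends on; an op may issue once all its
# dependencies issued in an earlier bundle.
def schedule(deps: dict[str, set[str]], width: int = 5) -> list[list[str]]:
    done: set[str] = set()
    bundles: list[list[str]] = []
    remaining = dict(deps)
    while remaining:
        ready = [op for op, d in remaining.items() if d <= done]
        if not ready:
            raise ValueError("dependency cycle")
        bundle = ready[:width]
        bundles.append(bundle)
        done |= set(bundle)
        for op in bundle:
            del remaining[op]
    return bundles

def utilisation(bundles: list[list[str]], width: int = 5) -> float:
    return sum(len(b) for b in bundles) / (width * len(bundles))

# A made-up 10-op graph: five independent ops, then a short chain.
deps = {
    "a": set(), "b": set(), "c": set(), "d": set(), "e": set(),
    "f": {"a"}, "g": {"b"}, "h": {"f", "g"}, "i": {"c"}, "j": {"h"},
}
bundles = schedule(deps)
print(utilisation(bundles))  # 0.5 -- the tail of the chain wastes slots
```

A long serial chain at the end of the graph leaves most slots empty, which is the kind of effect that makes ~89% utilisation on real shader code a fairly strong result.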

So that makes me even more puzzled why Ampere is "slow". The texturing workload is not substantial, so I don't believe that's relevant.
 

Looks like you can get a very minor overclock in the 50-70 MHz range with a slight undervolt, which actually beats trying to overclock with a +100 MHz core offset on this Gigabyte card. I think the best play will be to set the power limit as high as the BIOS allows in MSI Afterburner or EVGA Precision X1, and then find the highest frequency you can maintain under that power limit with undervolting. A 100% stable clock is much better than having the clock jumping around. Interestingly, there are some comments saying ray tracing is more sensitive to undervolting, and you may find undervolts that are stable in raster games but crash in ray-traced games. It'll probably be necessary to use something like Port Royal to validate overclock/undervolt results.
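The tuning procedure described above, i.e. max out the power limit and then walk the clock down until a stress run passes, can be sketched as a simple search loop. The `is_stable` callback here is a stand-in for an actual stress test such as Port Royal, and all the clock numbers are hypothetical.

```python
# Sketch of the undervolt/overclock tuning loop described above.
# is_stable is a stand-in for a real stress run (e.g. Port Royal);
# the clock range and step are hypothetical example values in MHz.
def find_highest_stable_clock(is_stable, lo=1800, hi=2100, step=15):
    """Walk down from an optimistic clock until a run passes.
    Returns the highest clock (MHz) that is_stable() accepts,
    or None if nothing in the range is stable."""
    clock = hi
    while clock >= lo:
        if is_stable(clock):
            return clock
        clock -= step
    return None

# Mock stress test: pretend everything at or below 1995 MHz is stable.
result = find_highest_stable_clock(lambda mhz: mhz <= 1995)
print(result)  # 1995
```

In practice you would repeat the search at each voltage point and keep the best stable frequency/voltage pair, re-testing ray-traced workloads separately given the sensitivity noted above.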
 
Nvidia releases a statement on the disastrous launch and promises to do better.

https://www.nvidia.com/en-us/geforce/news/rtx-3080-qa/

We began shipping GPUs to our partners in August, and have been increasing the supply weekly.

So how long does it take for, first, partners to actually get the chips; second, partners to build the cards; and third, partners to ship those cards to retailers around the world? Shipping anything around the world right now is a nightmare.

I don't see any real supply until '21. Same goes for AMD, probably.
 
RTX 3000 are undoubtedly the most powerful cards on the market. Unlike the Vegas.

However, statements like "wait for the games to catch up" or "this is not the full potential" bring back memories. HD 2900 was like: "This is a DX11 card, wait for games to catch up!". GTX 480 was: "Wait for games to finally utilize all the geometry stuff!". Vega was like: "Wait for the drivers to utilize DSBR, NGG, HBCC and games to use FP16!"...
 
I think the difference is that all of the features of Ampere are standardized in DirectX 12 Ultimate and are available on Xbox Series X. You don't have to optimize for Ampere specifically; it's just the standard feature set for D3D. Example: Mesh Shaders will leverage the compute power of Ampere, Xbox Series X and the upcoming RDNA2 GPUs.
 
I've ordered a 3080 TUF from Overclockers UK within the first 3 hours and I'm probably going to get it in November (if even that). Worst product launch I've witnessed in the past decade :LOL:

At least I can cancel and get an RDNA2 GPU if AMD delivers the goods.
 
However, statements like "wait for the games to catch up" or "this is not the full potential" bring back memories. HD 2900 was like: "This is a DX11 card, wait for games to catch up!". GTX 480 was: "Wait for games to finally utilize all the geometry stuff!". Vega was like: "Wait for the drivers to utilize DSBR, NGG, HBCC and games to use FP16!"...
Yep, except HD2900 was a DX10 card.
 
From the iXBT review (BTW, very welcome to see this kind of low-level feature benchmarking again, brings back memories of hardware.fr), the TMUs have reportedly been upgraded, doubling texel read speed when no filtering is used. These kinds of unfiltered reads are often used in compute shaders. That is pretty cool.
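A quick back-of-the-envelope on what that doubling would mean, using the RTX 3080's published configuration (68 SMs with 4 texture units each, 1710 MHz boost clock); the 2x factor for unfiltered reads is the review's claim, not an official spec.

```python
# Back-of-envelope texel rates for an RTX 3080 (68 SMs x 4 texture
# units, 1710 MHz boost). The 2x for unfiltered (point-sampled) reads
# is the claim from the review above, not an official spec.
SMS, TMUS_PER_SM, BOOST_GHZ = 68, 4, 1.710

filtered_gtexels = SMS * TMUS_PER_SM * BOOST_GHZ  # bilinear-filtered
unfiltered_gtexels = filtered_gtexels * 2         # unfiltered loads

print(round(filtered_gtexels, 1))    # 465.1 GTexel/s
print(round(unfiltered_gtexels, 1))  # 930.2 GTexel/s
```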
 
Why doesn't GA100 have 128x FP32 per SM like GA102? Why does it only have 64? For a compute card that seems like a major omission.
 
Why doesn't GA100 have 128x FP32 per SM like GA102? Why does it only have 64? For a compute card that seems like a major omission.
Probably just die size reasons. NVIDIA couldn't make the die bigger even if they really wanted to, so they'd have had to cut out other parts to do it. As for why they didn't reduce the SM count to fit double ALUs per SM, balance of resources is the logical answer here.
 
Another possible scenario is that GA100 was made considerably earlier than GA10x and the updated FP32/INT h/w wasn't ready for it. We've seen something similar between Volta and Turing previously.
 