NVidia Ada Speculation, Rumours and Discussion

troyan · Oct 3, 2022

GH100 got the Gaming-Ampere SMs. So you have to compare Hopper to Volta. For example, A100 FP32 and FP64 went up only 30% over V100 while the transistor budget increased by 2.6x.

AD102 has 16x more L2 cache, 71% more compute units, 71% more GPCs (rasterizer and ROPs), improved RT Cores (2x the triangle intersection, two new hardware features), new geometry processing (Micro-Meshes), improved Tensor Cores, a new shader reordering function, a new optical flow accelerator, a new video encoder, 40%+ higher clocks and, I guess, a lot of Hopper's compute features. Looking at the raw performance of a "full" AD102, it doesn't look so bad.

arandomguy · Oct 3, 2022

DavidGraham said:
I really want to know where all of the transistor budget went? A100 has 54 billion transistors and 19.5 TF of FP32 compute; fast forward to H100 and it has 80 billion transistors and more than triple the FP32 compute, 67 TF. Yet Ada barely doubled FP32 despite spending close to 3x the transistor count.

A100: 54billion transistors, 19.5 TF of FP32 compute
H100: 80billion transistors, 67TF of FP32 compute

Ampere: 28 billion transistors, 40TF FP32
Ada: 76 billion transistors, ~90TF FP32

Something doesn't add up.

Part of this is an issue of numbers and how they're perceived.

For instance, if you actually calculate the numbers out, AD102 has 2.7x more transistors than GA102 for 2.25x more FP32 TF (using your 90 TF FP32 number). This is perceptually a much better ratio than roughly describing it as "close to 3x the transistor count" and "barely doubled FP32."

Now your ~90 TF FP32 number (for, I assume, a full AD102, i.e. an RTX 3090 Ti equivalent) is also likely low at only 1.09x more than the RTX 4090 (82.58 TF). RTX 3090 Ti (40 TF) vs RTX 3090 (35.58 TF) is actually 1.12x more. Using that ratio, a hypothetical 4090 Ti would have 92.49 TF, or 2.31x more (we're creeping up).

RTX 4090 is also more cut down than the 3090 was relative to the 3090 Ti, so an actual full AD102 implementation may have an even higher ratio than that if not power restricted (and the 3090 Ti was generous with power over the 3090, with a higher boost clock spec used to calculate stock TFLOPS). If we, say, move that up to ~98 TF (using about the same 150 MHz boost as 3090 Ti vs 3090), then that ends up at 2.45x more TF.

That would also still be a boost clock of under 2.7 GHz, and judging by the leaks the silicon can likely go even higher. There were rumors of 600 W configurations with over 100 TF, from what I remember? 100 TF would make it 2.5x.

So ultimately AD102 vs GA102 could be more along the lines of 2.7x transistors for 2.5x more TF FP32, which sounds a lot better than "barely doubled FP32" for "close to 3x the transistor count."
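The ratios in this post can be re-derived with a short sketch. The transistor counts and TFLOPS figures are the ones quoted in the thread; the `ratio` helper and the scenario labels are made up here for illustration, and the 1.12 step is rounded the same way the post rounds it.

```python
# Figures quoted in the thread (billions of transistors, FP32 TFLOPS).
GA102_TRANSISTORS = 28.0
AD102_TRANSISTORS = 76.0
RTX_3090_TF = 35.58
RTX_3090_TI_TF = 40.0
RTX_4090_TF = 82.58


def ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


transistor_ratio = ratio(AD102_TRANSISTORS, GA102_TRANSISTORS)

# Scale the 4090 by the same step that separated the 3090 Ti from the 3090.
# The post rounds that step to 1.12 before multiplying, so this does too.
ti_step = round(ratio(RTX_3090_TI_TF, RTX_3090_TF), 2)
hypothetical_4090_ti_tf = RTX_4090_TF * ti_step

# Full-AD102 FP32 scenarios discussed in the post (labels are illustrative).
scenarios = {
    "quoted ~90 TF": 90.0,
    "4090 scaled by 3090 Ti step": hypothetical_4090_ti_tf,
    "+150 MHz boost, ~98 TF": 98.0,
    "rumoured 100 TF": 100.0,
}

print(f"transistors: {transistor_ratio:.2f}x")
for label, tflops in scenarios.items():
    fp32_ratio = ratio(tflops, RTX_3090_TI_TF)
    print(f"{label}: {tflops:.2f} TF, {fp32_ratio:.2f}x FP32 vs 3090 Ti")
```

Every scenario is measured against the same 40 TF 3090 Ti baseline, so the only thing that moves is the assumed full-AD102 clock and configuration.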

Kaotik · Oct 3, 2022

troyan said:
GH100 got the Gaming-Ampere SMs. So you have to compare Hopper to Volta. For example, A100 FP32 and FP64 went up only 30% over V100 while the transistor budget increased by 2.6x.

AD102 has 16x more L2 cache, 71% more compute units, 71% more GPCs (rasterizer and ROPs), improved RT Cores (2x the triangle intersection, two new hardware features), new geometry processing (Micro-Meshes), improved Tensor Cores, a new shader reordering function, a new optical flow accelerator, a new video encoder, 40%+ higher clocks and, I guess, a lot of Hopper's compute features. Looking at the raw performance of a "full" AD102, it doesn't look so bad.

Micro-Meshes are part of the RT Core, not the geometry side. And isn't the OFA improved rather than new?

troyan · Oct 3, 2022

No, Micro Meshes are a new way of processing geometry - from the whitepaper:

The displaced micro-mesh is a new geometric primitive that was co-designed with the MicroMesh Engine in Ada’s Third-Generation RT Core.

Kaotik · Oct 3, 2022

troyan said:
No, Micro Meshes are a new way of processing geometry - from the whitepaper:

Oh, I tried looking for anything outside the RT Core in the whitepaper. Must have missed that, since all the searches took me to RT stuff.

Henry swagger · Oct 3, 2022

DavidGraham said:
I really want to know where all of the transistor budget went? A100 has 54 billion transistors and 19.5 TF of FP32 compute; fast forward to H100 and it has 80 billion transistors and more than triple the FP32 compute, 67 TF. Yet Ada barely doubled FP32 despite spending close to 3x the transistor count.

A100: 54billion transistors, 19.5 TF of FP32 compute
H100: 80billion transistors, 67TF of FP32 compute

Ampere: 28 billion transistors, 40TF FP32
Ada: 76 billion transistors, ~90TF FP32

Something doesn't add up.

I think there's 56 billion for the SM cores, and the other 20 billion transistors all went to the L2 cache and Tensor Cores.

Deleted member 2197 · Oct 3, 2022

GPU Price Gouging Might Finally Be Over - ExtremeTech

With crypto mining dead and a ton of 30-series cards flooding the market, launch prices for RTX 4090 GPUs are surprisingly sane(ish).

www.extremetech.com

Generally speaking, Nvidia usually undercuts its partners when it launches its GPUs. It comes out first with its Founder’s Edition cards, which have always been less expensive than add-in board (AIB) prices. Then, a few weeks later the AIB boards come out with their own versions, which are usually more expensive than Nvidia’s cards. That’s because they have custom PCBs, advanced cooling, and so forth. This gives Nvidia the first bite at the apple and lets it suck up some early adopters. That doesn’t seem to be the case this time. Nvidia priced its RTX 4090 at $1,599, and on Newegg, some partner boards are listed at…$1,599. That’s a first, and a promising sign that GPU price gouging might finally be over. In the past, a GPU at that price would have debuted at $1,999 or higher.

It’s an encouraging sign that AIBs are offering RTX 4090s for the same price as Nvidia’s cards. Sure, there are some overclocked models with huge coolers going for higher prices, but at least we seem to have options.

PSman1700 · Oct 3, 2022

pharma said:
GPU Price Gouging Might Finally Be Over - ExtremeTech

With crypto mining dead and a ton of 30-series cards flooding the market, launch prices for RTX 4090 GPUs are surprisingly sane(ish).

www.extremetech.com

Finally. It's not like most AIB GPUs were so much better anyway, if at all.

Clukos · Oct 3, 2022

Watercooled 4090 for 1749 is pretty good compared to the insanity that was the last 2 years.

Below2D · Oct 3, 2022

The era of crappy reference models is officially behind us. There isn't much that AIB models do that the Founder's Edition doesn't. It used to be that NVIDIA underclocked their GPUs in relation to what they were truly capable of. See the 980/Ti clocked at 900-980MHz when even bad ones could easily hit 1050MHz+. It made sense not to get the reference cards with bad blower coolers and non-guaranteed OC performance. At the time, you could find some AIBs that were faster than the reference models by 10%+ out of the box with lower temperatures and less noise.

Now though? You're lucky to get a 5% boost with aftermarket models and sometimes, their coolers are even worse than the Founder's Edition. To top it all off, NVIDIA has figured out how to maximize their clocks leaving little incentive to get AIB models that aren't even much better anyway.

pjbliverpool · Oct 3, 2022

The cheapest model is £1854 in the UK. Lol.

Below2D · Oct 3, 2022

pjbliverpool said:
The cheapest model is £1854 in the UK. Lol.

Deleted member 2197 · Oct 3, 2022

Just noticed some of the 4090s on Newegg accept cryptocurrency as payment.

Maillog · Oct 4, 2022

troyan said:
3090ti
FP16/32: 40 TFLOPs
Pixel Fillrate: 208,3 gpixel/s

troyan said:
4080 12GB:
FP16/32: 40 TFLOPs
Pixel Fillrate: 208,8 gpixel/s

Another interesting detail:
3090 Ti: 112 ROPs
4080 12GB: 80 ROPs (about 29% fewer)
But the pixel fillrate is the same.
Apparently Nvidia has upgraded its ROP blocks. I don't remember such an improvement since Maxwell.

troyan · Oct 4, 2022

4080 12GB has a 40% higher boost clock...
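A rough sketch of why the two fill rates match: theoretical pixel fill rate is commonly taken as ROP count times boost clock. The ROP counts come from the post above. The boost clocks (1860 MHz for the 3090 Ti, 2610 MHz for the 4080 12GB) are spec-sheet values that the thread does not state, so treat them as assumptions; the function name is likewise made up for illustration.

```python
def pixel_fillrate_gpixels(rops: int, boost_mhz: float) -> float:
    """Theoretical fill rate in Gpixel/s, assuming one pixel per ROP per clock."""
    return rops * boost_mhz / 1000.0


# ROP counts are from the thread; boost clocks are assumed spec-sheet values.
cards = {
    "3090 Ti": {"rops": 112, "boost_mhz": 1860.0},
    "4080 12GB": {"rops": 80, "boost_mhz": 2610.0},
}

fillrates = {
    name: pixel_fillrate_gpixels(spec["rops"], spec["boost_mhz"])
    for name, spec in cards.items()
}
for name, rate in fillrates.items():
    print(f"{name}: {rate:.1f} Gpixel/s")

# Relative ROP deficit and clock advantage of the 4080 12GB.
rop_deficit = 1 - cards["4080 12GB"]["rops"] / cards["3090 Ti"]["rops"]
clock_gain = cards["4080 12GB"]["boost_mhz"] / cards["3090 Ti"]["boost_mhz"] - 1
print(f"ROPs: {rop_deficit:.1%} fewer, boost clock: {clock_gain:.1%} higher")
```

Under these assumptions the ROP deficit and the clock advantage cancel almost exactly, so the matching figures do not by themselves imply any per-ROP improvement.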

Henry swagger · Oct 5, 2022

Overwatch 2 Runs at 500+FPS on RTX 4090 at 1440p Resolution

Overwatch 2 is out now, and NVIDIA is boasting impressive frame rate figures for the upcoming GeForce RTX 4090 graphics card.

wccftech.com

Crazy. Are there any 1440p 500 Hz monitors on the market?

DavidGraham · Oct 5, 2022

In that example the 4090 is 70% faster than the 4080 12GB (roughly a 3090).

Phantom88 · Oct 5, 2022

Yeah, seems the 12 gig card is around 3090 Ti levels. The normal 4080 is close to 50% faster than the 3080. The 4090 is double the 3080, and four times faster than a PS5.

Scott_Arm · Oct 5, 2022

Henry swagger said:
Overwatch 2 Runs at 500+FPS on RTX 4090 at 1440p Resolution

Overwatch 2 is out now, and NVIDIA is boasting impressive frame rate figures for the upcoming GeForce RTX 4090 graphics card.

wccftech.com

Crazy. Are there any 1440p 500 Hz monitors on the market?

No, but there should be 1440p360 soon. 1080p500 is just coming out.

PSman1700 · Oct 5, 2022

For all the talk of the 4080 12GB really being a '4070', the fact that it actually performs like a 3090 Ti makes that GPU quite good value, and it is promising for GPUs further down the stack.
That's a 4070 (sometimes even called a 4060) performing like the fastest last-gen GPU, the 3090 Ti.
