NVidia Ada Speculation, Rumours and Discussion

AD106 looks like it's only going to be about as fast as the 3070 in rasterisation. I doubt that will cut it against Navi 33, so it makes sense to use the 3000 series to fill the gap.
How exactly will Navi 33 be faster? Navi 33 will have less of everything:
FP32 performance, rasterisation, geometry performance, tensor FLOPS, raytracing performance, etc.

/edit: Fun fact:
AD106 will deliver more FP32 performance (25 TFLOPS should be possible) and more FP16 tensor FLOPS than a CDNA2 chiplet. But I guess AMD has no problem providing the same numbers on 6nm at 200mm^2.
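As a rough sanity check on that 25 TFLOPS figure, here is the napkin math, assuming the rumoured 36 SM AD106 configuration with 128 FP32 lanes per SM and a boost clock around 2.7 GHz (none of which is confirmed):

```python
# Napkin math for the ~25 TFLOPS FP32 claim above.
# Assumed (rumoured, unconfirmed): 36 SMs, 128 FP32 lanes/SM, ~2.7 GHz boost.
sms = 36
fp32_lanes_per_sm = 128
flops_per_clock = 2          # one FMA counts as two FLOPs
boost_ghz = 2.7

tflops = sms * fp32_lanes_per_sm * flops_per_clock * boost_ghz / 1000
print(f"AD106 FP32: ~{tflops:.1f} TFLOPS")   # ~24.9 TFLOPS
```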
 
We have confirmed with NVIDIA that the 30-cycle spec for the 16-pin connector is the same as it has been for the past 20+ years. The same 30-cycle spec exists for the standard PCIe/ATX 8-pin connector (aka mini-fit Molex). The same connector is used by AMD and all other GPU vendors too so all of those cards also share a 30-cycle life. So in short, nothing has changed for the RTX 40 GPU series.
...
The next-gen NVIDIA GeForce RTX 40 series graphics cards including the RTX 4090 and RTX 4080 will be bundled with such cables; however, each package will include just one cable, so enthusiasts who like to unplug their hardware a lot and try new stuff will have to be careful, because you will need a new cable every time you exhaust the 30-cycle lifespan.
...
This needs to be verified once the cards launch, but for new owners or those who plug their GPU into the PC once and never take it back out again, this shouldn't be a huge concern. Certain PSU companies are also making their cables with higher-quality components, but there's no way to tell just how good or bad a 12VHPWR cable is without removing the sleeves.
 
Seen some guy on Twitter claim that Nvidia are moving the low/mid-tier 3000 series GPUs to 5nm and will use the shrink to increase clock speeds by ~50-60% and increase performance that way.

That would keep DLSS 3 exclusive to the bigger 4000 series cards while offering a 40-50% performance increase through clock speed bumps for the low/mid 4000 cards.

If they did that, I'm not sure how I would feel about it: on the one hand they would be offering a good performance uplift, but at the same time they would still be locking those cards out of the new tech.
I call BS because you can't "move" anything from Samsung's 8N to TSMC's N5, this would be the same as making a completely new chip in which case there are zero reasons to use the old IP.
 
I call BS because you can't "move" anything from Samsung's 8N to TSMC's N5, this would be the same as making a completely new chip in which case there are zero reasons to use the old IP.
There's one reason: if you think perf is sufficient even with the old IP and you can get it done in fewer square millimeters compared to newer, potentially more spacious IP blocks. And of course, if you think your product/brand is strong enough and you want to create customer incentives to move their purchase upwards in the stack.
 
Has Nvidia given bandwidth numbers for the 96 MB L2 cache, like AMD did with Infinity Cache?
Compared to Ampere, Ada’s Level 2 cache has been completely revamped. AD102 has been outfitted with 98304 KB of L2 cache, an improvement of 16x over the 6144 KB that shipped in GA102. All applications will benefit from having such a large pool of fast cache memory available, and complex operations such as ray tracing (particularly path tracing) will yield the greatest benefit.
That's all.
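For what it's worth, the quoted KB figures do line up with the "96 MB" in the question above; a quick unit check (capacity only, no bandwidth given):

```python
# Unit check on the quoted Ada/Ampere L2 capacity numbers.
ad102_l2_kb = 98304
ga102_l2_kb = 6144

print(ad102_l2_kb / 1024)          # 96.0 MB, the figure in the question above
print(ad102_l2_kb / ga102_l2_kb)   # 16.0x over GA102
```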

There's one reason: if you think perf is sufficient even with the old IP and you can get it done in fewer square millimeters compared to newer, potentially more spacious IP blocks. And of course, if you think your product/brand is strong enough and you want to create customer incentives to move their purchase upwards in the stack.
While true, it would be even cheaper to just continue selling the same Ampere chips made on the 8N process.
 
How exactly will Navi 33 be faster? Navi 33 will have less of everything:
FP32 performance, rasterisation, geometry performance, tensor FLOPS, raytracing performance, etc.

/edit: Fun fact:
AD106 will deliver more FP32 performance (25 TFLOPS should be possible) and more FP16 tensor FLOPS than a CDNA2 chiplet. But I guess AMD has no problem providing the same numbers on 6nm at 200mm^2.
I was talking purely about rasterisation, so the RT and tensor FLOP numbers aren't relevant. From the rumoured specs it would be 48 ROPs (3 GPCs) and 144 TMUs on AD106 against a presumed 64 ROPs and 128 TMUs on Navi 33. If the boost clocks are, say, 2.6 GHz and 2.8 GHz respectively, then AMD would be behind by ~4% on texture rate and ahead by ~44% on fill rate. Nvidia would be ahead by ~4% in compute.

But if I understand correctly, Nvidia still have to share the extra FP32 units with the INT32 units, whereas AMD is just doubling across the board. So we would expect AMD to get better scaling from their doubling. If they can increase performance at 1440p by ~31% vs the 6650 XT, then they are already at 3070-level performance. If they can increase it by ~63%, then they are at 6800 XT levels. The rumours were that they were targeting 6800 XT/6900 XT performance, at least at 1080p.
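For anyone who wants to check those ratios, here is a minimal sketch using the rumoured unit counts and the assumed 2.6/2.8 GHz boost clocks above (all of it speculation, not confirmed specs):

```python
# Back-of-envelope throughput ratios from rumoured specs and assumed clocks.
# AD106: 48 ROPs, 144 TMUs, 36 SMs x 128 FP32 lanes, 2.6 GHz (rumoured).
# Navi 33: 64 ROPs, 128 TMUs, 4096 ALUs, 2.8 GHz (presumed).

def rates(rops, tmus, fp32_lanes, ghz):
    return {
        "pixel fill (Gpix/s)": rops * ghz,
        "texture rate (Gtex/s)": tmus * ghz,
        "FP32 (TFLOPS)": fp32_lanes * 2 * ghz / 1000,   # FMA = 2 FLOPs/clock
    }

ad106  = rates(rops=48, tmus=144, fp32_lanes=36 * 128, ghz=2.6)
navi33 = rates(rops=64, tmus=128, fp32_lanes=4096, ghz=2.8)

for key in ad106:
    print(f"{key}: Navi 33 / AD106 = {navi33[key] / ad106[key]:.2f}")
# pixel fill ~1.44 (AMD ahead ~44%), texture rate ~0.96 (AMD behind ~4%),
# FP32 ~0.96 (Nvidia ahead ~4%)
```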
 
I was talking purely about rasterisation, so the RT and tensor FLOP numbers aren't relevant. From the rumoured specs it would be 48 ROPs (3 GPCs) and 144 TMUs on AD106 against a presumed 64 ROPs and 128 TMUs on Navi 33. If the boost clocks are, say, 2.6 GHz and 2.8 GHz respectively, then AMD would be behind by ~4% on texture rate and ahead by ~44% on fill rate. Nvidia would be ahead by ~4% in compute.

But if I understand correctly, Nvidia still have to share the extra FP32 units with the INT32 units, whereas AMD is just doubling across the board. So we would expect AMD to get better scaling from their doubling. If they can increase performance at 1440p by ~31% vs the 6650 XT, then they are already at 3070-level performance. If they can increase it by ~63%, then they are at 6800 XT levels. The rumours were that they were targeting 6800 XT/6900 XT performance, at least at 1080p.
Narrow GPUs have fewer problems keeping their units fed and avoiding performance bottlenecks. AD106 will be around ~1/3 of the 4090, with even higher clocks. At least on 6nm, AMD won't double everything. Their own claim is >50% better efficiency with 5nm and chiplets.
 
Narrow GPUs have fewer problems keeping their units fed and avoiding performance bottlenecks. AD106 will be around ~1/3 of the 4090, with even higher clocks. At least on 6nm, AMD won't double everything. Their own claim is >50% better efficiency with 5nm and chiplets.
If it's really 3 GPCs/36 SMs, then that's 1/4 of the 4090. And Ampere mid-range GPUs didn't see massively higher clocks. Right now kopite7kimi is claiming a ~7000 Time Spy Extreme score, which is a bit better than the 3070.

I agree AMD isn't doubling everything on every product with RDNA 3, but they do seem to be doubling the number of ALUs per WGP. That should give a performance boost somewhere in between the Turing-to-Ampere "doubling" and the RDNA 1-to-RDNA 2 "doubling".
 
You can find an RTX 2080 Ti for as low as $250 here on the used market. That's quite a lot of performance for the money, even though it's an old GPU by now. 3080s start from around $500 and go up from there (non-ETH-mining GPUs).
If you're in the market for a new GPU and not afraid of the used market, it can be worth checking these out. I'm myself on a 2080 Ti, which isn't far from 3070 performance but with 11 GB of fast VRAM. RT performance is still great on these too.
 
I call BS because you can't "move" anything from Samsung's 8N to TSMC's N5, this would be the same as making a completely new chip in which case there are zero reasons to use the old IP.
Exactly. A shrink involves a full VLSI design, place-and-route and tapeout cycle. In this case it's even worse because it's a foundry shift, so it may involve changing RAM macros (basically, different foundries use different SRAM bit widths as building blocks), which is painful as hell. If you're going through all that pain, you would just use the latest IP.
 
There's one reason: if you think perf is sufficient even with the old IP and you can get it done in fewer square millimeters compared to newer, potentially more spacious IP blocks.
Unless the fewer square millimeters on 4N cost more than the more square millimeters on 8N. We need to reset our mental calibration of Moore's Law.
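A toy illustration of that point, with completely made-up wafer prices and die sizes; only the shape of the comparison matters, not the numbers:

```python
# Hypothetical numbers only: shows how a smaller die on a newer node can still
# cost more per die if the wafer price rises faster than the area shrinks.
import math

wafer_cost_usd = {"Samsung 8N": 6000, "TSMC 4N": 16000}   # made-up wafer prices
die_area_mm2   = {"Samsung 8N": 300, "TSMC 4N": 190}      # made-up sizes, same IP

wafer_area = math.pi * (300 / 2) ** 2   # 300 mm wafer, ignoring edge loss and yield

for node in wafer_cost_usd:
    dies = wafer_area // die_area_mm2[node]
    print(f"{node}: ~{dies:.0f} dies/wafer, ~${wafer_cost_usd[node] / dies:.0f}/die")
```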
 