Nvidia shows signs in [2023]

If they're found guilty I do hope they're held accountable; it'd be such a nice change. (In general, not just nVidia's case.)

If they're found guilty I can't imagine they wouldn't be held accountable, especially in France. My impression of France and other euro countries is that they're more willing to go after industry.
 
If the EU can get Apple to change to USB-C then I guess anything is possible, here's hoping you're right!
 
The whole EU is looking into NVIDIA (and the AI market as a whole) now
 
Yeah curious what abuses are even possible in AI right now. There are already alternatives to Nvidia’s hardware and software.
 
Interesting. Going from 8nm GA102 to 5nm AD102 was a 70% increase in SMs and a 50% increase in clocks. AD102 is really a beast.

192 SMs on GB102 would be a relatively small increase of just 33%. Maybe clocks are going up again.
If each SM still has 128 shader "units" (or, as Nvidia imprecisely calls them, "CUDA cores"), 192 SMs would give 24,576 shaders.
If Nvidia adopts some features from the Hopper generation, the number of ROP units per GPC could be doubled. Since Ampere, the ROP count depends directly on the number of active GPCs in the chip (each adds 16 ROPs), so in the Ada Lovelace generation a GPU with 12 GPCs contains 192 ROPs (like AD102). Here the number per GPC could be doubled (384 ROPs) or possibly even quadrupled - but this is still only a possibility, nothing confirmed yet.
With a 512-bit memory bus, up to 50% more compute units than today's GeForce RTX 4090, and possibly higher clocks, it's gonna be a hefty upgrade ;)
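Quick back-of-the-envelope on those numbers (just a sketch; the 192 SMs, the ROP doubling, and 128 shaders per SM are the rumored/assumed figures above, nothing confirmed):

```python
# Rough math for the rumored next-gen flagship vs AD102.
# All "next-gen" inputs are speculation from the post above, not confirmed specs.

AD102_SMS = 144           # full AD102
SHADERS_PER_SM = 128      # "CUDA cores" per SM since Ampere
AD102_GPCS = 12
ROPS_PER_GPC = 16         # Ampere/Ada: 16 ROPs per GPC

RUMORED_SMS = 192         # rumored next-gen flagship
ROP_MULTIPLIER = 2        # rumored doubling per GPC (4x also floated)

print(f"SM increase: {RUMORED_SMS / AD102_SMS - 1:.0%}")              # ~33%
print(f"Shaders: {RUMORED_SMS * SHADERS_PER_SM}")                     # 24576
print(f"AD102 ROPs: {AD102_GPCS * ROPS_PER_GPC}")                     # 192
print(f"Doubled ROPs: {AD102_GPCS * ROPS_PER_GPC * ROP_MULTIPLIER}")  # 384
```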

 
The double ROPs rumor doesn’t really make sense but we’ll see. Pixel shaders have been slowly disappearing from modern game engines. UE5 doesn’t use them. They’re not helpful in current compute heavy pipelines and do nothing for RT. 8K isn’t a thing yet and won’t be for a very long time. Would be an interesting choice.
 
Funny enough, there are a lot of users who buy 4090s to play Call of Duty at 1080p on low settings, or Counter-Strike, Fortnite, etc. There's a pay-to-win market for the highest-end cards, and more ROPs might help push even more frames at low settings. The 4090 weirdly fits two groups: people who want max ray tracing at 4K even at very low framerates, and esports players.
 
Cinebench 2024 GPU benchmark.

Maxon-Cinebench-2024-GPU-Rendering-Scores.jpg


 
Have they added RT acceleration support for vendors other than NVIDIA yet? At launch, at least, they hadn't, which skews the results.
 
RT acceleration is not used for any IHV.
 
Not by much. Even without RT acceleration for either IHV, the 7900 XTX is still behind the 4070 in the previous Redshift benchmark.


Blender received HIP-RT support for AMD recently; it only boosted the 7900 XTX's RT performance by between 10% and 30%, and the 4080 still remains about 2x faster than the 7900 XTX.

 
Those Techgage scores definitely have RT acceleration enabled on Nvidia. Without acceleration the 7900 XTX matches the 4070.

So in the Techgage Redshift benchmark the 7900 XTX matches the 4070, meaning no acceleration is being used?
 
The 4070 is 23% faster in the Techgage benchmark.
If RT acceleration were used, the 4070 would be much faster than 23%. Techgage reviews are usually very detailed, but in this one I see no mention of RT acceleration, so my assumption is that it's turned off.

Edit: RTX 4070 tested in June with RTX enabled at Puget Systems.
4060ti_4070_Rendering_Redshift.png
 
We don't need to assume. The Puget article has results with acceleration turned on (4070 33% faster) and off (4070 3% faster). Why do you think the 4070's advantage should be even greater? The Cinebench workload does more than just casting rays.

Just compare the same card with acceleration on and off. Acceleration only improves the 4070's performance by about 28%, which is right in line with its advantage over the 7900 XTX in the Techgage benchmark.
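To make the ratio logic explicit (a sketch; the 33%/3% leads are the Puget figures quoted above):

```python
# If the 4070 leads the 7900 XTX by 33% with RT acceleration on and by 3% with it off,
# the acceleration itself only buys the 4070 roughly the ratio of those two leads.
lead_on = 1.33    # 4070 vs 7900 XTX, acceleration on (Puget)
lead_off = 1.03   # 4070 vs 7900 XTX, acceleration off (Puget)

gain = lead_on / lead_off - 1
print(f"4070 gain from RT acceleration: ~{gain:.0%}")  # ~29%, close to the ~28% figure above
```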
 
My bad, misread the Puget article graph subtitle. So with RTX enabled it's about 32% faster.
 
This week, EuroHPC confirmed that Nvidia was supplying the accelerators for the GPU Booster modules that will account for the bulk of the computational power in the Jupiter system.

To get 1 exaflops sustained Linpack performance, we think it might take 60,000 PCI-Express H100s, which would have a peak theoretical FP64 performance of around 1.56 exaflops; on FP16 processing for AI on the tensor cores, such a machine would be rated at 45.4 exaflops. All of these numbers seem impossibly large, but that is how the math works out. Moving to the SXM versions of the H100 would double the watts but only boost the FP64 vector performance per GPU by 30.8 percent, from 26 teraflops to 34 teraflops in the most recent incarnations of the H100 (which are a bit faster than they were when announced in the summer of 2022). Moving from 350 watts to 750 watts to get tighter memory coupling and a little less than a third more performance is a bad trade for an energy-conscious European exascale system.
...
There is also a chance that Jupiter is based on the next-gen "Blackwell" GPUs, which could be a doubled-up GPU compared to the Hopper H100, at a much lower price and with far fewer of them needed. So maybe it is more like 8,000 nodes with Blackwell, which works out to 32,000 GPUs. We expect Blackwell to be Nvidia's first chiplet architecture, and that would help drive the cost down as well as the number of units required.
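Sanity-checking the arithmetic in the quote (a sketch; the per-GPU peak rates and totals are the article's figures, not official system specs):

```python
# Rough check of the quoted Jupiter math using the per-GPU peaks from the article.
pcie_fp64_tf = 26.0      # H100 PCIe, FP64 vector (per the article)
sxm_fp64_tf = 34.0       # H100 SXM, FP64 vector (per the article)
gpus = 60_000

print(f"Peak FP64: {gpus * pcie_fp64_tf / 1e6:.2f} EF")        # ~1.56 EF
print(f"SXM over PCIe: {sxm_fp64_tf / pcie_fp64_tf - 1:.1%}")  # ~30.8% more FP64 per GPU

# Implied per-GPU FP16 tensor rate from the quoted 45.4 EF aggregate:
print(f"Implied FP16 tensor per GPU: ~{45.4e6 / gpus:.0f} TF")  # ~757 TF

# Blackwell alternative: the article's 8,000 nodes / 32,000 GPUs implies 4 GPUs per node.
print(f"Blackwell GPU count: {8_000 * 4}")                      # 32000
```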
 