NVIDIA discussion [2024]

Remij · Apr 26, 2024

trinibwoy said:
That’s not how prices are determined. At best costs create a floor for minimum pricing. They definitely do not determine maximum pricing. Competition and consumer demand drive that. The average person has zero idea what the manufacturer is “getting out of the chip” and it’s not part of the value proposition.

We’re not buying tiers or brand names. Your entire framework for thinking about this is based on immaterial attributes of the product.

I already said that. Forget about the damn pricing for a second. These cards... and all generations of cards... are separated into tier ranges by Nvidia and AMD. There are acceptable price ranges for each tier of card.

Nah, it's entirely based on what value Nvidia thinks these tiers are worth relative to each other.

trinibwoy · Apr 26, 2024

Yep it tells us 4070 > 4060 and 5070 > 5060.

SlmDnk · Apr 26, 2024

Computex 2024 keynote on June 2 at 4 AM (Pacific Time).

https://events.nvidia.com/jensen-huang-taipei-keynote-2024

See NVIDIA founder and CEO Jensen Huang live on stage for a special keynote address.

Kaotik · Apr 26, 2024

SlmDnk said:
Computex 2024 keynote on June 2 at 4 AM (Pacific Time).

https://events.nvidia.com/jensen-huang-taipei-keynote-2024

Just to nitpick, it's "off-Computex", as in not an official part of Computex and a day early.

Deleted member 2197 · Apr 26, 2024

It must really piss off the "official" keynotes. Keynotes fighting for view clicks ...

Though it makes sense since Nvidia has an AI Summit on June 5 in Taipei.

trinibwoy · Apr 26, 2024

pharma said:
It must really piss off the "official" keynotes. Keynotes fighting for view clicks ...
Though it makes sense since Nvidia has an AI Summit on June 5 in Taipei.

You think so? Nvidia probably won’t announce anything new after their big splash at GTC. All eyes will be on Lisa.

Deleted member 2197 · Apr 26, 2024

trinibwoy said:
You think so? Nvidia probably won’t announce anything new after their big splash at GTC. All eyes will be on Lisa.

Yeah, people are wondering about their AI products. Lack of MLPerf participation by AMD or any partners has generated speculation regarding the product.
Intel will likely draw interest and benefit from their participation in the benchmark.

Man from Atlantis · Apr 29, 2024

Nvidia AD102 dieshot

https://twitter.com/x/status/1784611359608680563

trinibwoy · Apr 29, 2024

If accurate then about 1/3 of each TPC is allocated to something besides the SMs. Polymorph engine perhaps? With the GPC level rasterizers & ROPs that’s a lot of silicon dedicated to 3D fixed function.

This should probably be in the architecture thread though.

DegustatoR · Apr 29, 2024

trinibwoy said:
Polymorph engine perhaps?

LDS and caches?

trinibwoy · Apr 29, 2024

DegustatoR said:
LDS and caches?

Those are inside the SM.

DegustatoR · Apr 29, 2024

trinibwoy said:
Those are inside the SM.

We don't really know that, and we don't know why the SMs are marked as they are above either. In Nvidia's logical schemes there is nothing but 2 SMs in 1 TPC.

trinibwoy · Apr 29, 2024

DegustatoR said:
We don't really know that, and we don't know why the SMs are marked as they are above either. In Nvidia's logical schemes there is nothing but 2 SMs in 1 TPC.

Look at the symmetry. That unmarked block is 1 per TPC. So likely not SM specific. We know that each SM has dedicated cache and LDS.

Also that block looks far too large to be the relatively small SM L1 (compared to the size of the 2M L2 slices).

DegustatoR · Apr 29, 2024

trinibwoy said:
Look at the symmetry. That unmarked block is 1 per TPC. So likely not SM specific. We know that each SM has dedicated cache and LDS.

A TPC can be horizontal instead of vertical as drawn above. In that case that central area would divide between 2 SMs in a pretty symmetrical fashion.

trinibwoy · Apr 29, 2024

DegustatoR said:
A TPC can be horizontal instead of vertical as drawn above. In that case that central area would divide between 2 SMs in a pretty symmetrical fashion.

View attachment 11226

But now your TPCs are asymmetrical. The original cut makes more sense.

Also the “Gigathread engine” work distributor seems to be missing in the picture.

DegustatoR · Apr 29, 2024

trinibwoy said:
But now your TPCs are asymmetrical.

Where? Seems about as symmetrical as when cut vertically. This way you're dividing the central area between both SMs and TPCs though which is what the layout suggests.

trinibwoy said:
Also the “Gigathread engine” work distributor seems to be missing in the picture.

Should be in the center.

trinibwoy · Apr 29, 2024

DegustatoR said:
Where? Seems about as symmetrical as when cut vertically. This way you're dividing the central area between both SMs and TPCs though which is what the layout suggests.

With your cut that central area is no longer symmetrical across TPCs. Either way it’s too large to be 128KB cache. My guess is it’s fixed function bits. With mesh shaders becoming a thing and triangle counts increasing, it probably makes sense to keep the FF triangle setup stuff, but Nvidia seems to have a lot of it and it’s not clear how much it helps in real workloads.

Arun · Apr 29, 2024

I don’t think we “know” exactly what is in the SM vs TPC. E.g. based on my testing, I am fairly confident the L0 instruction caches are per-multiprocessor, the L1 instruction caches are per-TPC (32KiB, used to be 12KiB on Volta) and the L1.5 instruction/constant caches (128KiB) are per-GPC. But that’s not what NVIDIA claims in their diagrams (L1 instruction cache per SM), nor what the Volta dissected paper says (the paper is really good, but this is far from its only mistake).

A 32KiB cache isn’t going to take anywhere near that much area but it’s just to illustrate we don’t really know.

The fact that area is very RAM-light implies it might have a large percentage of wire-heavy/wire-only areas for communication to the L2 and GPC central area as well.
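
To put the instruction cache sizes above in perspective, here is a rough tally in Python. It is only a sketch: the per-TPC and per-GPC sizes are the estimates from this post, while the unit counts (12 GPCs, 72 TPCs, 144 SMs for a full AD102) are an assumption taken from Nvidia's public specs, not from the thread. The names `UNIT_COUNTS`, `INSTRUCTION_CACHES` and `tally` are made up for illustration.

```python
KIB = 1024

# Assumed full-die AD102 unit counts (public spec, not from this thread).
UNIT_COUNTS = {"GPC": 12, "TPC": 72, "SM": 144}

# (name, sharing scope, size in bytes) as estimated in the post above.
INSTRUCTION_CACHES = [
    ("L1 instruction", "TPC", 32 * KIB),
    ("L1.5 instruction/constant", "GPC", 128 * KIB),
]


def tally(caches=INSTRUCTION_CACHES, counts=UNIT_COUNTS):
    """Return die-wide capacity in bytes for each cache level."""
    return {name: counts[scope] * size for name, scope, size in caches}


if __name__ == "__main__":
    for name, total in tally().items():
        print(f"{name}: {total // KIB} KiB die-wide")
```

Under those assumptions the two levels add up to a few MiB across the whole die, which fits the point that a 32KiB cache alone can't account for a block that size in every TPC.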

trinibwoy · Apr 29, 2024

Arun said:
I don’t think we “know” exactly what is in the SM vs TPC.

No we don’t but we can make some informed guesses based on per-TPC and per-SM resources. E.g. Nvidia has claimed each TPC has triangle setup, tessellation hardware etc so that’s gotta be in there somewhere.

Arun · Apr 29, 2024

trinibwoy said:
No we don’t but we can make some informed guesses based on per-TPC and per-SM resources. E.g. Nvidia has claimed each TPC has triangle setup, tessellation hardware etc so that’s gotta be in there somewhere.

Yep absolutely, that sets a minimum of what is in there, just there’s probably more than we know - e.g. if raytracing was mostly per-SM but shared a bit of logic per-TPC, how could we possibly know that? These kinds of implementation details really aren’t relevant for marketing material.

What would be interesting is an H100 die shot, to see whether they did indeed remove graphics logic from most SMs/GPCs like they implied they did, but I’m not sure whether that’s completely true.

BTW Fritzchens Fritz has had an AD102 die shot for a while, but there are no public H100 ones that I know of.
