AMD CDNA Discussion Thread

Yes, but MI200 doesn't have 20 or 30% higher FP64 performance than A100; it's almost 5× higher. Mixed precision is fine, but it's worthless once the task requires a specific precision.
Sure. 5x better with FP64...

That's wishful thinking. GA100 is AI-focused. Its compute power is barely better than MI60 from 2018. Not only will MI200 be several times faster in general compute, but A100 will lose the crown even in FP16 tensor, BF16 tensor, and FP32 tensor (MI200 is almost 5× faster). According to some leaks (not sure how reliable), MI200 will also be ~2.4× faster in FP64 tensor. A100 will keep its position in INT4, INT8, and TF32 tensor.
Sure. Reality again. GA100 delivers 320 TFLOPS of FP16 performance. So no, AMD won't deliver more performance. I can't even believe that you think AMD would even be able to do it.
So, as I said, MI200 is ~5× better in FP64 than A100 (47.9 vs. 9.7 TFLOPS), possibly more than 2.4× better in FP64 tensor than A100 (47.9 vs. 19.5 TFLOPS), and faster in FP16 than A100 (383 vs. 78/312 TFLOPS). How dare they?
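For what it's worth, here's a quick sanity check on those ratios in Python, plugging in the figures quoted above (the MI200 numbers were still leaks at the time, so treat them as assumptions):

# Sanity check on the claimed MI200-vs-A100 ratios, using the figures
# quoted in this thread (MI200 numbers are leaked, not official).
mi200_fp64 = 47.9         # TFLOPS, MI200 FP64 (leaked)
a100_fp64 = 9.7           # TFLOPS, A100 standard FP64
a100_fp64_tensor = 19.5   # TFLOPS, A100 FP64 via tensor cores
mi200_fp16 = 383.0        # TFLOPS, MI200 FP16 (leaked)
a100_fp16_tensor = 312.0  # TFLOPS, A100 FP16 tensor (dense)

print(f"FP64:        {mi200_fp64 / a100_fp64:.2f}x")         # ~4.94x, the "~5x" claim
print(f"FP64 tensor: {mi200_fp64 / a100_fp64_tensor:.2f}x")  # ~2.46x, the "~2.4x" claim
print(f"FP16 tensor: {mi200_fp16 / a100_fp16_tensor:.2f}x")  # ~1.23x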
 
A100 has 19.5 TFLOPS FP64 at 350W (or 300W for the 80GB PCIe). I never questioned that MI200 will deliver more FP64. That AMD is going up to 500W shows how desperate they are to be in the AI-compute game. 18 months after GA100 it is worse than other competitors like Tesla. MI200 is an HPC product optimized for FP64 supercomputers.
 
…for tensor ops. For standard ops it's 9.7 TFLOPS.
They're all kinda memey numbers (also A100 is 400W vanilla and 500W cooker for 80GB SKUs).
Either way, all arguments against MI250 the hardware are also kinda memey, since the exabowl is Intel vs AMD with not a single nV GPU in sight...
 
Is this exabowl only running for one season?
Well, Intel is mostly out for the next wave, as they pissed off everybody with their 4-year delay...
Nvidia is already offering much cheaper exascale systems with Grace-Hopper and better real-world performance thanks to faster InfiniBand and DPU offload acceleration. Ironically, even AMD can't offer the full CPU+GPU+DPU+InfiniBand stack that is the base architecture of modern HPC and that Nvidia is heavily promoting/selling now...
 
Which is why El Capitan is MI300!
Lmao.
You can't measure the success of a product by a few political government deals. I see financial earnings and market share where AMD is a dwarf in HPC and graphics compared to Intel and NV. MI300 won't change that. Whatever AMD does, in the end, they are severely limited by TSMC...


No one wants that.
That's why BlueField-2 is exceeding sales expectations and BF3 already has a long list of customers waiting for delivery...

Ethernet won.
Pack up.

Yeah sure...
[Attachment: TOP100 InfiniBand market share chart]

Stop your FUD and go back home
 
I see financial earnings and market share where AMD is a dwarf in HPC and graphics compared to Intel and NV
?
Looks like your eyes are still bad.
Whatever AMD does, in the end, they are severely limited by TSMC...
Way, way less so than Nvidia so dunno how that is relevant.
That's why BlueField-2 is exceeding sales expectations and BF3 already has a long list of customers waiting for delivery...
Hard not to beat expectations when there were no expectations.
Again, those things are only ever useful for hypervisor offload in VM favelas à la Amazon Nitro...
...which brings on a whole other bag of funny given every SmartNIC customer of note can and will cook their own.
Yeah sure...
It sure did.
Even the new Cray recipe is just slightly souped-up Ethernet.
Pack up!
 
Is there an independent source for what the industry is actually using? It seems everyone cherry-picks to show their product in the best light, e.g. Intel likes to talk about 100G-class interconnects. Presumably this would exclude a ton of stuff, including run-of-the-mill Ethernet.
 
You can't measure the success of a product by a few political government deals...

Political? Five years of electricity spending is comparable to the cost of the hardware.
And Epyc is f.... efficient compared to Intel. That's why it was chosen.
 
Political? Five years of electricity spending is comparable to the cost of the hardware.
And Epyc is f.... efficient compared to Intel. That's why it was chosen.
I believe the reason it was chosen is that Intel didn't support PCIe 4.0. That's the only reason NV is using Epyc in their DGX and HGX servers.
 
I believe the reason it was chosen is that Intel didn't support PCIe 4.0. That's the only reason NV is using Epyc in their DGX and HGX servers.
Not really; power efficiency in MW per exaflop was one of their goals.
“Aurora” is burning twice as much electricity to deliver slightly less performance than “Frontier”. And at $1 per watt per year to keep a supercomputer running, it could cost close to $60 million a year to power “Aurora”, which adds up to close to $240 million over four years. At only 29 megawatts, you are talking only $116 million for “Frontier”.
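The arithmetic behind that quote is easy to reproduce, assuming its $1 per watt per year rule of thumb (the ~60 MW figure for “Aurora” is implied by the $60 million a year number, not stated directly):

# Rough reconstruction of the quoted power-cost math.
# Assumes $1 per watt per year, per the quote; megawatt figures as quoted/implied.
DOLLARS_PER_WATT_YEAR = 1.0
YEARS = 4

aurora_watts = 60e6    # ~60 MW, implied by "$60 million a year"
frontier_watts = 29e6  # 29 MW, per the quote

aurora_cost = aurora_watts * DOLLARS_PER_WATT_YEAR * YEARS      # ~$240M over four years
frontier_cost = frontier_watts * DOLLARS_PER_WATT_YEAR * YEARS  # ~$116M over four years
print(f"Aurora:   ${aurora_cost / 1e6:.0f}M")
print(f"Frontier: ${frontier_cost / 1e6:.0f}M")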
 