Bondrewd
> Aldebaran is coming up as a two die
Yes.

> 2x112 CU GPU
It depends.

> with uncertain amount of HBM2e
128 gigs or so.
On newer heterogeneous systems from AMD with GPU nodes connected via
xGMI links to the CPUs, the GPU dies are interfaced with HBM2 memory.
This patchset applies on top of the following series by Yazen Ghannam
AMD MCA Address Translation Updates
[https://patchwork.kernel.org/project/linux-edac/list/?series=505989]
This patchset does the following
1. Add support for northbridges on Aldebaran
* x86/amd_nb: Add Aldebaran device to PCI IDs
* x86/amd_nb: Add support for northbridges on Aldebaran
2. Add HBM memory type in EDAC
* EDAC/mc: Add new HBM2 memory type
3. Modifies the amd64_edac module to
a. Handle the UMCs on the noncpu nodes
* EDAC/mce_amd: extract node id from InstanceHi in IPID
b. Enumerate HBM memory and add address translation
* EDAC/amd64: Enumerate memory on noncpu nodes
c. Address translation on Data Fabric version 3.5.
* EDAC/amd64: Add address translation support for DF3.5
* EDAC/amd64: Add fixed UMC to CS mapping
Aldebaran has 2 dies (enumerated as MCx, x = 8~15).
Each die has 4 UMCs (enumerated as csrowx, x = 0~3).
Each die has 2 root ports, with 4 misc ports for each root.
Each UMC manages 8 UMC channels, each connected to 2GB of HBM memory.
Muralidhara M K (3):
x86/amd_nb: Add Aldebaran device to PCI IDs
x86/amd_nb: Add support for northbridges on Aldebaran
EDAC/amd64: Add address translation support for DF3.5
Naveen Krishna Chatradhi (3):
EDAC/mc: Add new HBM2 memory type
EDAC/mce_amd: extract node id from InstanceHi in IPID
EDAC/amd64: Enumerate memory on noncpu nodes
Yazen Ghannam (1):
EDAC/amd64: Add fixed UMC to CS mapping
So... yes, @Bondrewd was right: 128GB HBM2(e) = 2 dies × 4 UMCs per die × 8 channels per UMC × 2GB per channel.
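The capacity chain multiplies out exactly; here is a quick sanity check using the thread's figures (rumored numbers, not confirmed specs):

```python
# Sketch: sanity-check the reported Aldebaran/MI200 HBM2(e) capacity chain.
# All figures come from the thread above, not from any official spec.
dies = 2                 # Aldebaran is a two-die package
umcs_per_die = 4         # each die exposes 4 UMCs (csrow0..3 in amd64_edac terms)
channels_per_umc = 8     # each UMC manages 8 HBM channels
gb_per_channel = 2       # 2 GB of HBM behind each channel

total_gb = dies * umcs_per_die * channels_per_umc * gb_per_channel
print(total_gb)  # 128
```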
> So they never disable any stack, out of the 8 stacks in the PCB?
Nope.

> Nvidia always disables one stack out of 6, for every A100 GPU
Volumes.
> Perhaps what @Bondrewd meant with the "more or less" part is that some SKUs may come with one or more stacks disabled.
Tbh it's a question of AMD selling MI200 with half-sized stacks.
> We're looking at 256 CUs * 64 ALUs each * 2 (multiply + add) * 2 (FP32 RPM) = ~100 TFLOPs FP32, or 200 TFLOPs FP16.
A bit less but yea.
They also say doubled BF16 rates to match FP16, which is nice (roughly 4x faster in total with the doubled CUs). However, there's no faster int8/int4 and no sparsity, which is interesting; I presume both will stick with FP16 rates like before. If they're going all-in on FP64, I guess it makes sense: HPC-focused, not AI/ML-focused.
One can also count the FP64 TF number per board for lulz.
> MI200 @ ~1.8GHz can put out 56TF DPFP
A bit less than that, but yes.
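The rate discussion above can be sketched as back-of-the-envelope math. The CU count and rate multipliers are the thread's numbers; the clock is my assumption (the thread floats ~1.5-1.8 GHz), so treat the outputs as illustrative, not as specs:

```python
# Back-of-the-envelope peak-throughput sketch for the rumored 2x128 CU
# Aldebaran/MI200. Thread figures; the clock is an assumption, not a spec.
cus = 256                    # 2 dies x 128 CUs (before any salvage)
alus_per_cu = 64
flops_per_alu_per_clk = 2    # a fused multiply-add counts as 2 FLOPs
clock_ghz = 1.5              # assumed clock

base = cus * alus_per_cu * flops_per_alu_per_clk * clock_ghz * 1e9
fp64 = base                  # assuming full-rate FP64, HPC-first design
fp32 = base * 2              # 2x packed FP32 ("RPM") per the thread
fp16 = base * 4              # FP16 (and doubled BF16) at 4x the base rate

print(f"FP64 ~{fp64/1e12:.0f} TF, FP32 ~{fp32/1e12:.0f} TF, FP16 ~{fp16/1e12:.0f} TF")
```

At the assumed ~1.5 GHz this lands near the thread's "a bit less than 100 TF FP32 / 200 TF FP16" and "a bit less than 56 TF DPFP" remarks; a higher clock shifts all three numbers up proportionally.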
Sounds formidable.
> Any info about Aldebaran specs?
Read ze thread.
> Even traditional FP64 markets like weather and fluid simulations are transitioning to AI/ML
Not really.
> Obviously, the key to this market opportunity is that CDNA2 doesn't need any software or ecosystem investment
Of course it does.
> Bluefield3
Cringe again.
> Of course it does.
You have clearly no idea of what you are talking about.
Thank you Intel!
Cringe again.
Nothing outside of hyperscale gives a shit about SmartNICs, and hyperscale builds their own!
See EFA.
> No, Intel AI/ML stack won't benefit AMD
It's SYCL.
> At least not fully, as their AI/ML hardware is vastly different
They're just barely programmable matrix engines; all the relevant stuff is abstracted away.
> The only (little) savior is the government investment in ROCm as part of the FRONTIER deal.
Oh jeez, Codeplay is literally writing a Level Zero backend for gfx9/you name it.
> You should open your eyes and see how Nvidia operates when launching a new AI accelerator
Tell me something I don't know.
> They don't sell a GPU; they sell exclusively DGX systems for the first 6 months and then their HGX reference platform for some months after
The opposite.
> In other words, it means that all customers will have to buy Grace-Hopper-BF3 systems to get their hands on the next gen AI/ML performance
?
> They are already porting their hypervisors to BlueField3
Irrelevant.
> It's no more GPU vs GPU or CPU vs CPU.
OH YES IT IS.
> Grace-Hopper-BF3 + Nvidia software ecosystem (Hypervisor + CUDA) has no equivalent and it's a huge selling point
Oh jeez, that's peak LARP.