As Intel revealed back in June 2023, the Falcon Shores chips will take the massively parallel Ethernet fabric and matrix math units of the Gaudi line and merge them with the Xe GPU engines created for Ponte Vecchio. This way, Falcon Shores can have 64-bit floating point processing and matrix math processing at the same time. Ponte Vecchio does not have 64-bit matrix processing, just 64-bit vector processing, which was done intentionally to meet the FP64 needs of Argonne. That’s great, but it means Ponte Vecchio is not necessarily a good fit for AI workloads, which would limit its appeal. Hence the merger of the Gaudi and Xe compute units into a single Falcon Shores engine.
We don’t know much about Falcon Shores, but we do know that it will weigh in at 1,500 watts, which is 25 percent more power consumption and heat dissipation than the top-end “Blackwell” B200 GPU, which is expected to be shipping in volume early next year, is rated at 1,200 watts, and delivers 20 petaflops of compute at FP4 precision. With 25 percent more electricity burned, Falcon Shores had better have at least 25 percent more performance than Blackwell at the same floating point precision level at roughly the same chip manufacturing process level. Better still, Intel had better be using its Intel 18A manufacturing process, expected to be in production in 2025, to make Falcon Shores, and it had better have even more floating point oomph than that. And Falcon Shores 2 had better be on the even smaller Intel 14A process, which is expected in 2026.
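To make the power arithmetic above concrete, here is a minimal Python sketch that works out the break-even point. The wattages and the B200 FP4 figure are the ones quoted in this article. The function names are illustrative, and the break-even throughput is a threshold we are deriving for comparison, not an Intel specification. It also assumes the two chips would be compared at FP4, which Intel has not confirmed.

```python
# Figures quoted in the article.
B200_WATTS = 1200            # rated power of the Blackwell B200
B200_FP4_PETAFLOPS = 20      # B200 compute at FP4 precision
FALCON_SHORES_WATTS = 1500   # stated power for Falcon Shores


def power_ratio(new_watts, base_watts):
    """Return how many times more power the new chip draws than the base chip."""
    return new_watts / base_watts


def break_even_petaflops(base_petaflops, base_watts, new_watts):
    """Return the throughput the new chip needs to match the base chip's
    performance per watt (a hypothetical threshold, not a vendor spec)."""
    return base_petaflops * new_watts / base_watts


ratio = power_ratio(FALCON_SHORES_WATTS, B200_WATTS)
target = break_even_petaflops(
    B200_FP4_PETAFLOPS, B200_WATTS, FALCON_SHORES_WATTS
)

print(f"Power increase over B200: {(ratio - 1) * 100:.0f} percent")
print(f"Break-even FP4 throughput: {target:.0f} petaflops")
```

Under these assumptions, anything below that break-even throughput at the same precision would leave Falcon Shores behind the B200 on performance per watt.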
It is past time for Intel to stop screwing around in both its foundry and chip design businesses. TSMC has a ruthless drumbeat of innovation, and Nvidia’s GPU roadmap is relentless. There is an HBM memory bump and possibly a GPU compute bump coming with “Blackwell Ultra” in 2025, and the “Rubin” GPU comes in 2026 with the “Rubin Ultra” follow-on in 2027.