Intel ARC GPUs, Xe Architecture for dGPUs [2018-2022]

Sounds like Intel took the easy and cheap way out. A multi-die GPU means its consumer variant will have terrible gaming performance, especially since it requires very high skill in writing drivers.

The multi-die strategy will meet much more success in compute or AI, though at 500 watts it seems their architecture has terrible efficiency.

500 watts

 
If it's power efficient in compute, I'm sure there is a market for 500 watt boards.
If it scales linearly, a single-chip version should be around 125 watts, well within a reasonable range.

I'm still hopeful; Intel managed to bring new features in the DX11 era, when GPU features had stagnated.
Perhaps they are bringing something exciting now that we finally have some competitors adding new features to GPUs.
 
If it scales linearly, a single-chip version should be around 125 watts, well within a reasonable range.
This multi-die strategy means they took the coward's way out: instead of building a proper GPU, they will glue small GPUs together into a large one, a very crude method, which probably explains the terrible power draw.
 
500W!? Let's hope this is not the promised "7nm Data center GPU".
Xe-LP: Integrated in Tiger Lake; DG1 uses this but shouldn't ever release as a standalone product (maybe in laptops, if it can somehow work together with the iGPU)
Xe-HP: One chip, but scalable; products with 1, 2 and 4 chiplets, the last of which is for server hardware (as seen from the 400/500W TDP and 48V input, as opposed to the normal 12V)
Xe-HPC: Ponte Vecchio. Intel's slides are a tad confusing, but apparently one Ponte Vecchio is 2x8 chiplets (different from the Xe-HP tile-chiplets), and whatever the whole shebang was called uses up to 6 Ponte Vecchios
 
So a 2-chiplet GPU could become the first gaming product?
Not sure if that's more worrying or exciting. But they have balls :D

I wonder if treating chiplets as a single GPU is the way to go. Someone said AMD thinks GPU chiplets would only make sense if this works.
To me it makes more sense to treat them individually, which requires doing things very differently than before.
But it's hard to get a real win out of a dual-core CPU vs. a single core, because going parallel mostly means doing more work for the same result.
Considering this, I would speculate chiplets will take off only with 4 of them or more.
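To put rough numbers on that intuition, here's a toy sketch (my own made-up model and overhead figure, not anything from Intel or AMD): if splitting work across chiplets adds a roughly fixed fraction of redundant work, two chiplets buy you little over one, while four or more start to pull clearly ahead.

```cpp
// Toy scaling model with an assumed, made-up 40% parallelization overhead.
#include <cstdio>

int main() {
    const double overhead = 0.4;          // assumed fraction of duplicated work
    const int counts[] = {1, 2, 4, 8};
    for (int chips : counts) {
        // A single chip pays no overhead; any split pays the full fraction.
        double speedup = (chips == 1) ? 1.0 : chips / (1.0 + overhead);
        std::printf("%d chiplet(s): ~%.2fx of a single chip\n", chips, speedup);
    }
    return 0;
}
```
With these made-up numbers, 2 chiplets give ~1.4x while 4 give ~2.9x, which is roughly the "only worth it at 4 or more" point.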

But I could be totally wrong. Very interesting, and impressive even if the first product doesn't end up totally awesome... :)
 
I can't see Intel having nearly enough expertise with game drivers to pull off a multi-chip design that doesn't require each game developer to design for mGPU in their engine, when neither Nvidia nor AMD has gone this route yet.

Data center? Sure, the software ecosystems are usually designed to support multiple GPUs.
 
I can't see Intel having nearly enough expertise with game drivers to pull off a multi-chip design that doesn't require each game developer to design for mGPU in their engine, when neither Nvidia nor AMD has gone this route yet.
Maybe that's more of a short-term problem, while in the long run, being first with chiplets should help them catch up more quickly.

The question also is: how will chiplets differ from mGPU? Are there more options than alternating frames or tiling the screen?

BTW, if we think about what could be improved in rendering frames, it seems obvious to share the RT BVH with rasterization. It could be used for efficient binning of triangles into small tile frustums, and a coarse front-to-back draw order can improve HW occlusion culling a lot (rough sketch below).
So if we get multi-view, small-viewport rendering, we could finally do efficient screen updates, e.g. updating small quads covering 20% of the screen while reprojecting the rest, similar to how some video compression works.
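Purely to illustrate the binning idea (a made-up sketch, not anything Intel has shown; the Node/Frustum layouts and names are my own assumptions), each tile, GPU, or chiplet could walk a shared BVH and collect only the triangles whose nodes overlap its tile frustum:

```cpp
// Illustrative only: reusing a ray-tracing-style BVH to bin triangles into a
// screen tile for rasterization. Data layout and names are assumptions.
#include <cstdint>
#include <vector>

struct AABB    { float min[3], max[3]; };
struct Node    { AABB bounds; int32_t left, right, firstTri, triCount; }; // leaf if triCount > 0
struct Frustum { float planes[6][4]; };   // inward-facing planes of one screen tile

// Conservative plane/box test: reject only if the box is fully outside a plane.
bool intersects(const Frustum& f, const AABB& b)
{
    for (int p = 0; p < 6; ++p) {
        const float* pl = f.planes[p];
        // Box corner farthest along the plane normal.
        float x = pl[0] >= 0 ? b.max[0] : b.min[0];
        float y = pl[1] >= 0 ? b.max[1] : b.min[1];
        float z = pl[2] >= 0 ? b.max[2] : b.min[2];
        if (pl[0] * x + pl[1] * y + pl[2] * z + pl[3] < 0)
            return false;
    }
    return true;
}

// Collect triangles touching the tile. Visiting the nearer child first would
// also give the coarse front-to-back order mentioned above.
void binTile(const std::vector<Node>& nodes, int32_t idx,
             const Frustum& tile, std::vector<int32_t>& outTris)
{
    const Node& n = nodes[idx];
    if (!intersects(tile, n.bounds))
        return;                               // prune the whole subtree
    if (n.triCount > 0) {                     // leaf: emit its triangle range
        for (int32_t i = 0; i < n.triCount; ++i)
            outTris.push_back(n.firstTri + i);
        return;
    }
    binTile(nodes, n.left,  tile, outTris);
    binTile(nodes, n.right, tile, outTris);
}
```
The point is just that each tile's work is independent, which is why this style of binning maps onto chiplets more naturally than whole-screen rasterization.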

It's interesting to notice that chiplet GPUs do not cause any problems with such 'modern rendering' ideas, while they certainly do for 'traditional whole screen rasterization'.
 
At least the GPUs should have pretty nice bandwidth between each other, if the plan is to use Foveros packaging.
 
How about this idea: the 2nd chip(let) doesn't need to be a full-blown GPU, so how about a chip(let) that just contains, for example, 2,000 CUs?
Need a higher-performing card? Add another CU chip.
 
This multi-die strategy means they took the coward's way out: instead of building a proper GPU, they will glue small GPUs together into a large one, a very crude method, which probably explains the terrible power draw.
Isn't this where Nvidia is heading as well?
At least they did do research on the topic.

Chiplets work nicely on CPUs, so I'm sure everyone is looking into them for GPUs.
How about this idea: the 2nd chip(let) doesn't need to be a full-blown GPU, so how about a chip(let) that just contains, for example, 2,000 CUs?
Need a higher-performing card? Add another CU chip.
Yup, I doubt they put full GPUs as chiplets.

Perhaps go for full Xenos and put ROPs on another chiplet. ;)
And yes, this will not happen, although I do wonder how they will handle memory on the thing.
 
Soo, anyone wanna take a shot at die size? :runaway:
(also which Xe GPU is it?)

It's lacking detail, at an angle and the reflection is pretty intense, but I think I see a few features.
There are horizontal bright bands at regular intervals, with the most obvious being right next to the thumb. I think the next repeats of this pattern are roughly in the reflection of his glasses in the upward direction, and above the base of the thumb in the downward direction.
Repeating this pattern and trying to connect possible regions of similar appearance yields maybe two more such divisions in the upwards direction.

There are vertical stripes that show up more like gaps in the pattern that might match cut lines. I think the first is about a thumb-width to the right of the edge of the wafer, and attempts at connecting the dots between regions on the left half of the wafer show 6 or 7 such lines on that half.
Whatever's left of the leftmost division or above the topmost line would probably be waste.
I think it's about 6.5-7 such columns for the left half of the wafer, and 4-4.5 rows in the top half.

I'm presuming the radius of the wafer is 150mm, so this sort of squinting + MS paint gives ~22mm x ~33mm rectangles with massive error bars.
That's a big die, if those lines were the last word. There is a periodicity to those bright horizontal lines, so something is repeating at least that often, but they might not be a die boundary or not the only one in that direction.
Perhaps this can be cut more in the vertical direction, as there are more horizontal regions whose significance I cannot tease out, versus the more distinct vertical lines. If, for example, the vertical regions are two dies whose other cut is indistinct, it would be a more modest though still substantial die.
It's possible the visual fidelity is so poor that compression artifacts or my eye are introducing lines where there aren't any or hiding others, however.
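For what it's worth, here's the squint-math written out; every input is my own eyeballed guess from the photo, so the output inherits the same massive error bars.

```cpp
// Die-size guess from counted columns/rows across half of an assumed 300mm wafer.
#include <cstdio>

int main() {
    const double half_wafer_mm = 150.0;            // assumed wafer radius
    const double cols_lo = 6.5, cols_hi = 7.0;     // columns counted in the left half
    const double rows_lo = 4.0, rows_hi = 4.5;     // rows counted in the top half

    // More columns/rows per half-wafer means a smaller die.
    std::printf("die width : %.1f - %.1f mm\n", half_wafer_mm / cols_hi, half_wafer_mm / cols_lo);
    std::printf("die height: %.1f - %.1f mm\n", half_wafer_mm / rows_hi, half_wafer_mm / rows_lo);
    return 0;
}
```
That lands around 21-23 mm by 33-38 mm, i.e. the ~22mm x ~33mm rectangles above at the small end.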

The multi-die strategy will meet much more success in compute or AI, though at 500 watts it seems their architecture has terrible efficiency.
The story's table indicates the 400/500W board is a 4-tile device. On a per-tile basis that doesn't seem bad, depending on how much is in a tile.
That power requirement and 48V delivery do rule out consumer products. Perhaps an HPC or datacenter custom module?
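Just to spell out the per-tile ballpark, assuming the board power divides evenly across the 4 tiles (which it may well not, given memory, VRMs and the rest of the board):

```cpp
// Naive per-tile power split for the 400/500W, 4-tile board.
#include <cstdio>

int main() {
    const int tiles = 4;
    const int board_watts[] = {400, 500};
    for (int w : board_watts)
        std::printf("%d W board -> ~%d W per tile\n", w, w / tiles);
    return 0;
}
```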
 
It's lacking detail, at an angle and the reflection is pretty intense, but I think I see a few features.
There are horizontal bright bands at regular intervals, with the most obvious being right next to the thumb. I think the next repeats of this pattern are roughly in the reflection of his glasses in the upward direction, and above the base of the thumb in the downward direction.
Repeating this pattern and trying to connect possible regions of similar appearance yields maybe two more such divisions in the upwards direction.

I can clearly see an advanced version of the Primitive Shader unit with incredible performance, and a High Bandwidth Cache Controller there. That's Raja's legacy. Just wait for unicorn drivers.
 
The Voodoo 5 5500 came with its own power brick. A supercomputer set up with these cards will need its own dedicated power plant...

Seriously though, to play devil's advocate a little: 500 watts for... what? Do we have any performance figures against which we can extrapolate performance/watt, aside from some rather vague analysis of the DG1 SDV card shown at CES and assumptions?
 
The Voodoo 5 5500 came with its own power brick. A supercomputer set up with these cards will need its own dedicated power plant...
No it didn't, it used standard 4-pin Molex. The Voodoo 5 6000 (the one with 4x VSA-100) would have come with a power brick, but most if not all of those cards have since been modded to take power from 4-pin Molex instead. It wasn't really THAT power hungry; the PSUs were just crappy at the time, so they apparently couldn't trust getting enough 12V out of them, and the Voodoo Volts power brick was only capable of delivering a measly 55.2 watts (12V, 4.6A).
Seriously though, to play devil's advocate a little: 500 watts for... what? Do we have any performance figures against which we can extrapolate performance/watt, aside from some rather vague analysis of the DG1 SDV card shown at CES and assumptions?
500 watts is for datacenters and the like, not consumers. DG1 can't be used for any analysis, as it's a different chip on a different microarchitecture (Xe-LP vs Xe-HP).
 
Dang that chip looks extremely rectangular... I would guess for extra pad area. Do we have any confirmation on what type of memory they are using?

My outline isn't accurate to the edge of the die, but it's a very rough outline of what I think the die is. Not sure if that square would be in the center or at the edge of the die.

(attached images: intel1.jpg, intel2.jpg)
 