Intel ARC GPUs, Xe Architecture for dGPUs [2018-2022]

Did you look at the tweets at all, or just go from the text? Look at the image; it's clearly one of the two recent games. It looks like Shadow.

Seen the image, yes, but I can't recognise the game from that screengrab alone. You might have a point, though, given that I haven't finished the modern Tomb Raider games yet. The one I got furthest in was Rise of the Tomb Raider when it was on PC Game Pass, but I didn't complete it.
 

Wow, Intel has some really strong opinions on RT architecture, and they mostly seem to be saying that AMD is doing it all wrong.
  1. Don’t do BVH traversal on vector units
  2. DXR 1.0 style PSOs are better than DXR 1.1 ubershaders doing inline RT
While Intel’s approach is very similar to Nvidia’s, they’re doing some things differently that could give them a big advantage. SIMD execution is only 8-wide when running RT, meaning fewer and shorter stalls compared to Nvidia’s 32-wide warps. Intel is also sorting rays after each bounce, which would further reduce losses due to divergence.

Alchemist could make things really interesting!
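To put rough numbers on the width argument, here's a toy Monte Carlo model (my own sketch under simplified assumptions, not anything from Intel's disclosure): each ray independently takes one of two shading paths, and a wave pays for every path that at least one of its lanes takes.

```cpp
// Toy model of SIMD shading divergence (illustration only, not vendor code).
// Assumption: a wave that contains rays on both paths must issue both paths.
#include <cstdio>
#include <random>

double wastedFraction(int waveWidth, double rarePathProb, int numRays) {
    std::mt19937 rng(42);
    std::bernoulli_distribution takesRarePath(rarePathProb);
    long long usefulLanes = 0, paidLanes = 0;
    for (int base = 0; base + waveWidth <= numRays; base += waveWidth) {
        int a = 0;
        for (int lane = 0; lane < waveWidth; ++lane)
            a += takesRarePath(rng) ? 1 : 0;
        int b = waveWidth - a;
        int pathsTaken = (a > 0) + (b > 0);   // mixed wave executes both paths
        usefulLanes += waveWidth;             // each lane does one path of real work
        paidLanes   += waveWidth * pathsTaken; // but lanes are issued for every path
    }
    return 1.0 - double(usefulLanes) / double(paidLanes);
}

int main() {
    // With 5% of rays on a rare path, narrow waves stay coherent far more often:
    // the 8-wide config wastes roughly half the issued lanes of the 32-wide one.
    printf("8-wide  wasted lanes: %.1f%%\n", 100 * wastedFraction(8,  0.05, 1 << 20));
    printf("32-wide wasted lanes: %.1f%%\n", 100 * wastedFraction(32, 0.05, 1 << 20));
}
```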
 
  • Don’t do BVH traversal on vector units
  • DXR 1.0 style PSOs are better than DXR 1.1 ubershaders doing inline RT
Everyone and their mother knows AMD's approach is minimalistic at best. And their DXR 1.1 approach is used only to do very modest RT effects that rely on a lot of screen-space methods.
 
Having hardware coherency sorting means they're at "level 4" RT hardware straight out of the gate? Per Imgtech's classification (roughly: level 2 is ray/box and ray/triangle testing in hardware, level 3 adds BVH processing in hardware, level 4 adds coherency sorting), that puts them ahead of level 2 for AMD and level 3 for Nvidia. It'll be very interesting to see how well it benchmarks, even if games are still using 32-wide waves instead of their recommended 8. Going for the widest matrix math units and more advanced RT hardware (on paper at least) versus the competitors is one way to introduce yourself to the market. Could be exciting times in the next 3-5 years for GPUs.
 
Everyone and their mother knows AMD's approach is minimalistic at best. And their DXR 1.1 approach is used only to do very modest RT effects that rely on a lot of screen-space methods.

DXR 1.1 is used in Minecraft and Metro Exodus EE, and Nvidia doesn't have any problem with DXR 1.1. Interesting that Intel finds DXR 1.0 more useful.
 
Having hardware coherency sorting means they're at "level 4" RT hardware straight out of the gate?
No. Intel isn't saying anything about ray sorting. It's shader execution grouping.

Per Imgtech's definition, vs level 2 for AMD and 3 for Nvidia
Nvidia does coherence gathering in their TTU at each traversal step, so it's actually level 4.
https://www.freepatentsonline.com/11157414.html
Imagination didn't know that, because the patent hadn't been published yet at the time.
 
No. Intel isn't saying anything about ray sorting. It's shader execution grouping.

That's true. Ray sorting improves shading efficiency and has the added benefit of speeding up ray traversal too. Intel seems to be more concerned about shading divergence than traversal though.
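To make the sorting benefit concrete, here's a toy C++ sketch (my own illustration with made-up numbers, not Intel's implementation): binning rays by the shader they'll invoke before packing them into waves makes each wave coherent again after a bounce has scrambled them.

```cpp
// Toy sketch of per-bounce ray sorting for shading coherence (illustration only).
#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>

struct Ray { int shaderId; /* origin, direction, payload ... */ };

// Average number of distinct shaders per wave; 1.0 means fully coherent waves.
double avgShadersPerWave(const std::vector<Ray>& rays, size_t waveWidth) {
    long long total = 0; int waves = 0;
    for (size_t base = 0; base + waveWidth <= rays.size(); base += waveWidth, ++waves) {
        std::vector<int> ids;
        for (size_t i = 0; i < waveWidth; ++i) ids.push_back(rays[base + i].shaderId);
        std::sort(ids.begin(), ids.end());
        total += std::unique(ids.begin(), ids.end()) - ids.begin();
    }
    return double(total) / waves;
}

int main() {
    std::mt19937 rng(1);
    std::uniform_int_distribution<int> shader(0, 15); // assume 16 hit shaders in scene
    std::vector<Ray> rays(1 << 16);
    for (auto& r : rays) r.shaderId = shader(rng);    // incoherent after a bounce

    printf("unsorted: %.2f shaders/wave\n", avgShadersPerWave(rays, 8));
    std::sort(rays.begin(), rays.end(),
              [](const Ray& a, const Ray& b) { return a.shaderId < b.shaderId; });
    printf("sorted:   %.2f shaders/wave\n", avgShadersPerWave(rays, 8));
}
```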
 
That's true. Ray sorting improves shading efficiency and has the added benefit of speeding up ray traversal too. Intel seems to be more concerned about shading divergence than traversal though.
They claim their traversal HW is essentially MIMD, so execution divergence is not an issue. BVH and triangle data divergence could still be a problem though.
 
Wow, Intel has some really strong opinions on RT architecture, and they mostly seem to be saying that AMD is doing it all wrong.
  1. Don’t do BVH traversal on vector units
  2. DXR 1.0 style PSOs are better than DXR 1.1 ubershaders doing inline RT
While Intel’s approach is very similar to Nvidia’s, they’re doing some things differently that could give them a big advantage. SIMD execution is only 8-wide when running RT, meaning fewer and shorter stalls compared to Nvidia’s 32-wide warps. Intel is also sorting rays after each bounce, which would further reduce losses due to divergence.

Alchemist could make things really interesting!

Intel HW has fixed-function dynamic dispatch via BTD (bindless thread dispatch), and their hardware doesn't differentiate between ray generation/any hit/closest hit/miss/intersection shaders, so virtually all ray tracing shaders are callable shaders, which might be a fairly unique setup specific to their HW ...

On AMD HW, all ray tracing shaders are just compute shaders, so it's not a coincidence that they recommend doing inline RT with compute shaders to get the highest performance. The danger behind ubershaders in general is the increased register pressure, so combining them with inline RT may negatively impact performance, but you trade divergent dispatch between distinct shaders for divergent execution within a shader ...
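As a loose C++ analogy for the two programming models (a hypothetical illustration, not driver or HLSL code): DXR 1.0-style dispatch jumps to a small specialized shader per hit, while a DXR 1.1-style ubershader carries every codepath in one body, so its worst-case register footprint applies even on cheap paths.

```cpp
// Two toy dispatch models (C++ analogy only; names are invented for illustration).
#include <cstdio>

struct HitInfo { int shaderId; float t; };

// --- DXR 1.0 analogy: separate specialized shaders, indirect dispatch ---
float shadeDiffuse(const HitInfo& h) { return h.t * 0.5f; }
float shadeGlass(const HitInfo& h)   { return h.t * 0.9f; }
using HitShader = float (*)(const HitInfo&);
static const HitShader shaderTable[] = { shadeDiffuse, shadeGlass };

// --- DXR 1.1 analogy: one ubershader containing every path ---
float shadeUber(const HitInfo& h) {
    // Carrying live state for ALL branches raises peak register pressure,
    // even though each ray only ever executes one of them.
    switch (h.shaderId) {
        case 0: return shadeDiffuse(h);
        case 1: return shadeGlass(h);
        default: return 0.0f;
    }
}

int main() {
    HitInfo h{1, 2.0f};
    printf("table dispatch: %f\n", shaderTable[h.shaderId](h)); // divergent *dispatch*
    printf("ubershader:     %f\n", shadeUber(h));               // divergent *execution*
}
```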
 
Intel HW has fixed-function dynamic dispatch via BTD (bindless thread dispatch), and their hardware doesn't differentiate between ray generation/any hit/closest hit/miss/intersection shaders, so virtually all ray tracing shaders are callable shaders, which might be a fairly unique setup specific to their HW ...

On AMD HW, all ray tracing shaders are just compute shaders, so it's not a coincidence that they recommend doing inline RT with compute shaders to get the highest performance. The danger behind ubershaders in general is the increased register pressure, so combining them with inline RT may negatively impact performance, but you trade divergent dispatch between distinct shaders for divergent execution within a shader ...

AMD is dealing with the double whammy of divergent traversal and divergent shading on the SIMDs. Their current approach is really only appropriate for very trivial RT scenarios. I can’t imagine they will stick to it for RDNA3.

Intel is doing MIMD traversal and is giving the driver/hardware an opportunity to mitigate divergent dispatch. There really isn’t anything AMD’s driver can do to help with divergence within the shader, and they're leaving it up to developers to figure out. It’s not quite a trade-off, as Intel’s approach seems to be objectively more robust for real RT use cases.
 
AMD is dealing with the double whammy of divergent traversal and divergent shading on the SIMDs. Their current approach is really only appropriate for very trivial RT scenarios. I can’t imagine they will stick to it for RDNA3.

Intel is doing MIMD traversal and is giving the driver/hardware an opportunity to mitigate divergent dispatch. There really isn’t anything AMD’s driver can do to help with divergence within the shader, and they're leaving it up to developers to figure out. It’s not quite a trade-off, as Intel’s approach seems to be objectively more robust for real RT use cases.

As for developers, I would not speculate too much on exactly what they're going to do. In the worst-case scenario for Intel HW, developers could very well choose to ignore their advice and hardcode a wave size of 32 with inline RT as the common case ...

A possible argument from AMD and others is that dynamic dispatch is the bigger evil compared to divergent SIMD lane execution. Merging similar shaders might very well be more practical than incurring the overhead of function calls. Divergent shading only starts becoming a bigger issue if you merge dissimilar shaders that share little to no code with each other ...
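A back-of-envelope cost model of that argument (all numbers are made up purely for illustration, not measurements):

```cpp
// Toy per-ray cost comparison: dispatch overhead vs divergence in a merged shader.
#include <cstdio>

int main() {
    const double callOverhead   = 40.0;   // hypothetical cycles per dynamic dispatch
    const double bodyCost       = 200.0;  // cycles for one shader's real work
    const double sharedFraction = 0.9;    // fraction of code two "similar" shaders share

    // Separate shaders: every ray pays the dispatch overhead once.
    double separate = callOverhead + bodyCost;

    // Merged similar shaders: no dispatch; only the unshared 10% can diverge,
    // and a mixed wave executes both variants of that 10%.
    double mergedSimilar = bodyCost * (sharedFraction + 2 * (1 - sharedFraction));

    // Merged dissimilar shaders: nothing is shared; a mixed wave runs both bodies.
    double mergedDissimilar = 2 * bodyCost;

    printf("separate:          %.0f cycles/ray\n", separate);         // 240
    printf("merged similar:    %.0f cycles/ray\n", mergedSimilar);    // 220
    printf("merged dissimilar: %.0f cycles/ray\n", mergedDissimilar); // 400
}
```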
 