Intel ARC GPUs, Xe Architecture for dGPUs [2018-2022]

I imagine it would compete in RT games, at least somewhat.
 
Intel needs to have a footprint in the market and, on top of that, build driver support with the gaming industry. Miners will probably give Intel that footprint, since mining doesn't need as much driver integration as games do. Gamers mostly expect Intel to be priced competitively so that they can then buy AMD or Nvidia cards for cheaper.
 

Nah, nowadays gamers will buy Intel GPUs if miners don't take them all and the cards have sufficient performance and stock.
 
Do you really expect a 256b GDDR6 GPU to not be used for mining?
I half-expect Intel to build in hard and soft hash-rate limits. Considering Xe's development started around the time the first crypto boom occurred, they'd be dumb not to think of some safeguards, IMO.
 
All Nvidia GPUs starting with Kepler have quad-rate Int8, I believe.
DP4A is not the same as supporting packed 8-bit integer math of some kind; it's a specific instruction that does 8x8->16-bit multiplications with saturating accumulation into a 32-bit integer (supporting various combinations of signed and unsigned operands). Presumably DP4A and DP2A (counted as a single instruction, not a quad/double-rate operation) run at the same rate as either 32-bit integer or float instructions.
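
To make the semantics concrete, here is a minimal CUDA sketch of my own (not from the thread): the __dp4a intrinsic on hardware that has it, and a hand-unpacked reference path showing the same four-lane dot product with 32-bit accumulation.

// Minimal DP4A illustration (my own sketch, values chosen arbitrarily).
// Build with: nvcc -arch=sm_61 dp4a_demo.cu
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dp4a_demo(int a, int b, int c, int *out)
{
#if __CUDA_ARCH__ >= 610
    // Intrinsic path: c + a.b0*b.b0 + a.b1*b.b1 + a.b2*b.b2 + a.b3*b.b3
    *out = __dp4a(a, b, c);
#else
    // Reference path: unpack four signed 8-bit lanes, multiply, accumulate into 32 bits
    int acc = c;
    for (int i = 0; i < 4; ++i) {
        int ai = (signed char)((a >> (8 * i)) & 0xFF);
        int bi = (signed char)((b >> (8 * i)) & 0xFF);
        acc += ai * bi;
    }
    *out = acc;
#endif
}

int main()
{
    int *d_out, h_out = 0;
    cudaMalloc(&d_out, sizeof(int));
    // a packs the bytes {4,3,2,1}, b packs {1,2,3,4}, accumulator starts at 10
    dp4a_demo<<<1, 1>>>(0x01020304, 0x04030201, 10, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    printf("%d\n", h_out);  // 10 + 4*1 + 3*2 + 2*3 + 1*4 = 30
    cudaFree(d_out);
    return 0;
}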

I can't find much reference to DP4A/DP2A being supported on anything other than Pascal GP102/104/106. On later chips it might run on tensor cores, it might run at float rate or at int32 rate, or it might be emulated via bit shifts/masks and int16 multiplications.


https://developer.nvidia.com/blog/mixed-precision-programming-cuda-8/
"For such applications, the latest Pascal GPUs (GP102, GP104, and GP106) introduce new 8-bit integer 4-element vector dot product (DP4A) and 16-bit 2-element vector dot product (DP2A) instructions."

https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html#int8
"GP104 provides specialized instructions for two-way and four-way integer dot products. These are well suited for accelerating Deep Learning inference workloads. The __dp4a intrinsic computes a dot product of four 8-bit integers with accumulation into a 32-bit integer. Similarly, __dp2a performs a two-element dot product between two 16-bit integers in one vector, and two 8-bit integers in another with accumulation into a 32-bit integer. Both instructions offer a throughput equal to that of FP32 arithmetic."
 
Apologies for the slight OT, but does Nvidia state anywhere which GPUs support DP4A at what rate?
AFAIK all GPUs starting with GP104 (so all desktop Pascal chips) support this on Nv's side.
The rates can differ, though, depending on how that support is implemented on chips with tensor cores.
Would be cool if someone wrote a throughput benchmark now that both DX and VK support the feature.
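
As a rough starting point, a CUDA-only sketch of such a benchmark (kernel names, sizes, and the timing approach are my own choices; it doesn't touch the DX/VK path): run a long dependent __dp4a chain and an equally long FP32 FMA chain across many threads and compare the times.

// DP4A vs FP32 FMA throughput sketch (my own, untested outline).
// Build with: nvcc -O3 -arch=sm_61 dp4a_bench.cu (or a newer arch)
#include <cstdio>
#include <cuda_runtime.h>

constexpr int ITERS = 1 << 20;  // dependent operations per thread

__global__ void dp4a_chain(int seed, int *out)
{
    int acc = seed + threadIdx.x;
    const int a = 0x01020304, b = 0x04030201;
    for (int i = 0; i < ITERS; ++i)
        acc = __dp4a(a, b, acc);           // one instruction = four int8 MACs
    out[blockIdx.x * blockDim.x + threadIdx.x] = acc;
}

__global__ void fma_chain(float seed, float *out)
{
    float acc = seed + threadIdx.x;
    const float a = 1.0000001f, b = 0.5f;
    for (int i = 0; i < ITERS; ++i)
        acc = fmaf(a, acc, b);             // one instruction = one FP32 FMA
    out[blockIdx.x * blockDim.x + threadIdx.x] = acc;
}

int main()
{
    const int blocks = 256, threads = 256, n = blocks * threads;
    int *di;   cudaMalloc(&di, n * sizeof(int));
    float *df; cudaMalloc(&df, n * sizeof(float));

    // Warm-up launches so clocks ramp before the timed runs
    dp4a_chain<<<blocks, threads>>>(1, di);
    fma_chain <<<blocks, threads>>>(1.f, df);
    cudaDeviceSynchronize();

    cudaEvent_t beg, end;
    cudaEventCreate(&beg);
    cudaEventCreate(&end);

    cudaEventRecord(beg);
    dp4a_chain<<<blocks, threads>>>(1, di);
    cudaEventRecord(end);
    cudaEventSynchronize(end);
    float ms_dp4a = 0.f;
    cudaEventElapsedTime(&ms_dp4a, beg, end);

    cudaEventRecord(beg);
    fma_chain<<<blocks, threads>>>(1.f, df);
    cudaEventRecord(end);
    cudaEventSynchronize(end);
    float ms_fma = 0.f;
    cudaEventElapsedTime(&ms_fma, beg, end);

    // With identical instruction counts per thread, the time ratio approximates
    // the relative DP4A vs FP32 throughput on the tested chip.
    printf("DP4A: %.2f ms, FP32 FMA: %.2f ms, ratio %.2f\n",
           ms_dp4a, ms_fma, ms_dp4a / ms_fma);

    cudaFree(di);
    cudaFree(df);
    return 0;
}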
 