So I've been trying to get a rough idea of how the SM 6.4 version of XeSS might run on Series S. And yes, my entire hypothesis is based on this one part of the DF video:
View attachment 6939
The RTX 3070 is shown upscaling from 1080p to 4K using XeSS in 3.8 ms. At default clocks, I think its specs are:
- 20.3 TF fp32
- 10.2 TOPs int32
- (afaik) 40.6 non-tensor TOPs int8 (I'm multiplying int32 x 4)
Xbox Series S has specs of:
- 4 TF fp32
- 4 TOPs int32
- 16 TOPs int8
So just going by the numbers (which is always risky!) you're looking at XSS being ~1/5th the speed at fp32, and ~2/5ths the speed at int32 and int8. So depending on the balance of instructions used, and assuming the 3.8 ms represents the entire cost of XeSS, you might expect the XSS to be 20% to 40% as fast in this scenario, upscaling from 1080p to 4K. That means taking 9.5 to 19 ms where the 3070 takes 3.8 ms.
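Here is a quick Python sketch of that arithmetic, for anyone who wants to plug in other GPUs. It assumes XeSS cost scales inversely with peak throughput for each op type, which is a big simplification. The function name `estimate_ms` and the dictionary layout are made up for illustration.

```python
# Peak throughput figures quoted above (TFLOPS / TOPS), not measurements.
RTX_3070 = {"fp32": 20.3, "int32": 10.2, "int8": 40.6}
SERIES_S = {"fp32": 4.0, "int32": 4.0, "int8": 16.0}

XESS_MS_3070 = 3.8  # 1080p -> 4K on the 3070, from the DF video


def estimate_ms(base_ms, base_specs, target_specs):
    """Scale a frame-time cost by the ratio of peak throughput per op type.

    Assumption: cost is inversely proportional to peak throughput. This
    ignores bandwidth, occupancy, and anything else that matters in practice.
    """
    return {op: base_ms * base_specs[op] / target_specs[op] for op in base_specs}


estimates = estimate_ms(XESS_MS_3070, RTX_3070, SERIES_S)
for op, ms in estimates.items():
    print(f"{op}: ~{ms:.1f} ms")
```

With the unrounded ratios this lands at roughly 9.6 to 19.3 ms, close to the 9.5 to 19 ms you get from the rounded 2/5 and 1/5 figures.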
If DP4a is leant on heavily in XeSS, then that might be expected to skew towards the better end of the range for Series S, as it is in less of a deficit for int ops than flops.
However, you probably wouldn't be trying to scale from 1080p to 4K on the Series S. Assuming that the XeSS workload scales broadly linearly with resolution, upscales from 540p to 1080p (~2.4 to 4.8 ms?) or even 720p to 1440p might be within reach.
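The resolution scaling can be sketched the same way. This assumes cost is linear in output pixel count and takes the 9.5 to 19 ms Series S range for 1080p to 4K as the starting point. The helper `scale_by_output` is hypothetical, just for illustration.

```python
# Output resolutions as (width, height).
RESOLUTIONS = {"1080p": (1920, 1080), "1440p": (2560, 1440), "4K": (3840, 2160)}

# Estimated Series S cost range for a 1080p -> 4K upscale, from the post.
SERIES_S_4K_MS = (9.5, 19.0)


def scale_by_output(ms_range, from_res, to_res):
    """Rescale a (low, high) cost range by the ratio of output pixel counts.

    Assumption: cost is linear in output pixels. Both targets below keep the
    same 2x-per-axis upscale ratio, so input pixels shrink by the same factor.
    """
    fw, fh = RESOLUTIONS[from_res]
    tw, th = RESOLUTIONS[to_res]
    factor = (tw * th) / (fw * fh)
    return tuple(ms * factor for ms in ms_range)


for target in ("1080p", "1440p"):
    low, high = scale_by_output(SERIES_S_4K_MS, "4K", target)
    print(f"-> {target}: ~{low:.1f} to {high:.1f} ms")
```

Under those assumptions a 1440p output comes out at roughly 4.2 to 8.4 ms, since it has 4/9 the pixels of 4K.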
All highly speculative of course, so take with a big ol' pinch of salt!