Half-precision redeemed?

Discussion in 'Console Technology' started by Ronaldo8, Feb 1, 2021.

  1. Ronaldo8

    Regular Newcomer

    Joined:
    May 18, 2020
    Messages:
    270
    Likes Received:
    320
    We already know about MS adding support for 16-bit and 8-bit operations in their silicon, ostensibly to accelerate inference workloads. There is now an interesting patent filed by Xbox ATG member Ivan Nevraev titled "Acceleration of shader programs by compiler precision selection" (https://uspto.report/patent/app/20200380754).
    As per the patent claims, a "precision lowering manager" is tasked with analysing shader code, evaluating the impact of the loss of precision, and performing the conversion to lower precision where possible... automatically. If true, dynamic and fine-tuned compilation of shader programs can be achieved. In short, this is a true game-changer.
     
    Moik, mr magoo and eastmen like this.
  2. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    12,986
    Likes Received:
    15,717
    Location:
    The North
    We typically do this already for NN models. When you export the model, you can run (or some libraries run) optimizers that reduce the precision at different nodes to improve performance where the loss is acceptable. Mixed-precision networks are currently becoming the new standard, as the trade-off of precision (very little loss) for performance (massive gains) makes it desirable.

    I don't think this is anything new as a technology, as it's been the norm for some time now, but I suppose it may be new to shader programs, and only because this generation of GPUs supports mixed-precision math is it now a valid thing to do.
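    The fp32-vs-fp16 trade-off described above is easy to demonstrate outside any particular framework. A minimal NumPy sketch (my own illustration, not taken from any specific optimizer library):

```python
import numpy as np

# Stand-in for one layer of a network: weights and an input vector.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
x = rng.standard_normal(256).astype(np.float32)

# Reference result in float32, then the same product with everything
# stored and computed in float16.
y32 = w @ x
y16 = (w.astype(np.float16) @ x.astype(np.float16)).astype(np.float32)

# The relative error over the whole output vector is tiny compared
# with the savings: half the storage/bandwidth, and doubled
# throughput on hardware with packed fp16 math.
rel_err = np.linalg.norm(y32 - y16) / np.linalg.norm(y32)
print(f"relative error: {rel_err:.2e}")
```

    This is exactly the kind of per-node measurement a mixed-precision optimizer automates: keep fp16 where the measured error is negligible, fall back to fp32 where it is not.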
     
    Moik, pharma, PSman1700 and 1 other person like this.
  3. Frenetic Pony

    Regular Newcomer

    Joined:
    Nov 12, 2011
    Messages:
    679
    Likes Received:
    363
    It's already been used for some things in games; technically the memory format is supported even if half-precision ops don't run any faster on older hardware, so it's still useful in some specific cases where memory is really tight.

    But with the new consoles I expect it to become standard use for anything that fits within the range and where precision issues don't come up. Plenty of screen-space tricks are prime candidates for this. Less memory pressure and twice the execution speed are great, and while MS scotched the PS4 Pro's support back in the day, that was really just FUD plus the fact that devs weren't going to go particularly out of their way to optimize one part of their engine for one sub-type of one console.
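    The memory-pressure point holds even without faster fp16 ALUs: halving the storage format halves a buffer's footprint and bandwidth. A quick NumPy illustration (the buffer dimensions are my own example numbers):

```python
import numpy as np

# A hypothetical 1080p RGBA screen-space buffer (e.g. for a blur pass).
h, w, c = 1080, 1920, 4

full = np.zeros((h, w, c), dtype=np.float32)  # 4 bytes per channel
half = np.zeros((h, w, c), dtype=np.float16)  # 2 bytes per channel

# Half precision cuts the footprint in two even on hardware where
# fp16 math runs no faster than fp32.
print(f"fp32: {full.nbytes / 2**20:.1f} MiB, fp16: {half.nbytes / 2**20:.1f} MiB")
```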
     
    #3 Frenetic Pony, Feb 2, 2021
    Last edited: Feb 2, 2021
    Moik and iroboto like this.
  4. Ronaldo8

    Regular Newcomer

    Joined:
    May 18, 2020
    Messages:
    270
    Likes Received:
    320

    The non-console-war and obviously true reason for rapid packed math being initially shrugged off is that its use requires a case-by-case evaluation of the gain in performance against the corresponding loss in precision. There are thousands of shader programs in your average AAA game. Ain't anybody got time to optimise at this level of granularity on a large scale. What was required was a way to automatically identify code that can be executed at half precision with an acceptable loss in quality (using an iterative process) and provide the appropriately edited code to the compiler with as little time/effort investment from the developer as possible... which happens to be exactly what the patent is about.
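    A toy version of that iterative identify-and-lower loop might look like the following. This is purely my own sketch of the idea, not the patent's actual mechanism; the shader stand-in and the error threshold are invented for illustration:

```python
import numpy as np

def shader_f32(uv):
    # Stand-in for a compiled shader: a simple tone-mapping-style curve.
    return uv / (uv + np.float32(0.187))

def shader_f16(uv):
    # The same program with all intermediates forced to half precision.
    uv16 = uv.astype(np.float16)
    return (uv16 / (uv16 + np.float16(0.187))).astype(np.float32)

def precision_lowering_pass(inputs, threshold=5e-3):
    """Emit the fp16 variant only if its error stays under threshold."""
    err = np.max(np.abs(shader_f32(inputs) - shader_f16(inputs)))
    return ("fp16" if err < threshold else "fp32"), err

# Evaluate over a representative input range, as an automatic tool would,
# instead of a human eyeballing thousands of shaders one by one.
samples = np.linspace(0.0, 1.0, 1024, dtype=np.float32)
choice, err = precision_lowering_pass(samples)
print(choice, err)
```

    A real compiler pass would do this per expression rather than per whole program, but the evaluate-compare-decide loop is the same shape.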
     
    eastmen, Moik, pharma and 1 other person like this.
  5. Frenetic Pony

    Regular Newcomer

    Joined:
    Nov 12, 2011
    Messages:
    679
    Likes Received:
    363
    I mean, there's a lot more low-level optimization in high-end games than this gives credit for. Building it into the compiler isn't necessary for such optimization to happen. As stated, it literally already does, even when the benefit is tiny.

    But an automatic optimization is neat nonetheless.
     
  6. Ronaldo8

    Regular Newcomer

    Joined:
    May 18, 2020
    Messages:
    270
    Likes Received:
    320
    This topic is not about the myriad other optimisations that may arise in game development; it is about adaptive compilation via half precision. Computation at lower precision has enabled huge gains in performance for deep neural networks and is now possible for shaders.
     
  7. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    12,986
    Likes Received:
    15,717
    Location:
    The North
    It's a nice touch, but right now it's just a patent. It's unclear whether MS has provided the tools or methods that would optimize shaders in this way yet.

    Without word of it being implemented or of what gains are expected, it's really just a patent and nothing fruitful yet. It aligns well with what's happening in the industry, however.
     
    BRiT likes this.
  8. Ronaldo8

    Regular Newcomer

    Joined:
    May 18, 2020
    Messages:
    270
    Likes Received:
    320
    Members of the Xbox Advanced Technology Group are not in the business of filing patents for open-ended research/future possibilities. Up to now, all patents (that I know of) by Nevraev and co. have been linked to immediate, concrete implementations (SFS, VRS, hybrid ray tracing, index buffer compression, texture compression/reconstruction using trained NNs). I fully expect this solution to be deployed on Xbox consoles/PC. We shall see if it succeeds.
     
    iroboto and BRiT like this.
  9. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    12,986
    Likes Received:
    15,717
    Location:
    The North
    ehhh ;)

    I have friends who file patents for MS, so I wouldn't be so quick to say what they aren't in the business of doing. They most certainly do a lot of research and development, and when it's fruitful enough they file a patent. Sometimes it's not fruitful enough to make it all the way to production, however. It's just the way it is. It does seem like something they should be able to do, but without a news article or something, it does just feel like our typical patent diving. Sometimes it pans out; most times it does not.
     
  10. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    18,768
    Likes Received:
    21,044
    I think their point was that the patents coming from that department tend to actually get used.

    I couldn't say one way or another without analyzing the patents by department, but then how do you know they're actually being used?
     
  11. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    12,986
    Likes Received:
    15,717
    Location:
    The North
    You don't, unfortunately; it's just that the hit rate isn't 100%, if that makes sense. From an observational standpoint the patent is sound, and I think someone can look at existing programs in the deep learning space to get an idea of how they could implement it in the shader space. But that's about as similar as the two applications are; at the end of the day, shaders aren't neural networks. So the approach of downgrading precision to improve bandwidth and speed is going to be very different from the approach used to accomplish this for NNs.

    So it sounds very good on paper, but so do DLSS, AI upres of textures, AI animation, and AI behaviour. Implementation is another issue to solve. Even with DLSS being out there and running on the same hardware, we still don't see a generic non-Nvidia DLSS competitor, even though all the knowledge to build one is out there.
     
    BRiT likes this.
  12. Allandor

    Regular Newcomer

    Joined:
    Oct 6, 2013
    Messages:
    587
    Likes Received:
    520
    Lower precision can save some cycles in some situations. But the PS4 Pro had two problems here:
    1. Code had to run on the PS4, too, so it was extra work for the developer to fine-tune.
    2. The more work a processor can do, the more you get out of saving some of it.
    E.g. a hypothetical 1 GHz processor can only run so many operations. If you save a few cycles, you might run a few more operations, but nothing to really speak of. On a much more capable processor you might save the same percentage of cycles, yet now you have saved so many that you can actually do something with them.
    Also, the more capable processor runs more code in parallel, so the efficiency saving might even increase, though it will never reach the theoretical max.
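    The percentage-vs-absolute point above can be made with trivial arithmetic (the throughput numbers are illustrative only, not real GPU figures):

```python
# The same fractional saving frees up vastly different absolute
# throughput on a weak vs a strong processor.
saved_fraction = 0.20

for ops_per_second in (1e9, 1e12):  # hypothetical weak vs strong chip
    freed = saved_fraction * ops_per_second
    print(f"{ops_per_second:.0e} ops/s -> {freed:.0e} ops/s freed")
```

    The weak chip frees enough for a rounding error's worth of extra work; the strong chip frees enough to fund an entire additional pass.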
     