I have never seen their solution out in the public space. Typically in this space, if you're going after high-performance AI computing, you're going to either use a library or run an optimizer that goes through your network and reduces the precision of the weights as required. I don't think you can just peek at the DLSS 2.0 weights. I haven't tried, or seen anything in this domain at least, but I can only suspect the weights range from int4 to FP32, and that will change with the quality settings and how much you need to upscale by.
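To give a rough idea of what that kind of optimizer pass looks like, here's a minimal sketch using PyTorch's post-training dynamic quantization on a toy model. The model is just a stand-in I made up, nothing DLSS-specific:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a trained FP32 network.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Post-training dynamic quantization: weights of the Linear layers are
# stored as int8 and dequantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x).shape)  # same output shape, smaller weight storage
```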
CUDA cores also support mixed precision on RTX cards. But I don't know if they go as low as int4 or even int8. They do support FP16/FP32 mixed, though.
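For what it's worth, here's roughly what FP16/FP32 mixed precision looks like from the software side with PyTorch's autocast. This assumes a CUDA-capable GPU, and the model is again a made-up stand-in:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10).cuda()
x = torch.randn(8, 128, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.float16):
    # Matmuls run in FP16 (on Tensor Cores where available), while
    # precision-sensitive ops are kept in FP32 automatically.
    out = model(x)

print(out.dtype)  # torch.float16 inside the autocast region
```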
You can always run any network on FP32 hardware as long as it doesn't require FP64.
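As a quick sketch of that point: you can take a lower-precision checkpoint, upcast it to FP32, and run it there. Toy model, purely illustrative:

```python
import torch
import torch.nn as nn

model = nn.Linear(64, 10).half()  # pretend this arrived as an FP16 checkpoint
model = model.float()             # upcast every parameter to FP32

x = torch.randn(1, 64)            # FP32 input
print(model(x).dtype)             # torch.float32
```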
I couldn't tell you the gain from reducing weight precision, as it's going to differ for each network. The idea that you can take an FP32 network and drop all its weights to int4 while retaining a high degree of accuracy is unlikely, and there is a high probability of overflow errors given int4's tiny range. So you should expect a mix of precisions across the weights of the network, but only the people developing it would know how much is saved by the reductions.
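Here's a rough illustration of why naive int4 is so lossy: fake-quantize some FP32 weights down to the 16 levels signed int4 can represent, then measure the round-trip error. The max-abs scaling is just my assumption for the sketch, not anything from NVIDIA:

```python
import torch

w = torch.randn(1000)                      # stand-in FP32 weights
qmin, qmax = -8, 7                         # signed int4: only 16 levels

scale = w.abs().max() / qmax               # simple max-abs scaling
q = (w / scale).round().clamp(qmin, qmax)  # snap every weight to an int4 level
w_hat = q * scale                          # dequantize back to FP32

print("mean abs round-trip error:", (w - w_hat).abs().mean().item())
print("distinct levels used:", q.unique().numel())  # at most 16
```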