AMD FSR upscaling

  • Thread starter Deleted member 90741

Say hello to AMD FSR

20210150669 : GAMING SUPER RESOLUTION

Abstract
A processing device is provided which includes memory and a processor. The processor is configured to receive an input image having a first resolution, generate linear down-sampled versions of the input image by down-sampling the input image via a linear upscaling network and generate non-linear down-sampled versions of the input image by down-sampling the input image via a non-linear upscaling network. The processor is also configured to convert the down-sampled versions of the input image into pixels of an output image having a second resolution higher than the first resolution and provide the output image for display.

[0008] Conventional super-resolution techniques include a variety of conventional neural network architectures which perform super-resolution by upscaling images using linear functions. These linear functions do not, however, utilize the advantages of other types of information (e.g., non-linear information), which typically results in blurry and/or corrupted images. In addition, conventional neural network architectures are generalizable and trained to operate without significant knowledge of an immediate problem. Other conventional super-resolution techniques use deep learning approaches. The deep learning techniques do not, however, incorporate important aspects of the original image, resulting in lost color and lost detail information.

[0009] The present application provides devices and methods for efficiently super-resolving an image, which preserves the original information of the image while upscaling the image and improving fidelity. The devices and methods utilize linear and non-linear up-sampling in a wholly learned environment.

[0010] The devices and methods include a gaming super resolution (GSR) network architecture which efficiently super resolves images in a convolutional and generalizable manner. The GSR architecture employs image condensation and a combination of linear and nonlinear operations to accelerate the process to gaming viable levels. GSR renders images at a low quality scale to create high quality image approximations and achieve high framerates. High quality reference images are approximated by applying a specific configuration of convolutional layers and activation functions to a low quality reference image. The GSR network approximates more generalized problems more accurately and efficiently than conventional super resolution techniques by training the weights of the convolutional layers with a corpus of images.

https://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=/netahtml/PTO/srchnum.html&r=1&f=G&l=50&s1="20210150669".PGNR.&OS=DN/20210150669&RS=DN/20210150669
 
What methods do they imply they are using?

They talk about neural-network-based super resolution, so they use AI like DLSS, if I understand correctly.

The devices and methods include a gaming super resolution (GSR) network architecture which efficiently super resolves images in a convolutional and generalizable manner. The GSR architecture employs image condensation and a combination of linear and nonlinear operations to accelerate the process to gaming viable levels. GSR renders images at a low quality scale to create high quality image approximations and achieve high framerates. High quality reference images are approximated by applying a specific configuration of convolutional layers and activation functions to a low quality reference image. The GSR network approximates more generalized problems more accurately and efficiently than conventional super resolution techniques by training the weights of the convolutional layers with a corpus of images.
 
They talk about neural-network-based super resolution, so they use AI like DLSS, if I understand correctly.
[0008] and [0009] made it unclear to me since they specifically talk about the problems with deep learning and other generalized ML approaches. They also mention a wholly learned environment. Curious to know if/how this differs from DLSS in practice. AMD GPUs don't have nearly the matrix math capability of Nvidia GPUs. But I also don't know if that is the bottleneck for DLSS performance.
 
[0008] and [0009] made it unclear to me since they specifically talk about the problems with deep learning and other generalized ML approaches. They also mention a wholly learned environment.

But they talk about training afterwards. The problem they see with current methods is that they do not take the non-linear information into account.

EDIT: This is in the RDNA 3 speculation thread, but AMD said it will be available for RDNA 2 GPUs, Xbox Series and PS5. Maybe it will work with RDNA 1 GPUs too?
 
upload_2021-5-20_11-9-52.png
upload_2021-5-20_11-10-13.png

They downsample the image at 302 (1/2 image resolution) and run two independent networks, linear and non-linear, which afterwards feed into a combined network.
They could possibly configure/tweak the depths of the independent networks and, compared with a single combined network, avoid running activation functions across all the layers, saving some compute/time.

Downsampling from the current resolution most likely also reduces the amount of data that is fed to the network.
All in all, it is just another way to model an ML problem. There is basically no more information than that.
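
To make the two-branch idea concrete, here is a minimal NumPy sketch. It is not the patent's actual network: the layer counts, the 3x3 kernels, summing the two branches, and all function names are assumptions for illustration. It only shows a linear branch (convolution without activation) and a non-linear branch (convolutions with a ReLU) whose combined channels are rearranged into a higher-resolution image, in the spirit of "convert the down-sampled versions of the input image into pixels of an output image".

```python
import numpy as np

def conv3x3(x, w):
    """Same-padded 3x3 convolution. x: (C_in, H, W), w: (C_out, C_in, 3, 3)."""
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            patch = xp[:, dy:dy + h, dx:dx + wd]  # shifted view, (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out

def pixel_shuffle(x, r):
    """Depth-to-space: rearrange (C*r*r, H, W) into (C, H*r, W*r)."""
    c = x.shape[0] // (r * r)
    _, h, w = x.shape
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

def upscale(img, w_lin, w_nl1, w_nl2, r=2):
    """Hypothetical two-branch upscaler; summing the branches is an assumption."""
    linear = conv3x3(img, w_lin)                   # linear branch: no activation
    hidden = np.maximum(conv3x3(img, w_nl1), 0.0)  # non-linear branch: conv + ReLU
    nonlinear = conv3x3(hidden, w_nl2)
    return pixel_shuffle(linear + nonlinear, r)
```

With untrained random weights the output is meaningless as an image; in a real system the weights would come from training on a corpus of images, as the patent text says. Keeping the branches separate is what lets the linear one skip activation functions entirely.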
 
One thing that strikes me is that you could add an RNN to the mix in addition to the linear network (304) and the CNN (306), provided the GPU has enough memory and horsepower, and get some correction from temporal data points as well.
Such an RNN needs only short-term memory, something like a Gated Recurrent Unit.
Probably for RDNA3 and beyond.

One interesting tidbit from the patent

The deep-learning based non-linear upscaling network processes the low resolution image, via a series of convolutional operators and activation functions, extracts non-linear features, down-samples the features and increases the amount of feature information of the low resolution image.

Alternatively, when hardware does not support the processing in parallel, the linear upscaling processing and the non-linear upscaling processing are not performed in parallel.
 
One thing that strikes me is that you could add an RNN to the mix in addition to the linear network (304) and the CNN (306), provided the GPU has enough memory and horsepower, and get some correction from temporal data points as well.
RNNs are heavy and store state in weights; there is no need to store state in weights if you can feed it explicitly, i.e. feed two consecutive frames to a CNN at once.
It seems the thing described in the patent is just a spatial upscaler, so there must be a temporal part as well, otherwise it wouldn't be able to converge to a higher resolution like DLSS does.
 
RNNs are heavy and store state in weights; there is no need to store state in weights if you can feed it explicitly, i.e. feed two consecutive frames to a CNN at once.
It seems the thing described in the patent is just a spatial upscaler, so there must be a temporal part as well, otherwise it wouldn't be able to converge to a higher resolution like DLSS does.
Feeding the same frame again means performing the computation again, and that is not the principle of an RNN.
The result of the past activation is fed back into the next activation calculation.
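
As a rough illustration of "the past activation is fed back", here is a minimal single-step Gated Recurrent Unit in NumPy. This is a generic textbook-style GRU (biases omitted for brevity), not anything taken from the patent; the parameter names and the `init_params` helper are made up for this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, n_hidden, seed=0):
    """Hypothetical helper: small random weights, untrained."""
    rng = np.random.default_rng(seed)
    shapes = {
        "Wz": (n_hidden, n_in), "Uz": (n_hidden, n_hidden),
        "Wr": (n_hidden, n_in), "Ur": (n_hidden, n_hidden),
        "Wh": (n_hidden, n_in), "Uh": (n_hidden, n_hidden),
    }
    return {k: rng.normal(0.0, 0.1, s) for k, s in shapes.items()}

def gru_step(x, h, p):
    """One GRU step: x is the current input, h the previous hidden state."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)             # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)             # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand                  # state passed to the next step

def run_sequence(xs, p, n_hidden):
    """Feed per-frame feature vectors one at a time, carrying the state along."""
    h = np.zeros(n_hidden)
    for x in xs:
        h = gru_step(x, h, p)
    return h
```

The weights are fixed after training; what changes from frame to frame is the hidden state `h`, which is computed once per frame and reused rather than recomputed from older frames.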
 
RNNs are heavy and store state in weights; there is no need to store state in weights if you can feed it explicitly, i.e. feed two consecutive frames to a CNN at once.
I don't think this is a useful way of thinking about RNNs; after all, there is generally nothing preventing you from passing past inputs to the network. It's crucial that the RNN state is a learned, distilled representation of significant features encountered in the recent past; you don't want to recalculate that from scratch every frame.
 
Feeding the same frame again means performing the computation again, and that is not the principle of an RNN.
RNNs learn the probability of the next event based on previous ones via the hidden state. There is no need for this in temporal image processing because two consecutive frames are explicitly aligned with motion vectors, so you would gain nothing from an RNN.
 
RNNs learn the probability of the next event based on previous ones via the hidden state. There is no need for this in temporal image processing because two consecutive frames are explicitly aligned with motion vectors, so you would gain nothing from an RNN.
Image processing and NN are different things.
I am speaking about the model being able to predict the current image based on previous image
RNN is being used for video reconstruction outside of gaming.
https://ieeexplore.ieee.org/document/9098327
Whether it is feasible, I don't know. But it is definitely something to consider while modelling.
 
Image processing and NN are different things.
Of course they are, nobody is arguing about that.

I am speaking about the model being able to predict the current image based on previous image
Why would you want RNN for something like this?
You can simply store the previously upscaled high-res image, warp it via motion vectors and then combine it with the current low-res image; that's how TAAU works.

RNN is being used for video reconstruction outside of gaming.
That's irrelevant for gaming. In video you don't have precise motion vectors, just a mere approximation: optical flow. On the other hand, video processing has no time constraints like gaming does, so you can brute-force some problems by throwing more math at them.
Other than this, RNNs would not help getting additional details. It's camera jittering that is adding individual details into every frame in games, it has nothing to do with RNNs.
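
The TAAU-style loop described in this post (keep the previous high-res result, warp it with motion vectors, blend in the current low-res frame) can be sketched roughly as follows. This is a heavily simplified illustration, not any engine's real implementation: it is single-channel, uses nearest-neighbour sampling, and leaves out camera jitter and history rejection. The function names, the `alpha` blend factor and the motion-vector convention are all assumptions.

```python
import numpy as np

def upsample_nearest(img, r):
    """Nearest-neighbour upsample of a (H, W) image by an integer factor r."""
    return np.repeat(np.repeat(img, r, axis=0), r, axis=1)

def warp(history, motion):
    """Reproject the previous output into the current frame.

    motion[y, x] = (dy, dx): offset in output pixels from the current pixel
    to where that surface point was in the previous frame (assumed convention).
    """
    h, w = history.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys + np.rint(motion[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(xs + np.rint(motion[..., 1]).astype(int), 0, w - 1)
    return history[src_y, src_x]

def taau_step(history, low_res, motion, r=2, alpha=0.1):
    """Blend the warped history with the upsampled current frame."""
    current = upsample_nearest(low_res, r)
    reprojected = warp(history, motion)
    return (1.0 - alpha) * reprojected + alpha * current
```

In a real renderer the low-res frame would be sampled at a different sub-pixel jitter offset each frame, which is what lets the accumulated history pick up detail beyond a single low-res frame.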
 
Why is this in the RDNA3 thread though?

Seems to me that this is FSR which should be available for every DX12 architecture. The patent only mentions the presence of compute units made up of parallel SIMD units. They don't mention tensor cores, matrix multiply units or anything of the like.

simd.png




It also seems to be missing any kind of temporal data.
 
Why is this in the RDNA3 thread though?
Idk.
Rename DLSS thread into DLSS + FSR and paste all discussion there?
The patent is from 2019 though
No shit.
Stuff takes time to get outta the oven.
I have doubts about it being relevant to FSR.
Every time.
Oh well, I meant a solution akin to DLSS/tensor cores
AMD isn't bolting MFMA engines to client GPUs.
but perfect for a RDNA3 speculation discussion though.
This has no relation to RDNA3 at all.
Not that the latter needs upscaling techniques at large.
 