Machine Learning to enhance game image quality

Discussion in 'Console Technology' started by Alucardx23, Mar 23, 2017.

  1. Alucardx23

    Regular

    Joined:
    Oct 7, 2009
    Messages:
    519
    Likes Received:
    66
I want to dedicate this thread to how machine learning will help greatly improve image quality at a relatively low performance cost. We can start with Google's RAISR. Here are some of Google's claims:

    -High Bandwidth savings
    "By using RAISR to display some of the large images on Google+, we’ve been able to use up to 75 percent less bandwidth per image we’ve applied it to."

    -So fast it can run on a typical mobile device
    "RAISR produces results that are comparable to or better than the currently available super-resolution methods, and does so roughly 10 to 100 times faster, allowing it to be run on a typical mobile device in real-time."

    -How it works
    "With RAISR, we instead use machine learning and train on pairs of images, one low quality, one high, to find filters that, when applied to selectively to each pixel of the low-res image, will recreate details that are of comparable quality to the original. RAISR can be trained in two ways. The first is the "direct" method, where filters are learned directly from low and high-resolution image pairs. The other method involves first applying a computationally cheap upsampler to the low resolution image and then learning the filters from the upsampled and high resolution image pairs. While the direct method is computationally faster, the 2nd method allows for non-integer scale factors and better leveraging of hardware-based upsampling.

For either method, RAISR filters are trained according to edge features found in small patches of images (brightness/color gradients, flat/textured regions, etc.), characterized by direction (the angle of an edge), strength (sharp edges have a greater strength) and coherence (a measure of how directional the edge is). Below is a set of RAISR filters, learned from a database of 10,000 high and low resolution image pairs (where the low-res images were first upsampled). The training process takes about an hour."

[Image: learned RAISR filters]
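To give a feel for what those per-pixel filters do at runtime, here is a rough numpy sketch of a RAISR-style upscaler (my own simplification, not Google's code): cheaply upsample, then for every pixel hash the local gradient angle, strength and coherence into a bucket and apply that bucket's learned filter. The patch size, bucket counts and thresholds are made-up illustrative values, and `filters` is the lookup table produced by offline training (see the training sketch after the comments below).

```python
import numpy as np

PATCH = 7                                    # assumed filter footprint
N_ANGLE, N_STRENGTH, N_COHERENCE = 24, 3, 3  # assumed bucket counts

def hash_bucket(patch):
    """Bucket a patch by gradient angle, strength and coherence (from its structure tensor)."""
    gy, gx = np.gradient(patch)
    gxx, gyy, gxy = (gx * gx).sum(), (gy * gy).sum(), (gx * gy).sum()
    tmp = np.sqrt((gxx - gyy) ** 2 + 4.0 * gxy ** 2)
    l1 = (gxx + gyy + tmp) / 2.0             # larger eigenvalue
    l2 = max((gxx + gyy - tmp) / 2.0, 0.0)   # smaller eigenvalue, clamped for numerics
    angle = (0.5 * np.arctan2(2.0 * gxy, gxx - gyy)) % np.pi
    coherence = (np.sqrt(l1) - np.sqrt(l2)) / (np.sqrt(l1) + np.sqrt(l2) + 1e-8)
    a = int(angle / np.pi * N_ANGLE) % N_ANGLE
    s = min(int(np.sqrt(l1) / 32.0), N_STRENGTH - 1)      # ad-hoc strength thresholds
    c = min(int(coherence * N_COHERENCE), N_COHERENCE - 1)
    return a, s, c

def raisr_like_upscale(low, filters, scale=2):
    """Cheap upsample, then re-filter every pixel with the learned filter for its bucket."""
    up = np.kron(low, np.ones((scale, scale)))  # nearest-neighbour stand-in for the cheap upsampler
    out = up.copy()
    r = PATCH // 2
    for y in range(r, up.shape[0] - r):
        for x in range(r, up.shape[1] - r):
            patch = up[y - r:y + r + 1, x - r:x + r + 1]
            out[y, x] = filters[hash_bucket(patch)].ravel() @ patch.ravel()
    return np.clip(out, 0.0, 255.0)
```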

    Comments:
We are talking about a neural network that learns the best way to upscale images, based on a database of thousands of image pairs compared at different resolutions. As an example, say developer X is targeting 1080p/60 fps on PS4 hardware: in theory, you could let a neural network spend hours or days comparing frames of your game rendered at 720p versus 1080p, and it would get better and better at finding a custom upscaling method that approximates a 1080p image from a 720p framebuffer.
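To make that workflow a bit more concrete, here is a rough sketch of what the offline side could look like, reusing the hypothetical `hash_bucket` helper and 7x7 patch size from the sketch above: take matching 720p/1080p frame pairs, cheaply upsample the 720p ones, group patches by bucket, and fit one least-squares filter per bucket against the true 1080p pixels. This is my own illustration of the idea, not RAISR's actual training code.

```python
import numpy as np
from collections import defaultdict

PATCH = 7
R = PATCH // 2

def train_filters(frame_pairs, hash_bucket):
    """frame_pairs: list of (cheaply_upsampled_720p, native_1080p) frames of equal size.
    Returns one PATCHxPATCH filter per bucket, fitted by least squares."""
    patches = defaultdict(list)   # per-bucket flattened input patches
    targets = defaultdict(list)   # per-bucket ground-truth 1080p pixel values
    for up, hi in frame_pairs:
        for y in range(R, up.shape[0] - R):
            for x in range(R, up.shape[1] - R):
                patch = up[y - R:y + R + 1, x - R:x + R + 1]
                key = hash_bucket(patch)
                patches[key].append(patch.ravel())
                targets[key].append(hi[y, x])
    filters = {}
    for key in patches:
        A = np.asarray(patches[key])
        b = np.asarray(targets[key])
        # per-bucket filter w minimising ||A w - b||^2
        w, *_ = np.linalg.lstsq(A, b, rcond=None)
        filters[key] = w.reshape(PATCH, PATCH)
    return filters
```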

I have seen several examples of AA methods that work wonders on one game but don't work as well on others, since a lot has to do with the game's aesthetics. This means that with this method, every game could have its own custom AA filters that no one-size-fits-all AA technique could match at the same performance level.

    RAISR Upscaling examples:
[Images]

    Source material:

    Saving you bandwidth through machine learning

    https://blog.google/products/google-plus/saving-you-bandwidth-through-machine-learning/

    Enhance! RAISR Sharp Images with Machine Learning

    https://research.googleblog.com/2016/11/enhance-raisr-sharp-images-with-machine.html
     
    #1 Alucardx23, Mar 23, 2017
    Last edited: Aug 1, 2017
  2. Alucardx23

    Regular

    Joined:
    Oct 7, 2009
    Messages:
    519
    Likes Received:
    66
    Magic Pony is another company that uses neural networks to improve image quality, but they are more focused on video.

    Artificial Intelligence Can Now Design Realistic Video and Game Imagery
    https://www.technologyreview.com/s/...-now-design-realistic-video-and-game-imagery/

    "The company has developed a way to create high-quality videos or images from low-resolution ones. It feeds example images to a computer, which converts them to a lower resolution and then learns the difference between the two. Others have demonstrated the feat before, but the company is able to do it on an ordinary graphics processor, which could open up applications. One example it’s demonstrated uses the technique to improve a live gaming feed in real time."

    Example of video stream improvement:
[Image]

     
    Karamazov likes this.
  3. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    Wow, it'd be worth it in a browser for blurry pictures of text / pictures of blurry text alone.
     
    Alucardx23 likes this.
  4. Squeak

    Veteran

    Joined:
    Jul 13, 2002
    Messages:
    1,262
    Likes Received:
    32
    Location:
    Denmark
    I'm thinking about the possible application to old SD content. If you could do it live you'd never lose any information.
     
  5. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    40,706
    Likes Received:
    11,156
    Location:
    Under my bridge
    If it works well enough on video, you could broadcast at a quarter of the resolution, allowing for less compression and greater overall clarity. Imagine YouTube videos actually looking good!
     
    OCASM and Prophecy2k like this.
  6. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    10,972
    Likes Received:
    5,794
    Location:
    London, UK
As somebody who performs daily "magic" via a server farm, my first question is: how much computational power is required to do this in realtime? This may well be a solution for a one-time pass over old SD footage in order to make it available to everyone in HD, but how does it work in realtime with millions of clients?
     
    Prophecy2k, JPT and BRiT like this.
  7. Prophecy2k

    Veteran

    Joined:
    Dec 17, 2007
    Messages:
    2,467
    Likes Received:
    377
    Location:
    The land that time forgot
My question is more about the "learning" part: how much processing resource is required to do this quickly enough to allow a real-time implementation in a videogame?

    Would the entire process not introduce far too much latency?
     
    JPT likes this.
  8. Esrever

    Regular Newcomer

    Joined:
    Feb 6, 2013
    Messages:
    594
    Likes Received:
    298
The learning part would be trained beforehand by whoever makes the software, and that would probably take weeks. Training currently takes anywhere between 10,000 and 100,000 images for image recognition software, depending on how accurate you need it to be, and that would probably hold for this as well; it can all be done during development.

The inference + rendering part is currently possible to do in under a second, but even that would probably be too much for a low-latency real-time system like a game. Right now, if you tried to do this on a GPU, it would cost a lot more performance than directly rendering the image at the higher resolution. There are dedicated hardware solutions coming out to do things like this, such as Google's TPU. These are much more efficient than GPUs at the specific calculations involved, and Google claims 10x efficiency. We might see dedicated silicon for machine learning inside PCs and consoles in the future.
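A quick back-of-the-envelope calculation shows why the per-frame budget is the sticking point (all numbers below are illustrative assumptions, not measurements):

```python
# Rough cost estimate for running a small per-pixel network at 1080p.
# Every figure here is an illustrative assumption, not a measurement.
width, height = 1920, 1080
pixels = width * height

madds_per_pixel = 2_000                      # assume ~2k multiply-adds per pixel for a tiny network
total_flops = pixels * madds_per_pixel * 2   # count multiply and add separately

gpu_peak_flops = 4e12                        # assume a ~4 TFLOPS console-class GPU
efficiency = 0.5                             # assume 50% of peak is achievable in practice

seconds = total_flops / (gpu_peak_flops * efficiency)
print(f"{total_flops / 1e9:.1f} GFLOP per frame -> ~{seconds * 1000:.1f} ms")
# ~8.3 GFLOP -> ~4.1 ms, a big slice of a 16.7 ms (60 fps) frame budget.
```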
     
    milk likes this.
  9. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
Google's 10x claim is versus the K80, a GPU based on the 2012 Kepler architecture, and everybody knows that Kepler wasn't the best GPU for compute. Since then, Maxwell added double rate fp16 and Pascal added 4x rate uint8 operations. Google's TPU is doing uint8 inference only.

    This is Nvidia's recent response to Google's TPU claims:
    https://www.extremetech.com/computi...nge-googles-tensorflow-tpu-updated-benchmarks

Vega is going to support double rate fp16 and 4x rate uint8 operations. It will be interesting to see whether some games start to use machine learning based upscaling / antialiasing techniques, as consumer GPUs will soon have all the features required for fast inference. Nvidia is currently limiting double rate fp16 and 4x rate uint8 to professional GPUs. Intel already has double rate fp16 on all consumer grade iGPUs (not sure about 4x rate uint8).
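For anyone wondering why 8-bit integer throughput helps a neural network at all: inference can be run with weights and activations quantized to 8 bits and accumulated in wider integers, which is the kind of work those packed instructions do several lanes at a time. A minimal numpy sketch of the idea (simple symmetric quantization, purely illustrative):

```python
import numpy as np

def quantize(x, bits=8):
    """Symmetric linear quantization: signed integers plus one float scale factor."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale).astype(np.int8), scale

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)   # a toy fully connected layer
x = rng.standard_normal(64).astype(np.float32)         # its input activations

qW, sW = quantize(W)
qx, sx = quantize(x)

# 8-bit multiplies with 32-bit accumulation (the kind of work dp4a-style instructions do),
# followed by a single float rescale at the end.
y_int8 = (qW.astype(np.int32) @ qx.astype(np.int32)) * (sW * sx)
y_fp32 = W @ x
print("max abs error vs fp32:", np.abs(y_int8 - y_fp32).max())
```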
     
    Alucardx23 likes this.
  10. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
Most neural networks do not learn while doing their job (called inference). The common way is to train the network first and then use it. You would store the trained network in the game package (on disk). At runtime you only do inference (no training).
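Roughly, the split looks like this; a toy sketch where the file name, network size and random "weights" are stand-ins for a properly trained network:

```python
import numpy as np

# --- Offline, during development: train the network, then serialise its weights ---
rng = np.random.default_rng(0)
w1, b1 = rng.standard_normal((49, 32)).astype(np.float32), np.zeros(32, np.float32)
w2, b2 = rng.standard_normal((32, 1)).astype(np.float32), np.zeros(1, np.float32)
np.savez("upscale_net.npz", w1=w1, b1=b1, w2=w2, b2=b2)  # this file ships in the game package

# --- At runtime, in the shipped game: load once, run inference every frame ---
net = np.load("upscale_net.npz")

def infer_pixel(patch_flat):
    """Run the frozen network on one flattened 7x7 input patch. No training happens here."""
    h = np.maximum(patch_flat @ net["w1"] + net["b1"], 0.0)  # ReLU hidden layer
    return h @ net["w2"] + net["b2"]                         # predicted high-res value
```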
     
    Alucardx23 and iroboto like this.
  11. Prophecy2k

    Veteran

    Joined:
    Dec 17, 2007
    Messages:
    2,467
    Likes Received:
    377
    Location:
    The land that time forgot
    Thanks for the responses gents, but I must admit I'm still lost...

How do you train the neural network on the difference between two images that haven't even been generated yet?

If the image, i.e. each frame of the videogame (the final display frame), is only generated at runtime, then what are you using to train the neural network beforehand?

    I feel like I'm missing something important here.
     
  12. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
A neural network works a bit like the human brain. You can read text, recognize a logo, or recognize a car, a building or a friend when you see them from different angles, at different distances or in different lighting conditions. You can still recognize variations of the same logo easily, even the first time you see them; you don't need to learn every single case separately. A very important thing in training is to avoid over-fitting. Over-fitting means that the network can only recognize exactly the training set. Instead you want a more generic network that can recognize things that share properties and patterns with the training set. The network is trained with lots of different data to ensure that it figures out generic rules and patterns instead of just memorizing a few examples. After training, you test the network with a separate set of data that wasn't used in training, to ensure that it gives the right results.

For example, a line-antialiasing network could learn how to estimate exact lines from a grid of 1/0 values. Such a network could learn common neighborhood patterns in order to calculate the exact position and direction of the line at each point. It is important that the network learns generic rules and patterns instead of remembering every single image. This allows both smaller networks (fewer neurons) and makes them more generic (applicable to images not in the training set).
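As a toy version of that line-antialiasing idea, the sketch below generates binary (1/0) patches of randomly rasterised edges, fits a very simple model to predict the true coverage of the centre pixel, and then checks it on patches it never saw during training, which is the held-out test that guards against over-fitting. The patch size, the data generator and the choice of a plain linear least-squares "network" are all my own illustrative simplifications.

```python
import numpy as np

rng = np.random.default_rng(1)
PATCH = 5

def random_edge_patch():
    """A PATCHxPATCH binary patch of a random half-plane edge,
    plus an approximate anti-aliased coverage value for the centre pixel."""
    angle = rng.uniform(0.0, np.pi)
    offset = rng.uniform(-1.0, 1.0)
    ys, xs = np.mgrid[0:PATCH, 0:PATCH] - PATCH // 2
    signed = xs * np.cos(angle) + ys * np.sin(angle) - offset   # signed distance to the edge
    binary = (signed > 0).astype(np.float32)                    # hard 1/0 rasterisation
    coverage = float(np.clip(signed[PATCH // 2, PATCH // 2] + 0.5, 0.0, 1.0))
    return binary.ravel(), coverage

# Separate training set and held-out test set (never used for fitting).
train = [random_edge_patch() for _ in range(20000)]
test = [random_edge_patch() for _ in range(2000)]
Xtr, ytr = map(np.asarray, zip(*train))
Xte, yte = map(np.asarray, zip(*test))

# The simplest possible "network": one linear layer (plus bias) fitted by least squares.
w, *_ = np.linalg.lstsq(np.c_[Xtr, np.ones(len(Xtr))], ytr, rcond=None)
pred = np.c_[Xte, np.ones(len(Xte))] @ w
print("held-out mean abs error:", np.abs(pred - yte).mean())
```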
     
    Alucardx23 and iroboto like this.
  13. Prophecy2k

    Veteran

    Joined:
    Dec 17, 2007
    Messages:
    2,467
    Likes Received:
    377
    Location:
    The land that time forgot
Sebbbi, thanks for the explanation. I think where I was getting stuck was in understanding exactly what the neural networks were being trained to see, and also the scope of what they would be used to do when applied to a new image.

    Your antialiasing example is actually a great example that really made it click in my mind.

    I'm definitely interested to see the kind of results this produces.

I wonder whether existing GPUs are really the best-suited hardware for this kind of application. Perhaps someone can come up with some fixed-function hardware to bolt onto the end of a GPU to accelerate these kinds of inference-based techniques?
     
  14. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
New 8/16 bit packed math instructions are a big improvement, but you are most likely right: in the long run, GPUs will be replaced by ASICs. However, GPUs capable of fast inference will be available in consumer devices in a few months. Neural network ASICs will be integrated into various consumer electronics, such as digital cameras, but I doubt we'll see general purpose ASICs in PCs and consoles soon. Gaming devices always have GPUs, though, and I am certain that games will use GPUs to run simple inference tasks pretty soon. At first it will be like "Hairworks" and "VXGI", but it will eventually scale down.
     
    Alucardx23 and Prophecy2k like this.
  15. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
The 4x rate uint8 operations may be a bit murky on consumer Pascal, as initial reports/reviews mention the GTX 1080 having no int8/dp4a. However, Scott Gray tested this on the Nvidia dev forums and found it was fully supported, with performance of around 33-36 TOPS; he also verified the fp16 behaviour.
This sort of makes sense, because the Tesla P4 (professional segment) is GP104 like the GTX 1070/GTX 1080 and also supports int8/dp4a.
But maybe this is a CUDA thing *shrug*.

    Cheers
     
  16. Alucardx23

    Regular

    Joined:
    Oct 7, 2009
    Messages:
    519
    Likes Received:
    66
[Image]

    "Nvidia researchers used AI to tackle a problem in computer game rendering known as anti-aliasing. Like the de-noising problem, anti-aliasing removes artifacts from partially-computed images, with this artifact looking like stair-stepped “jaggies.” Nvidia researchers Marco Salvi and Anjul Patney trained a neural network to recognize jaggy artifacts and replace those pixels with smooth anti-aliased pixels. The AI-based solution produces images that are sharper (less blurry) than existing algorithms."

    Nvidia uses AI to create 3D graphics better than human artists can
    https://venturebeat.com/2017/07/31/...te-3d-graphics-better-than-human-artists-can/


I think we can start to predict that dedicated AI hardware will become a common thing for games.

    Intel puts Movidius AI tech on a $79 USB stick
    https://www.engadget.com/2017/07/20/intel-movidius-ai-tech-79-dollar-usb-stick/

    "With the Compute Stick, you can convert a trained Caffe-based neural network to run on the Myriad 2, which can be done offline. Ultimately, the device will help bring added AI computing power right to a user's laptop without them having to tap into a cloud-based system. And for those wanting even more power than what a single Compute Stick can provide, multiple sticks can be used together for added boost."

     
    #16 Alucardx23, Aug 1, 2017
    Last edited: Aug 1, 2017
    eloyc and Aaron Elfassy like this.
  17. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    40,706
    Likes Received:
    11,156
    Location:
    Under my bridge
    @nAo!
    Not in that example image. That's very blurred.
     
    milk and bunge like this.
  18. Alucardx23

    Regular

    Joined:
    Oct 7, 2009
    Messages:
    519
    Likes Received:
    66
    It has like a 200X Zoom. :p
     
  19. milk

    Veteran Regular

    Joined:
    Jun 6, 2012
    Messages:
    2,986
    Likes Received:
    2,558
It doesn't look much better than any old (and blurry) post-process AA.
EDIT: ...so far. I hope they continue the research.
     
  20. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,719
    Likes Received:
    5,815
    Location:
    ಠ_ಠ