Machine Learning: WinML/DirectML, CoreML & all things ML

could you be more specific?
Sure

So I can haz a question: take for example Nvidia DLSS (Deep Learning Super Sampling). From what I understand my PC is using an algorithm created on Nvidia supercomputers using ML,
but when, if ever, will the learning take place on my computer? (E.g. I use some future version of DLSS and the more I use it the better it gets, because it's learning?)

Never, not with current technologies anyway. It doesn't learn on your computer; it gets taught by NVIDIA and you just run it.
 
Doing some training is not impossible on PC, though it depends on how much training you want to do.

For example, the smallest Llama-2 (an LLM) is 7B parameters and you can do inference with it on many PC GPUs without a problem (or even some high-end CPUs). However, it took Meta 184,320 GPU-hours (on A100-80GB) to train that model. So if you have a 4090, and assuming it performs as well as an A100 80GB (it probably won't overall: the 4090 is likely faster, but its memory is much smaller), it's still going to take about 21 years to do the training.
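The arithmetic behind that 21-year figure, spelled out (single GPU, and assuming hour-for-hour parity with an A100-80GB, which is generous):

```python
# Back-of-the-envelope estimate: Meta's reported 184,320 A100-80GB hours
# for Llama-2-7B, run on a single consumer GPU assumed to match an A100.
gpu_hours = 184_320
gpus = 1                      # one RTX 4090
hours_per_year = 24 * 365

years = gpu_hours / gpus / hours_per_year
print(f"~{years:.1f} years")  # ~21.0 years
```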

Of course, there is a lot of research on improving training performance and I'm sure more training will be performed locally (e.g. a vacuum bot can learn the floorplan of your house), but right now it's not done locally in most cases.
 
The problem with local learning is that it will lead to locally divergent results, which wouldn't necessarily be great. Learning is not a simple process where the NN automatically gets better by itself; it requires guidance. When it comes to visual results in games, that's not something many people would want to do.
 
In the future I can see a form of hybrid training/inference functionality where the personal, sensitive data is kept locally while more generalized inputs (possibly pre-trained results or algorithms) are pulled from remote sources on the internet (possibly a public library, NPO, the UN or similar). Instead of starting from scratch every time, having the capability to run on your local data and use remote inputs (data elements/hints, algorithms) is where I see the trend heading, though most cloud providers will likely prefer your personal data stored on their servers as input for their own LLMs.
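One concrete shape that hybrid idea could take is something loosely like federated learning: pull a pretrained global model from a remote source, adapt it on data that never leaves the machine, and at most share a weight delta rather than the data itself. A toy sketch, with every model and tensor made up for illustration:

```python
import copy
import torch
import torch.nn as nn

# Toy sketch of the "hybrid" idea: a generic pretrained model is fetched from
# a remote source, personal data stays on the device, and at most a weight
# delta (never the raw data) is shared back. Purely illustrative.

def local_update(global_model: nn.Module, private_x, private_y, lr=1e-2, steps=20):
    local_model = copy.deepcopy(global_model)        # start from the remote weights
    opt = torch.optim.SGD(local_model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(local_model(private_x), private_y).backward()
        opt.step()
    # The delta is the only thing that would ever leave the device, if anything.
    delta = {k: local_model.state_dict()[k] - global_model.state_dict()[k]
             for k in global_model.state_dict()}
    return local_model, delta

global_model = nn.Linear(8, 1)                       # stands in for a downloaded pretrained model
private_x, private_y = torch.randn(64, 8), torch.randn(64, 1)  # data that never leaves the machine
adapted, delta = local_update(global_model, private_x, private_y)
```

Whether anything is ever sent back is a policy choice; the point is that the raw data never has to be.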
 
The problem is that most AI researchers are lazy :) There are likely extreme efficiency gains possible, but that requires some fundamentally different architectures both for accelerators and models.

From passing observation: BASED needs a fraction of the weights in the attention layers; BitNet and TernGrad can do almost all the math with binary ops and int8 adds (even if you use ternary weights, the ops will be binary, no need for ternary operators); LLM in a Flash and Low-Rank Lottery Tickets show a way to keep only a fraction of the MLP/linear weights memory-resident during both training and inference, with a fraction of the memory bandwidth, etc. But if you have a huge pool of H100s you can also just make a tweaked transformer like everyone else ...
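On the ternary-weight point specifically: a weight in {-1, 0, +1} never multiplies anything, it only decides whether an activation gets added, subtracted or skipped, so the inner loop collapses to masks and integer adds. A minimal illustration of that arithmetic (not an actual BitNet/TernGrad kernel):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(-128, 128, size=16).astype(np.int8)   # int8 activations
w = rng.choice([-1, 0, 1], size=16).astype(np.int8)    # ternary weights

# Reference: an ordinary dot product, which uses multiplies.
ref = int(np.dot(x.astype(np.int32), w.astype(np.int32)))

# Multiply-free version: two binary masks plus integer adds/subtracts.
acc = int(x[w == 1].astype(np.int32).sum() - x[w == -1].astype(np.int32).sum())

assert acc == ref
print(acc)
```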

I think Apple is the most likely to break from the pack, it's beneath them to buy H100s.
 
Doing some training is not impossible on PC, though it depends on how much training you want to do.
My question was more: why does the training situation on the PC differ from that of the PS5?
You can't use ML to perform upscaling on the PC unless the learning has been done on supercomputers, but that's not the case with the PS5?
 
My question was more: why does the training situation on the PC differ from that of the PS5?
You can't use ML to perform upscaling on the PC unless the learning has been done on supercomputers, but that's not the case with the PS5?
It's the exact same case everywhere.
 
In the future I can see a form of hybrid training/inference functionality where the personal, sensitive data is kept locally while more generalized inputs (possibly pre-trained results or algorithms) are pulled from remote sources on the internet (possibly a public library, NPO, the UN or similar). Instead of starting from scratch every time, having the capability to run on your local data and use remote inputs (data elements/hints, algorithms) is where I see the trend heading, though most cloud providers will likely prefer your personal data stored on their servers as input for their own LLMs.

You already have a "hybrid" approach today, with "fine-tuning" being done locally along with inference.
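In practice, "fine-tuning locally" usually means the big pretrained backbone stays frozen and only a small head or adapter gets updated, which is what makes it feasible on consumer hardware. A minimal PyTorch sketch, where the backbone, head and data are all stand-ins rather than any particular product's pipeline:

```python
import torch
import torch.nn as nn

# Stand-in for a big pretrained backbone plus a small task head.
backbone = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
head = nn.Linear(128, 4)                      # the only part trained locally

for p in backbone.parameters():               # freeze the pretrained weights
    p.requires_grad = False

opt = torch.optim.AdamW(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 128)                      # hypothetical local examples
y = torch.randint(0, 4, (32,))

for _ in range(100):                          # a short local fine-tune
    opt.zero_grad()
    loss_fn(head(backbone(x)), y).backward()
    opt.step()

# Inference afterwards uses the same frozen backbone plus the adapted head.
with torch.no_grad():
    preds = head(backbone(x)).argmax(dim=-1)
```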


The problem with local learning is that it will lead to locally divergent results, which wouldn't necessarily be great. Learning is not a simple process where the NN automatically gets better by itself; it requires guidance. When it comes to visual results in games, that's not something many people would want to do.

I'm not so sure. Modding itself isn't a majority activity, but a significant number of people do mod, not to really change the game for the better but just to tune it to their own preferences, often away from the so-called developer's vision. Not everyone necessarily needs to do their own training either, just like most people who mod aren't actually creating their own mods.

Granted, I don't think current implementations in games really leave much room for significant variation.
 
My question was more: why does the training situation on the PC differ from that of the PS5?
You can't use ML to perform upscaling on the PC unless the learning has been done on supercomputers, but that's not the case with the PS5?
They're not doing training on the PS5; they're using the already-trained model and accelerating it with the new hardware on the PS5 Pro. Same thing with DLSS and XeSS - the hard work of gathering all the data and training is done behind the scenes, then the games use that trained model and the end user's hardware to accelerate the "inference" step (i.e. the relatively easy part). If they are training them all on PS5s then good luck to them, they're a brave bunch
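In code terms, the split looks roughly like this on the end user's machine: what ships is a bag of frozen weights, and the runtime only ever does a forward pass. A hedged sketch; the network and weight file below are invented, and the real DLSS/XeSS/PSSR runtimes are proprietary rather than plain PyTorch:

```python
import torch
import torch.nn as nn

# Conceptually, what ships with the game is just frozen weights produced by
# the vendor's offline training runs. This toy net is a made-up stand-in.
upscaler = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3 * 4, 3, padding=1),       # predict a 2x2 block per pixel
    nn.PixelShuffle(2),                       # rearrange into a 2x-upscaled image
)
# upscaler.load_state_dict(torch.load("vendor_trained_weights.pt"))  # hypothetical file

upscaler.eval()                               # inference only: no learning happens here
with torch.no_grad():                         # no gradients, no weight updates
    low_res = torch.rand(1, 3, 540, 960)      # e.g. a 960x540 frame
    high_res = upscaler(low_res)              # -> 1 x 3 x 1080 x 1920
```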
 
They are tiny models; rendering the supersampled 4K game sequences might take more compute than the training.
 
If they are training them all on PS5s then good luck to them, they're a brave bunch
If they are not, why did they mention the machine learning performance of the PS5? Seems strange. Nvidia didn't mention the ML performance of Lovelace wrt DLSS and Intel didn't mention the ML performance of Arc wrt XeSS.
 
If they are not, why did they mention the machine learning performance of the PS5? Seems strange. Nvidia didn't mention the ML performance of Lovelace wrt DLSS and Intel didn't mention the ML performance of Arc wrt XeSS.

With how the term is used, machine learning performance is relevant to inferencing as well.

Not sure exactly what you mean regarding Nvidia and Intel, but they do mention machine learning hardware for DLSS and XeSS respectively:


NGX employs the Turing Tensor Cores for deep learning-based operations and accelerates delivery of NVIDIA deep learning research directly to the end-user. Features include ultra-high quality NGX DLSS (Deep Learning Super-Sampling)

Dedicated machine learning hardware built into Intel® Arc™ graphics products, and AI algorithms enable our neural network to optimize image quality.
 
NVIDIA Digital Human Technologies Bring AI Game Characters To Life has some game-related news, discussing ways to create more lifelike game characters. It includes this Covert Protocol demo trailer to show some results. Word is: "To accelerate developer adoption, Inworld will be releasing Covert Protocol’s source code in the near future, enabling developers to learn from the tech demo, and use it to create their own innovations." Here's more:
At GDC 2024, Inworld and NVIDIA collaborated on a new technology demo called Covert Protocol to showcase NVIDIA ACE technologies and the Inworld Engine. In the Covert Protocol tech demo, you act as a private detective who completes objectives based on the outcome of conversations with characters in the scene. Covert Protocol unlocks social simulation game mechanics with AI Digital Humans acting as custodians of crucial information, presenting challenges, and catalyzing key narrative developments. Digital humans play a critical role in the experience.

Powered by the Inworld Engine and leveraging NVIDIA ACE, each player's journey through Covert Protocol is unique. Players’ real-time decisions and strategic planning lead to different game outcomes, ensuring that no two playthroughs are alike. This level of AI-driven interactivity and player agency opens up new possibilities for emergent gameplay, where players must think on their feet and adapt their strategies in real-time to navigate the intricacies of the game world.
 
I know it's quite an obvious thought but it's cool to see it come to fruition thanks to that video (1:38 on)


Said it before, but imagine, with synthesised voices improving (that'll be on the roadmap to implement, you'd think), matching them up with ML facial animations - very small games could be voice acted somewhat competently, relatively quickly and cheaply, which could be amazing for story- and character-heavy games. Sound effects too? Imagine writing prompts for all the sounds you could ever want and sliders adjusting things, or possibly exporting those sound files into...

Getting ahead of myself here, but we're still at the beginning of the next big S-curve for many different things, and hopefully that'll mean significant positive progress
 
I just don't understand how this field gets so much attention. Ohh 900 fps for statically lit point cloud rendering, it's as useless a geometry primitive as it ever was for practical rendering.

So, it works nicely with scene reconstruction from photographs. That's cute, by which I mean mostly useless.
 
I just don't understand how this field gets so much attention. Ohh 900 fps for statically lit point cloud rendering, it's as useless a geometry primitive as it ever was for practical rendering.

So, it works nicely with scene reconstruction from photographs. That's cute, by which I mean mostly useless.
I'm pretty optimistic about NeRF.
Of course, there are challenges that need to be addressed before it can be used in interactive applications. For example, dynamic lighting and physics interaction must be supported before it can be used in games. But even at its current level it can be useful for content creation.
 
This isn't really NeRF in a rendering sense; it's still Gaussian splats, but with better optimization of the point cloud.

Meta's model from Deep Appearance Prefiltering seems closer to a practical rendering primitive to me.
 