Sorry, yes: model, not hardware.
The point being that two years ago Microsoft was SHOWING what is possible with their upcoming API. Attach that to Microsoft's own hardware (via AMD) and the new FP8/FP16 tricks the Xbox is pulling off.
Then the mention of MS using their cloud to train models near-instantly, for their own ecosystem, puts them beyond what Sony could manage. MS knows this, so they will be (and have been) leveraging it.
There's a reason MS just updated Windows 10 for DirectX 12 Ultimate and the new video cards coming, right?
Well, DirectML is a low-level API for running machine learning models; most ML libraries are extremely slow by comparison. It's comparable to CUDA-X AI in that sense, and it contains some very similar functions. What you saw with DirectML and the showcase around it is a vendor-agnostic API that can run a machine learning model with very low overhead, for faster processing times.
The model is the one actually doing the work of upsampling the picture from 1080p to 4K.
Nvidia does the same thing, except that CUDA is likely doing this work instead of DirectML.
The model is just whatever network they've trained.
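To make that concrete, here's a minimal sketch of what "run a trained model through a vendor-agnostic API" looks like from Python, assuming the onnxruntime-directml package; the model file "upscaler.onnx" and the tensor name "frame" are hypothetical placeholders, not anything Nvidia or MS has shipped:

```python
# Minimal sketch: running a pre-trained upscaling model through DirectML.
# Requires `pip install onnxruntime-directml`. The model file and tensor
# name below are made-up placeholders for illustration.
import numpy as np
import onnxruntime as ort

# DmlExecutionProvider dispatches to any DirectX 12 capable GPU,
# regardless of vendor; the CPU provider is just a fallback.
session = ort.InferenceSession(
    "upscaler.onnx",
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)

# One 1080p RGB frame in NCHW layout, normalized to [0, 1].
frame = np.random.rand(1, 3, 1080, 1920).astype(np.float32)

# The trained model, not the API, does the actual upsampling work.
(upscaled,) = session.run(None, {"frame": frame})
print(upscaled.shape)  # a 2x model would give (1, 3, 2160, 3840)
```

Swap the provider list for CUDAExecutionProvider and you've basically got Nvidia's path; the plumbing is interchangeable, the trained model is the part that matters.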
Sony could manage to build an API that supports this, or borrow one from Vulkan (not sure if Vulkan has a low-level API for this type of stuff yet), but the API isn't the hard part of this process.
The model creation is.
Some may look at ML training as straightforward, like some Udemy course. It's not. To do what Nvidia has done, provided you have the talent to understand how to create a very fast, lightweight model with very good quality, you've got to build the training set and labels to support it. You also need to know precisely where your model will be leveraged, and that specificity is part of what makes this harder to implement. You'll also run into other hard restrictions because this is a real-time NN: mainly memory size and processing time. You need the best possible NN that takes up the smallest amount of VRAM and runs in the shortest time possible.
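As a rough illustration of the "training set and labels" part (a toy sketch in PyTorch, not how Nvidia actually builds theirs): for super-resolution, the labels are native high-res frames and the inputs are downscaled copies of those same frames.

```python
# Toy sketch of super-resolution training pairs: the label is the native
# 4K frame, the input is a 1080p downscale of that same frame.
# Sizes and data are illustrative placeholders.
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset

class UpscalePairs(Dataset):
    def __init__(self, hi_res_frames):
        # hi_res_frames: list of (3, 2160, 3840) tensors captured at native 4K
        self.frames = hi_res_frames

    def __len__(self):
        return len(self.frames)

    def __getitem__(self, idx):
        target = self.frames[idx]
        # Synthesize the network input by downscaling 4K -> 1080p.
        lo_res = F.interpolate(
            target.unsqueeze(0), size=(1080, 1920),
            mode="bilinear", align_corners=False,
        ).squeeze(0)
        return lo_res, target
```

And that's the easy part; curating frames that cover every scene, effect, and art style the model will actually see in games is where the real work is.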
So it's not straightforward at all, and building that model can take tons of time or very little depending on what you have as resources. Building an AI that does anti-aliasing and resolution upscaling is fairly trivial for the field at this point in time. Getting it done in mere milliseconds, as opposed to many seconds or minutes, is what separates Nvidia from the rest.
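To put rough numbers on "mere milliseconds" (my own illustrative figures, nothing reported):

```python
# Back-of-envelope frame budget; every number here is an assumption.
target_fps = 60
frame_budget_ms = 1000 / target_fps      # ~16.7 ms for the entire frame
render_ms = 14.0                         # assumed cost of rendering at 1080p
nn_budget_ms = frame_budget_ms - render_ms
print(f"{nn_budget_ms:.1f} ms left for the upscaler")  # ~2.7 ms
```

A model that needs even 50 ms per frame is useless for games, no matter how good its output looks.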
MS's cloud cannot do this type of training near-instantly. The models and the training corpus are likely to be massive, and there are a load of engineering problems when you attempt to train something way larger than your video memory permits. It's not as simple as saying "the cloud". Part of why Nvidia charges so much for their actual ML hardware is that it addresses some of these challenges. Even then, training from scratch can take days. When your iteration time is that slow, and a fresh training run costs thousands of US dollars in electricity alone, you'd better be damn willing to do this and find a way to profit from it.
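For a taste of the kind of engineering those memory limits force on you (a generic PyTorch technique, not a claim about how MS or Nvidia trains): gradient checkpointing recomputes activations during the backward pass instead of storing them, trading extra compute for a smaller peak-VRAM footprint.

```python
# Sketch: trading compute for memory with gradient checkpointing.
# The toy model below is a placeholder; torch.utils.checkpoint is real.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

blocks = nn.ModuleList(
    nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.ReLU())
    for _ in range(16)
)

def forward(x):
    # Each block's activations are recomputed during backward instead of
    # being kept in VRAM, so peak memory stays low at the cost of compute.
    for block in blocks:
        x = checkpoint(block, x, use_reentrant=False)
    return x

x = torch.randn(1, 64, 270, 480, requires_grad=True)
forward(x).sum().backward()
```

That's one trick among many (sharding, mixed precision, gradient accumulation), and every one of them makes the training pipeline slower or more complicated to build.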
I'm not saying it's doom and gloom, and I'm not saying MS isn't looking into it. I'm just saying there's nothing reported as of yet. And until we actually get some real news, the expectation right now is that no one else is really invested in working on this except Nvidia. They have a great deal of work in the computer vision space, where they provide solutions to companies for real-time NN processing; Audi's and Tesla's AI driving, for instance (prior to Tesla moving to their own in-house silicon).