Personally I would rather have NVidia or AMD work with the engine devs so that the deep learning is done by user-land shader code rather than some black-box driver call, because not only is it more performant and more accurate, but it also opens the door for things like dynamic resolution...
Exciting stuff. The next step would be to get the DLSS requirements, i.e. the motion vectors and the driver call, embedded into the DX12 API so every compliant DX12 game can use deep learning without the programmer having to code the driver call, because the API would generate the motion vectors implicitly.
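A minimal sketch of what that per-frame contract might look like if it lived in the API itself; the struct and function names here are hypothetical and not part of any real D3D12 or NGX header, just an illustration of the data every compliant engine would have to supply (or the runtime would have to derive implicitly):

```cpp
// Hypothetical per-frame inputs a DX12-level upscaling hook would need.
// None of these types exist in the real D3D12 headers; this is only a sketch.
#include <cstdint>

struct UpscaleInputs {
    void*    colorBuffer;     // low-resolution rendered frame
    void*    depthBuffer;     // per-pixel depth from the geometry pass
    void*    motionVectors;   // per-pixel screen-space motion, in pixels
    float    jitterOffsetX;   // sub-pixel jitter applied this frame
    float    jitterOffsetY;
    uint32_t renderWidth, renderHeight;  // internal resolution
    uint32_t outputWidth, outputHeight;  // display resolution
};

// Hypothetical entry point: if the API generated the motion vectors itself,
// a title would only submit color/depth and the runtime would fill in the rest.
void UpscaleFrame(const UpscaleInputs& in, void* outputBuffer);
```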
My guess is that when engineers are no longer able to shrink the transistor is when we will see a big decline in discrete GPUs. At that point APUs may become performant enough that the added uncore benefits of shared memory, cache coherency, etc., outweigh the raw performance you get with...
I am glad Intel is also catering to the low end; it sucks that NVidia and AMD don't make any "decent" sub-75-watt cards any more. I wish they did, because I still game at 1080p and would like to build a passively cooled e-sports machine for the living room!
Hi, I may be a humble compiler dev, but I still don't understand why Microsoft defined the DXR BVH to be so opaque. They could have defined a fixed radix factor for the tree and a maximum tree depth for the supported hardware; from my understanding, this would lead to absolutely insane...
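For what it's worth, here is a rough sketch of the kind of fixed-layout node that proposal implies; the radix of 4, the depth cap, and the field layout are assumptions for illustration, not anything DXR actually specifies:

```cpp
// Sketch of a fixed-radix BVH node of the kind the comment proposes.
// A radix of 4 (four children per node) and a hard depth cap are picked
// arbitrarily here; DXR deliberately does NOT specify any such layout.
#include <cstdint>

constexpr int kRadix    = 4;   // children per node (assumed)
constexpr int kMaxDepth = 32;  // maximum tree depth (assumed)

struct Aabb {
    float min[3];
    float max[3];
};

struct BvhNode4 {
    Aabb     childBounds[kRadix];  // one bounding box per child slot
    uint32_t childIndex[kRadix];   // index of child node, or primitive id
    uint8_t  childIsLeaf[kRadix];  // 1 if the slot points at geometry
};

// With a fixed radix and depth, traversal state is bounded: a stack of
// kMaxDepth entries always suffices, which is the kind of guarantee a
// compiler or hardware designer could exploit.
struct TraversalStack {
    uint32_t nodes[kMaxDepth];
    int      top = 0;
};
```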
It's new for them, and interestingly enough this may be the first implementation of unified memory in consumer space, since Intel and AMD APUs segment their GPU/CPU memory, with all the access restrictions imposed by the Windows APIs that *mostly* forbid sharing data between CPU and GPU.
Pardon my ignorance, but I fail to see how the AMD 6800 XT beats the NVidia 3080, considering that the 6800 XT has a peak performance of 20.74 TFLOPS vs the 3080's 29.77 TFLOPS. The difference is almost 10 teraflops wide!
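For anyone wondering where those two numbers come from, here is the back-of-the-envelope math: peak FP32 is just shader count × 2 FLOPs per clock (one FMA) × boost clock. The shader counts and clocks below are the commonly quoted launch specs:

```cpp
// Reproduces the quoted peak-FP32 figures from shader count and boost clock.
#include <cstdio>

int main() {
    const double rx6800xt = 4608 * 2 * 2.250e9 / 1e12;  // ~20.74 TFLOPS
    const double rtx3080  = 8704 * 2 * 1.710e9 / 1e12;  // ~29.77 TFLOPS
    std::printf("6800 XT: %.2f TFLOPS\n", rx6800xt);
    std::printf("3080:    %.2f TFLOPS\n", rtx3080);
    std::printf("delta:   %.2f TFLOPS\n", rtx3080 - rx6800xt);
    return 0;
}
```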
I know I am late, but I want to chime in on the latency vs. bandwidth debate. I believe that a large LLC would not only provide more bandwidth at lower latency, but it could also help AMD relax the amount of latency hiding per CU. For example, NVidia runs 32 threads per shader...
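A crude way to see why lower LLC latency relaxes the latency-hiding requirement; the cycle counts in this sketch are made-up round numbers for illustration, not measured RDNA2 or Ampere figures:

```cpp
// Little's-law style estimate: wavefronts needed in flight ≈
// memory latency (cycles) / cycles of independent work per memory access.
#include <cstdio>

int wavefrontsNeeded(int memLatencyCycles, int workCyclesPerAccess) {
    // Round up: every cycle of latency must be covered by another wavefront's work.
    return (memLatencyCycles + workCyclesPerAccess - 1) / workCyclesPerAccess;
}

int main() {
    // Assumed: ~400 cycles to DRAM vs ~100 cycles to a large on-die LLC,
    // with ~20 cycles of ALU work between memory accesses.
    std::printf("DRAM-bound: %d wavefronts in flight\n", wavefrontsNeeded(400, 20));
    std::printf("LLC-bound:  %d wavefronts in flight\n", wavefrontsNeeded(100, 20));
    return 0;
}
```

Under those assumed numbers, a hit in a large LLC needs only a quarter as many wavefronts in flight to stay busy, which is the "relaxed latency hiding per CU" point.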
I work as a systems programmer for a small HPC server system. When we upgraded our system to use accelerators early in the decade, we went with CUDA, because OpenCL at the time required a lot of proprietary extensions in order to vectorize anything, defeating the purpose of being open source in the...
It is a full node shrink, so 50% more seems realistic, but I also expect a lot of the performance to be obtained as a frequency boost rather than 50% more cores.
As you said, most implementations of TAA at 30 Hz are bad, but as you scale up the frame rate, not only do the TAA ghosting artifacts become less noticeable, they are also considerably smoother, so something like 120 Hz would be the killer feature that TAA games need next gen.