AMD: Navi Speculation, Rumours and Discussion [2019-2020]

In fact, going through it, Modern Warfare is just 50% faster and Gears just 65% faster. It seems highly unlikely that this is an 80 CU part at all, let alone one hitting 2.2GHz.
There's no built-in benchmark for Modern Warfare; comparable numbers are only possible for the Borderlands and Gears results, since those games do feature a standard benchmark (which we can assume AMD used)
 
Tensor cores are just a specific ASIC that does a select set of matrix operations which could also be done on generic stream processors; there's absolutely no barrier to implementing the same algorithms for DLSS, KnlmeansCL or whatever other DL denoising/upscaling method without them. I'm also not sure you actually need dedicated inference hardware to use such techniques, as the training is apparently done on nVidia's side. So, in my opinion, it's just another form of vendor lock-in à la PhysX that for some reason many tech enthusiasts welcome.
 
And the 3090 shows you what they can get by unlocking everything beyond the 3080: it's +10% at best, due to power constraints.
There are likely to be GA102s that can't quite make the 3090 specification but have enough SMs functional that all 7 GPCs can be used. 20GB or 18GB will also save some power compared with 24GB, though that saving will be pretty small.

AMD has to decide how much of the 3090's +13% over the 3080 it wants to capture. The 5700 XT is 14% faster than the 5700, achieved through a combination of a 10% higher "boost clock" (or 8% higher "game clock") and 11% more WGPs. Capturing that gap would require AMD to have shown "72 CU" performance yesterday. While I think that's possible (a 2.2GHz boost clock coupled with 72 CUs is theoretically 2.1x faster than the 5700 XT), I am dubious.
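For what it's worth, here's the back-of-the-envelope arithmetic behind that 2.1x figure, assuming the 5700 XT's public specs (40 CUs, ~1.9GHz boost) and perfect scaling with CU count and clock, which real games never achieve:

```python
# Rough scaling check for the "72 CU at 2.2GHz vs 5700 XT" figure above.
# Assumes perfect scaling with CU count and clock speed.
navi10_cus, navi10_boost_ghz = 40, 1.905   # 5700 XT reference specs
navi2x_cus, navi2x_boost_ghz = 72, 2.2     # hypothetical cut-down big Navi

scaling = (navi2x_cus / navi10_cus) * (navi2x_boost_ghz / navi10_boost_ghz)
print(f"Theoretical throughput vs 5700 XT: {scaling:.2f}x")  # ~2.08x, i.e. roughly 2.1x
```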

RDNA2's memory bandwidth efficiency, given the 256-bit bus rumours, is a complete wildcard (and it mucks up theoretical comparisons with the 5700 XT). If it's a radically new architecture, then we can assume it's going to take AMD about a year to work out how to make it run well...

If AMD decides to go after the 3090 with the 6900XT (XTX? water-cooled XTX?), it seems likely that power consumption will be very high. The 3080 20GB is rumoured to launch in early December, so NVidia will have about a month to decide how hard that card fights back. AMD therefore has to balance the 6900XT's launch price against its demand.

Zen 3 has demonstrated that AMD will not be shy about raising prices, so the 6900XT is likely to follow that pattern.

There has also been a rumour of a water-cooled 6900XTX, but not this year, so AMD may not attack the 3090 this year.

Finally, another rumour says that big Navi will only come with the 3-fan cooler. The 2-fan cooler that was shown a while back presumably doesn't cool well enough for any variant of big Navi.

So AMD looks likely merely to match the 3080 on launch day, while pushing past that appears to be difficult. Which gives NVidia plenty of opportunity to spoil it.
 
You're welcome to try to make a DLSS-like approach run without tensor cores at the same quality and performance. Until you do, your opinion here is highly dubious.
To be honest, I don't get why the peculiarities of DLSS are being discussed here. In any case, I can refer you to the recent tests Phoronix did with the ncnn upscaling algorithm: implementations can be faster or slower on generic compute units or on specialised hardware depending on the algorithm, memory bandwidth limitations and other factors.
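To illustrate that point about generic compute units: the core of a learned upscaler is nothing but convolutions, i.e. ordinary multiply-accumulates that any FP16/FP32 ALU can execute. Below is a minimal sketch; the weights are random placeholders rather than a trained network, and it uses plain NumPy on the CPU rather than ncnn or any vendor path.

```python
# Minimal sketch: a learned-upscaler step expressed as an ordinary convolution plus pixel
# shuffle. Nothing here requires specialised matrix hardware; it's all multiply-accumulates.
import numpy as np

def conv2d(img, kernels):
    """Naive 'same' convolution: img is HxWxCin, kernels is KxKxCinxCout."""
    k, _, cin, cout = kernels.shape
    assert img.shape[2] == cin
    pad = k // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)))
    h, w = img.shape[:2]
    out = np.zeros((h, w, cout), dtype=img.dtype)
    for y in range(h):
        for x in range(w):
            patch = padded[y:y + k, x:x + k, :]               # KxKxCin window
            out[y, x] = np.tensordot(patch, kernels, axes=3)  # contract to Cout values
    return out

def pixel_shuffle(feat, scale=2):
    """Rearrange HxWx(C*scale^2) features into (H*scale)x(W*scale)xC."""
    h, w, c = feat.shape
    c_out = c // (scale * scale)
    feat = feat.reshape(h, w, scale, scale, c_out)
    feat = feat.transpose(0, 2, 1, 3, 4)
    return feat.reshape(h * scale, w * scale, c_out)

rng = np.random.default_rng(0)
low_res = rng.random((64, 64, 3), dtype=np.float32)           # stand-in 64x64 RGB frame
weights = rng.standard_normal((3, 3, 3, 12)).astype(np.float32) * 0.1  # placeholder weights
upscaled = pixel_shuffle(conv2d(low_res, weights), scale=2)    # 2x upscale
print(upscaled.shape)                                          # (128, 128, 3)
```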

Also, since you seem very keen to defend DLSS, maybe you can tell us why it's better to spend that die space on a specific ASIC that can be used in a very limited number of scenarios (in gaming) instead of on more ROPs, CUs and the other parts of the rendering pipeline. It almost looks like it was added for some other reason and Huang et al. then had to invent the use case to justify its presence :)
 
Also, since you seem very keen to defend DLSS, maybe you can tell us why it's better to spend that die space on a specific ASIC that can be used in a very limited number of scenarios (in gaming) instead of on more ROPs, CUs and the other parts of the rendering pipeline. It almost looks like it was added for some other reason and Huang et al. then had to invent the use case to justify its presence :)

Forward progress. First comes hardware, then software follows. That's the reason why AMD has to battle rasterization, raytracing and DL "games" today.
 
Also, since you seem very keen to defend DLSS, maybe you can tell us why it's better to spend that die space on a specific ASIC that can be used in a very limited number of scenarios (in gaming) instead of on more ROPs, CUs and the other parts of the rendering pipeline. It almost looks like it was added for some other reason and Huang et al. then had to invent the use case to justify its presence
Huang is making billions off this added die area right as you're asking why it was added there. Does that answer the second question?
As for spending those 10-20% of die area on more of what you listed: DLSS gives them what, +50-100% of performance? How much performance would 20% more "ROPs, CUs and the other parts" add?
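A back-of-the-envelope version of that comparison, assuming (generously) that extra conventional units scale frame rate linearly, and taking the die-area and uplift figures above at face value rather than as measured numbers:

```python
# Illustrative arithmetic only; both inputs are the rough figures quoted in this thread,
# not measurements for any specific GPU.
tensor_die_share = 0.20           # "10-20% of die" claim, upper end
dlss_uplift_range = (1.50, 2.00)  # the "+50-100%" claim for DLSS

more_units_uplift = 1.0 + tensor_die_share   # best case for spending that area on more CUs/ROPs
print(f"More CUs/ROPs, ideal linear scaling: {more_units_uplift:.2f}x")
print(f"DLSS, claimed range: {dlss_uplift_range[0]:.2f}x to {dlss_uplift_range[1]:.2f}x")
```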
I'm not "keen to defend DLSS" but it's funny to see how the same crowd both wants DLSS and doesn't want the tensor cores because of... reasons.
 
Zen 3 has demonstrated that AMD will not be shy about raising prices, so the 6900XT is likely to follow that pattern.

Zen 3 builds on established market share and mindshare. That's why they raised prices. Zen 1 disrupted the CPU market and introduced "good enough" performance for less money. Radeon has not yet achieved this.

If we ignore all the noise from rabid fanboys and tech YouTubers making up rumours for clicks, we have the CEO stating that their goal with RDNA 2 is to disrupt the 4K gaming market. How do you do that if Radeon as a brand is perceived as worse than GeForce? Price is an important factor, as is performance.
 
DLSS would give you zero performance benefit without Nvidia's support, and I assume you have to pay for that in advance, like everything with Nvidia. Either way, DLSS is not the only image upscaling technique in the world, nor is it the absolute best ever created. AMD introduced Radeon Image Sharpening, which works pretty well with very little performance penalty and without 20% of the die sitting idle in most games... I assume they will have a new method derived from the new consoles in Big Navi.
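For context on why such a filter is nearly free: a sharpening pass of that kind reads a small pixel neighbourhood and does a handful of ALU operations. The sketch below is not AMD's actual RIS/CAS shader, just a simplified adaptive-sharpening pass in the same spirit (NumPy, with edge handling via wrap-around for brevity):

```python
# Simplified sketch of an adaptive sharpening pass in the spirit of Radeon Image Sharpening /
# FidelityFX CAS. This is NOT AMD's shader, just an illustration of why such a filter is cheap:
# one small neighbourhood read and a handful of ALU operations per pixel.
import numpy as np

def adaptive_sharpen(luma, strength=0.5):
    """luma: HxW image in [0,1]. Sharpens less where local contrast is already high."""
    up    = np.roll(luma,  1, axis=0)   # neighbours; wrap-around edges kept for brevity
    down  = np.roll(luma, -1, axis=0)
    left  = np.roll(luma,  1, axis=1)
    right = np.roll(luma, -1, axis=1)
    neighbour_avg = (up + down + left + right) * 0.25

    local_min = np.minimum.reduce([luma, up, down, left, right])
    local_max = np.maximum.reduce([luma, up, down, left, right])
    contrast = local_max - local_min        # ~0 in flat areas, ~1 on hard edges
    amount = strength * (1.0 - contrast)    # back off on strong edges to avoid halos

    sharpened = luma + amount * (luma - neighbour_avg)
    return np.clip(sharpened, 0.0, 1.0)

frame = np.random.default_rng(1).random((1080, 1920), dtype=np.float32)  # stand-in 1080p frame
print(adaptive_sharpen(frame).shape)  # (1080, 1920)
```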
 
Forward progress. First comes hardware, then software follows. That's the reason why AMD has to battle rasterization, raytracing and DL "games" today.
First comes PR, you mean :) As I said, there's a piece of hardware that isn't really needed, so you have to invent reasons for it to be there (and get the PR machine going with "just buy it" and the other shenanigans we've witnessed in recent years). I'm pretty sure many gamers honestly believe Jen-Hsun and pals invented raytracing :)

Huang is making billions off this added die area right as you're asking why it was added there. Does that answer the second question?
It was self-evident. The question is, again, WHY do we need it?

DLSS gives them what, +50-100% of performance?
As if there were no other implementations of upscaling before this (including ones that do not require training task-specific DNNs/CNNs, with the inevitable fringe cases ruining IQ). Again, how is its existence related to the dedicated ASICs added to the hardware? You could replace them with universal CUs that can also do int4/int8/fp16 operations (and again, are tensor cores actually used for DLSS? No one has really researched that, AFAIR), which could perform the same operations and be used for something else (the fp+int CMT in Ampere is an example of that sort of unification). If we look at another specific ASIC, RTX, the improvements allegedly made to the RT cores were basically negated by the fact that memory/cache bandwidth did not improve by much (not as bad as in P100, for instance, but it's now close to that).

I'm not "keen to defend DLSS" but it's funny to see how the same crowd both wants DLSS and doesn't want the tensor cores because of... reasons.
Yeah, that shows :) Personally, as I hinted earlier, I would happily dispense with both DLSS and hardware-based RT and move them to where they really belong: the professional 3D content creation or big data segment.
 
This is abysmal marketing and a textbook failed cliffhanger. Not saying this couldn't happen (cough Vega cough), and we should keep our hopes low, but I thought RTG had moved on from this crap after R... Vega.
Vega was over 3 years ago. Teasing to death even when you don't have a competing product was Raja's method (which, BTW, he's still using: see Intel claiming "up to 1.8x" the performance of Renoir just to fall short in actual gaming performance).
With Navi 10, AMD's marketing was already pretty different.

See, that would make sense if the 3070 were a mystery. But it's not: specs, performance tier and price are known. There's nothing left to adapt. I also don't buy this narrative that Nvidia is leaving room for price adjustment within 1 day.
1 day is plenty. If, for example, the RX 6800XT has close-to-RTX 3080 performance and is priced close to or at the same $500 as the RTX 3070, then the latter might be DOA.
 
AMD has had their presentation locked for probably weeks by now, if not months.
I don't think you know how these things work in reality. Last minute changes are not uncommon.

Where do you think many leakers get their info from? Why do you think the slides have an NDA expiration date in the first place? Do you think the marketing team can make them, get them approved by higher-ups and distribute them all around the world in 2 weeks? Come on...
Leakers get their info from a bunch of places, including Linux drivers, industry contacts etc. No presentation is needed for that.

Slides have an expiration date to avoid unwanted leaks. There are multiple reasons for this; one of them is definitely to hide the fact that they can change things at the last minute. If they do change something and everyone knows about it, it gives the impression of the company being chaotic or disorganized. Look at how people viewed AMD when they dropped the price of the 5700XT AFTER it was announced: some people thought AMD looked weak. And do you really, honestly, truly think that if they can change things after an announcement, they cannot do it before...? Let's be real, bro.

Distribution around the world is real easy nowadays. All you have to do is send an e-mail.
 
Borderlands 3 and Gears 5 have no DXR support.

Not at this exact moment, but both BL3 and Gears 5 are using the updated UE4 engine for their PS5/XbX versions, with Gears 5 confirmed to be using DXR for certain effects.
https://www.theverge.com/2020/9/12/...x-series-x-next-gen-upgrade-4k-60-splitscreen
https://www.vg247.com/2020/03/16/xbox-series-x-gears-5-ray-tracing/

And Nvidia said they worked with IW to add DXR to COD:MW.
https://nvidianews.nvidia.com/news/...aytracing-on-pc-powered-by-nvidia-geforce-rtx

So... while I admit it is extremely unlikely, they could have been showing off Big Navi performance with the unreleased versions of those games with DXR effects on.
 
https://www.pcworld.com/article/3585090/amd-radeon-rx-6000-big-navi-performance-tease-rtx-3080.html

Most importantly, Herkelman stressed that AMD didn’t state which Radeon RX 6000 graphics card ran these benchmarks. We don’t know whether these results come from the biggest Big Navi GPU, or a more modest offering. (Herkelman also said there’s still fine-tuning left to do before launch.)

Ok, if they're not sandbagging I'll be disappointed.


Here's the table from chiphell, pointed out earlier in this thread BTW:

[chiphell benchmark comparison table image]
 
It was self-evident. The question is, again, WHY do we need it?
Feel free to look into the ongoing research on using ML in gaming if that's what you're asking. DLSS is just one application here; it just happens to be the most impactful one at the moment.

As if there were no other implementations of upscaling before this (including ones that do not require training task-specific DNNs/CNNs, with the inevitable fringe cases ruining IQ).
I'm not aware of any that come even close to what DLSS is able to accomplish - in shipping games, to boot.

Again, how is its existence related to the dedicated ASICs added to the hardware? You could replace them with universal CUs that can also do int4/int8/fp16 operations (and again, are tensor cores actually used for DLSS? No one has really researched that, AFAIR), which could perform the same operations and be used for something else (the fp+int CMT in Ampere is an example of that sort of unification).
You could, with a loss of about 3/4 of the performance, which would also make it impractical for real-world applications. Which is how its existence is related, etc.
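To put rough numbers on that, here is a purely illustrative sketch. The 4x throughput ratio and the per-frame cost are assumptions, not measurements for any specific GPU, but they show how "a loss of about 3/4" maps onto the frame-time budget:

```python
# Illustrative only: how "a loss of about 3/4 of performance" maps to a speed ratio.
# The 4x is an assumed peak matrix-FMA advantage of the tensor path over generic FP16 ALUs,
# and the 1.5 ms is a hypothetical per-frame DLSS cost; neither is a measured figure.
tensor_vs_generic_speedup = 4.0
dlss_cost_ms_on_tensor = 1.5

cost_on_generic_ms = dlss_cost_ms_on_tensor * tensor_vs_generic_speedup
perf_retained = 1.0 / tensor_vs_generic_speedup
print(f"Same network on generic ALUs: ~{cost_on_generic_ms:.1f} ms/frame "
      f"({perf_retained:.0%} of the tensor-path rate, i.e. a ~{1 - perf_retained:.0%} loss)")
```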

If we look at another specific ASIC, RTX, the improvements allegedly made to the RT cores were basically negated by the fact that memory/cache bandwidth did not improve by much (not as bad as in P100, for instance, but it's now close to that).
They weren't. That HWU "analysis" did a hell of a lot of damage among a public that is willing to believe any bad news about NV h/w.

Personally, as I hinted earlier, I would happily dispense with both DLSS and hardware-based RT and move them to where they really belong: the professional 3D content creation or big data segment.
You are aware that 3D acceleration started out in the professional 3D content creation segment too, and that it would never have come to PC gaming if not for the people who decided to add this h/w to their PC GPUs, right?
What you're saying about DLSS and RT has the exact same merit in itself.
It's also funny to see such takes now, when ALL GPU vendors have confirmed the addition of RT h/w to their future architectures. But apparently you know better how to design GPUs than NV+AMD+Intel combined.
 
Feel free to look into the ongoing research on using ML in gaming if that's what you're asking. DLSS is just one application here; it just happens to be the most impactful one at the moment.
Actually, I read papers on machine learning regularly (not gaming-related, but also covering image upscaling and denoising; I even offered an example of an ML denoising method that runs just fine on generic OpenCL hardware), and I have fairly extensive experience with how much of this fluff is the real deal and how much is typical snake oil and voodoo magic.

I'm not aware of any that come even close to what DLSS is able to accomplish - in shipping games, to boot.
Currently there's a whole lot of TWO games with DLSS 2.0 support. If you expect it to be universally supported, go ahead, but let me keep my doubts about that. The green team's fans also waited for every game to have PhysX around 10 years ago; that doesn't seem to have panned out :)

You could, with a loss of about 3/4 of the performance, which would also make it impractical for real-world applications. Which is how its existence is related, etc.
Again, pray tell us how the inference ASIC acceleration is actually used for running the DNNs/CNNs, as opposed to training them.

They weren't. That HWU "analysis" did a hell of a lot of damage among a public that is willing to believe any bad news about NV h/w.
Who? You probably don't understand what I'm talking about; it's about how cache bandwidth and SMs changed in P100 and then in Volta.

You are aware that 3D acceleration started out in the professional 3D content creation segment too, and that it would never have come to PC gaming if not for the people who decided to add this h/w to their PC GPUs, right?
I thought that games like Quake appeared first, and then the dedicated accelerators that could do the inverse sine, hardware blitting and other things faster than the CPUs of that time sprang into being. But I guess my reading of history is wrong :)
 