AMD: Navi Speculation, Rumours and Discussion [2019-2020]

It can't be L1, since you won't even get an integer when you divide 128 by the number of CUs.
Moreover, L1 has enormous bandwidth and latency requirements (less so for GPUs), so it would cost an arm and a leg in this case.

If AMD went for a large L2 or L3 cache, I can imagine they did it to be competitive in at least some cases with ray tracing.
As long as the BVH fits in the cache, they should be good. Though for long-range rays and divergent rays, this might be a PITA.
They will cut the BVH down to the minimum for games with reflections, i.e. fewer objects in reflections and smaller reflection draw distances will be the standard optimizations in AMD-sponsored games.
I expect performance will be quite good when everything fits nicely in the cache, and in the case of blending.
That said, there will be a lot of performance holes, so perf will vary wildly from case to case and title to title.
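For what it's worth, a quick sanity check of that divisibility argument, assuming the 128 figure is in MB and using the CU counts floated in this thread (none of this is confirmed):

```python
# Sanity check of the "divide 128 by the CU count" argument.
# The 128 MB figure and the CU counts (64/72/80) are thread speculation,
# not confirmed specs.
CACHE_MB = 128

for cus in (64, 72, 80):
    per_cu = CACHE_MB / cus
    evenly = "splits evenly" if CACHE_MB % cus == 0 else "does not split evenly"
    print(f"{cus} CUs -> {per_cu:.2f} MB per CU ({evenly})")
```

Only the 64 CU case divides cleanly, which is the point: for a rumored 80 CU part, 128 doesn't map neatly onto per-CU L1.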
Maybe it has 128 CUs?
 
In theory, with a 384-bit bus you can get to 16 GB with 8x 1 GB and 4x 2 GB GDDR6 chips.
Though it would not have much benefit over 12 GB, as the extra 4 GB sits on a 128-bit bus.
Still very intrigued why they would go with 256-bit when the XSX has 320-bit.

It's not the strangest idea - kind of like the GTX 970 and 3.5GB + 0.5GB. The vast majority of titles will fit happily into 12GB and have full and uniform memory bandwidth. In the rare situation where you overflow that 12GB, keeping the extra data in the (slower) remaining 4GB is still going to be a heck of a lot faster than swapping it back and forth over the PCIe bus to system memory.

Depending on the cost difference between 1GB vs 2GB GDDR6 it might make sense. Costs you nothing from a BOM or PCB design perspective, and also means you can release a cost-reduced 12GB card without making any physical changes to the PCB layout. The toughest part would be effectively marketing it.
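For a rough sense of the split being described, a back-of-the-envelope sketch; the 16 Gbps data rate and the ~32 GB/s PCIe 4.0 x16 figure are my assumptions for illustration, not anything leaked:

```python
# Back-of-the-envelope for a 384-bit board with 8x 1 GB + 4x 2 GB GDDR6.
# The 16 Gbps data rate and the PCIe figure are assumptions for illustration.
GBPS = 16                       # per-pin data rate, Gbit/s (assumed)
full_bits = 12 * 32             # twelve 32-bit channels = 384-bit
slow_bits = 4 * 32              # only the four 2 GB chips back the last 4 GB

full_bw = GBPS * full_bits / 8  # GB/s over the uniform 12 GB region
slow_bw = GBPS * slow_bits / 8  # GB/s for the extra 4 GB
pcie4_x16 = 32                  # ~GB/s, what spilling to system RAM costs

print(f"Capacity: {8 * 1 + 4 * 2} GB total (12 GB uniform + 4 GB slow)")
print(f"Uniform 12 GB: {full_bw:.0f} GB/s over 384-bit")
print(f"Extra 4 GB:    {slow_bw:.0f} GB/s over 128-bit")
print(f"PCIe 4.0 x16:  ~{pcie4_x16} GB/s")
```

Even the slow region would be several times faster than going out over PCIe, which is why the overflow case isn't a disaster.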
 
That's why I underlined rumored. There's an old 505mm2 rumor and the latest 536mm2. Both could turn out to be complete BS.

Of course, but usually rumors have at least some grain of truth to them. And I'm sure none of us expect Big Navi to be around a 3080 at half the die size (different processes I know). It should be somewhere in that ballpark by all accounts.
In theory, with a 384-bit bus you can get to 16 GB with 8x 1 GB and 4x 2 GB GDDR6 chips.
Though it would not have much benefit over 12 GB, as the extra 4 GB sits on a 128-bit bus.
Still very intrigued why they would go with 256-bit when the XSX has 320-bit.

Theoretically they could even do 16 GB with a 320-bit interface, just like the XSX. But there must be some downsides to it, or we'd have seen more of these asymmetric memory configurations over the years. Heck, Nvidia could have given the 3080 12 GB and totally avoided the "10 GB is less than the 11 GB of the 1080 Ti/2080 Ti" criticism.
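For reference, that's essentially what the XSX actually does: ten chips on a 320-bit bus, six of them 2 GB, which yields a fast 10 GB pool across all channels and a slower 6 GB pool on the 2 GB chips only:

```python
# The Series X arrangement: 320-bit bus, 6x 2 GB + 4x 1 GB GDDR6 at 14 Gbps.
# The first 1 GB of every chip forms a 10 GB pool over all 320 bits;
# the remaining 6 GB lives only on the six 2 GB chips (192 bits).
GBPS = 14
fast_bw = GBPS * 320 / 8        # GB/s for the 10 GB pool
slow_bw = GBPS * 192 / 8        # GB/s for the 6 GB pool

print(f"10 GB pool: {fast_bw:.0f} GB/s")   # 560 GB/s
print(f" 6 GB pool: {slow_bw:.0f} GB/s")   # 336 GB/s
```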
Let me be the pessimistic one and say those numbers are from their high-end offering. Have we forgotten "Poor Volta" already?

And that performance is more than good enough for the majority. If it's akin to RV770 then it will win based on the price alone. The Radeon HD 4850 and 4870 were dirt cheap for the performance they offered.

RDNA2 supposedly has a less complex PCB (only a 256-bit memory bus) and uses GDDR6 (cheaper).

For some reason, $700 graphics cards have been normalised and are talked about as being midrange.

I don't think most people would refer to a $700 card as midrange, for sure. Up until Pascal we had a reasonable price-to-performance ratio from both parties, with better perf at each price point every generation. Of course with Turing it stagnated, and this is where we saw $700 graphics cards being "normalised", as you say. AMD's GPUs at the time could not compete beyond the upper mid-range, which relegated the high end to Nvidia. We saw the "mid-range" moving from a $199 GTX 960 to a $249 GTX 1060 to a $349 GTX. One has to keep in mind that inflation and the price of silicon (on a $/mm2 basis) have been going up rather significantly, so prices going up is not just pure profiteering.

And FWIW, I think AMD underpriced RV770 (not that I'm complaining; I happily bought an HD 4850 to replace my 8800 GT at the time), and could easily have priced it a little higher and made some more money. It's unlikely they will repeat this. As evidenced by Zen 2, and more so by Zen 3, AMD will price at a premium if they can.
It's not necessarily the same process on the same production lines at TSMC. And AMD can't sell four chiplets for 1600 USD to the same guy, but they can sell two chiplets and a $700 GPU to one guy for 1500 USD.

(yes, I know there's Threadrippers, but the point should be obvious)
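The per-buyer arithmetic being gestured at, spelled out with the post's round numbers (the $800 two-chiplet CPU price is purely illustrative, not a specific SKU):

```python
# Per-customer revenue, using the round numbers from the post above.
# The $800 two-chiplet CPU is illustrative, not a specific SKU.
cpu_price = 800                       # a two-chiplet CPU
gpu_price = 700

two_cpus = 2 * cpu_price              # 1600 USD -- the sale that doesn't happen
cpu_plus_gpu = cpu_price + gpu_price  # 1500 USD -- the sale that does

print(f"Two CPUs to one buyer:  {two_cpus} USD (nobody does this)")
print(f"CPU + GPU to one buyer: {cpu_plus_gpu} USD")
```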

Unless Navi is on 7nm+, they are both going to be on the same 7nm process. Even otherwise, it's more about the total wafer allocation AMD has secured, and obviously they are competing against other players for it as well.

Aside from Threadripper, there is of course also EPYC. With Milan, AMD has a very strong product at a time when Intel has badly faltered. With Ice Lake server parts reportedly delayed and underwhelming, AMD has to make hay until Sapphire Rapids in late 2021/early 2022. On top of that, AMD is also experiencing record demand for its APUs at the moment. All of this will certainly push their wafer allocation more towards CPUs/APUs.
If we follow that logic, there's no reason for AMD to produce anything except EPYCs and Threadrippers, since the frequencies there are much closer to the V/F sweet spot (variation in silicon quality is less pronounced), the margins are far greater, and so on. There's also the "professional GPU" market: the former Quadro/FireGL cards (or whatever they're called now) usually sell for thousands of dollars, or even $10k+ (as in the case of the RTX 8000).

IIRC Nvidia once said that the R&D for the professional parts is paid for by the consumer parts. The professional lineup would not be able to sustain itself on a standalone basis, or at least it couldn't at that point (circa 2016-2017). Today Nvidia could perhaps survive on professional alone, but gaming is still the majority of their revenue. It's also obviously good to diversify your revenue sources. If AMD had decided to focus only on Opteron back in the Athlon 64 days, they'd likely have died out by mid-2015 without a consumer line to keep them going as their server market share plummeted. Either way, the current supply situation is likely short-term and exacerbated by the console ramp for the launches. It should ease by next quarter with the capacity vacated by Huawei and Apple.
 
It's not the strangest idea - kind of like the GTX 970 and 3.5GB + 0.5GB. The vast majority of titles will fit happily into 12GB and have full and uniform memory bandwidth. In the rare situation where you overflow that 12GB, keeping the extra data in the (slower) remaining 4GB is still going to be a heck of a lot faster than swapping it back and forth over the PCIe bus to system memory.
You could even use the slower part of the memory as a streaming buffer, if you have enough copy engines and they work really asynchronously.

The toughest part would be effectively marketing it.
My guess would be drivers.
 
It's not the strangest idea - kind of like the GTX 970 and 3.5GB + 0.5GB. The vast majority of titles will fit happily into 12GB and have full and uniform memory bandwidth. In the rare situation where you overflow that 12GB, keeping the extra data in the (slower) remaining 4GB is still going to be a heck of a lot faster than swapping it back and forth over the PCIe bus to system memory.

Depending on the cost difference between 1GB vs 2GB GDDR6 it might make sense. Costs you nothing from a BOM or PCB design perspective, and also means you can release a cost-reduced 12GB card without making any physical changes to the PCB layout. The toughest part would be effectively marketing it.
Not the same as the 970; the 970 was a mess where you had to access either the 3.5 GB or the 0.5 GB and couldn't access both at the same time, or something along those lines. NVIDIA has used 1 GB + 2 GB chips on some of their lower-end models though, and there's no reason AMD couldn't, but it's really unlikely for high-end IMO.
 
Maybe it has 128 CUs?
160?



As I posted earlier:
A Navi 10-style design scaled to 160 CUs would be about that size (10,240 ALU lanes), with 4 shader engines using up about 389 mm². So one could ask: why did AMD not go with 160 CUs?
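The lane arithmetic behind that figure, with a placeholder clock just to put a throughput number on it (the 2.0 GHz is my assumption):

```python
# ALU-lane arithmetic for a hypothetical 160 CU RDNA part.
# The 2.0 GHz clock is a placeholder, not a leaked spec.
CUS = 160
LANES_PER_CU = 64                       # RDNA: two SIMD32 units per CU
lanes = CUS * LANES_PER_CU              # 10240 ALU lanes

clock_ghz = 2.0                         # assumed
tflops = lanes * 2 * clock_ghz / 1000   # FMA = 2 flops per lane per clock

print(f"{CUS} CUs -> {lanes} ALU lanes")
print(f"At {clock_ghz} GHz: ~{tflops:.1f} TFLOPS FP32")
```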
 
How much memory bandwidth would you need to feed 160 CUs? If you believe Navi 10 is a reasonably balanced design, then 4x a 5700 XT is ~1.8 TB/s.
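Worked out from the 5700 XT's published numbers, taking the linear-scaling assumption at face value:

```python
# Naive linear bandwidth scaling from Navi 10 (the 5700 XT).
navi10_bw = 14 * 256 / 8        # 14 Gbps GDDR6 on 256-bit = 448 GB/s
navi10_cus = 40

target_cus = 160
scaled_bw = navi10_bw * target_cus / navi10_cus

print(f"Navi 10: {navi10_bw:.0f} GB/s for {navi10_cus} CUs")
print(f"{target_cus} CUs at the same ratio: ~{scaled_bw / 1000:.1f} TB/s")
```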
 
I'm thinking that with the newest HBM2E, a 128 or even 160 CU GPU should have enough bandwidth.

Probably wouldn't even need 1.8 TB/s; maybe just 384-bit GDDR6/6X, while still having the large cache.
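Rough ceilings for the options mentioned here; the per-pin data rates are typical figures of the time and are my assumptions:

```python
# Rough peak-bandwidth ceilings for the memory options mentioned above.
# Per-pin data rates are assumed typical values, not product specs.
def bw(bits, gbps):
    return bits * gbps / 8      # GB/s

hbm2e_stack = bw(1024, 3.2)     # ~410 GB/s per stack at 3.2 Gbps
print(f"HBM2E, 4 stacks:    ~{4 * hbm2e_stack / 1000:.1f} TB/s")
print(f"384-bit GDDR6 @16:   {bw(384, 16):.0f} GB/s")
print(f"384-bit GDDR6X @19:  {bw(384, 19):.0f} GB/s")
```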
 
According to RedGamingTech, the numbers we saw for Big Navi in the Zen 3 keynote were from either the 64 CU or the 72 CU part, i.e. not the full die.
 
If it was the 72 CU part, I don't believe an 80 CU chip will be a lot faster (without pushing the frequency)... But if it was the 64 CU one, and an 80 CU part is planned too, this could be fun.
 
Still very intrigued why they would go with 256-bit when the XSX has 320-bit.
I suspect the XSX only actually needs that bandwidth for XB1X BC.
- the XB1X had 326 GB/s of bandwidth (vs a paltry 218 GB/s on the PS4 Pro).
- a focus of the console (or the PR focus, anyway) is 'better BC': doubling title performance to 60/120 fps, etc.
- if RDNA2 is using cache/compression to make up for bandwidth, then that probably isn't going to provide much of a reliable boost in BC mode.
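The raw numbers behind that argument; the last-gen figures are the ones quoted above, the XSX pool is its published spec, and the 256-bit alternative is a hypothetical for comparison:

```python
# Bandwidth comparison behind the BC argument. XB1X and PS4 Pro figures are
# the ones quoted above; the 256-bit case is hypothetical (14 Gbps assumed).
xb1x = 326                        # GB/s, Xbox One X
ps4_pro = 218                     # GB/s, PS4 Pro
xsx_fast = 14 * 320 / 8           # 560 GB/s, Series X 10 GB pool
hypo_256 = 14 * 256 / 8           # 448 GB/s, a 256-bit bus at the same rate

print(f"XB1X:             {xb1x} GB/s")
print(f"PS4 Pro:          {ps4_pro} GB/s")
print(f"XSX 10 GB pool:   {xsx_fast:.0f} GB/s ({xsx_fast / xb1x:.2f}x XB1X)")
print(f"256-bit @14 Gbps: {hypo_256:.0f} GB/s ({hypo_256 / xb1x:.2f}x XB1X)")
```

A 256-bit bus would leave only ~1.4x the XB1X's raw bandwidth to double frame rates with, and as noted, cache/compression gains may not show up reliably in BC mode.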
 