AMD Radeon RDNA2 Navi (RX 6500, 6600, 6700, 6800, 6900 XT)

nutball · Oct 30, 2020

digitalwanderer said:
Can I get my hopes up a little or will they be crushed again, that's all I want to know. :/

You can, just so long as you don't jinx it for the rest of us

marifire · Oct 30, 2020

CarstenS said:
That's what it is.

Yeah, I didn't notice they had already shown the slimmer design.

Picao84 · Oct 30, 2020

A bit of a random question: if a card has 2 x 8pin connectors while it has a TGP of 220W-250W, do we really need to plug in 2 x 8pin? Could it work with 1 x 6 pin + 1 x 8 pin?

Kaotik · Oct 30, 2020

Picao84 said:
A bit of a random question: if a card has 2 x 8pin connectors while it has a TGP of 220W-250W, do we really need to plug in 2 x 8pin? Could it work with 1 x 6 pin + 1 x 8 pin?

Are people still actually using PSUs where PCIe leads have something else than 8 or 6+2 plugs?
Most likely the card will check that both are plugged in properly even if 8+6 would be enough for the power draw

tunafish · Oct 30, 2020

Picao84 said:
A bit of a random question: if a card has 2 x 8pin connectors while it has a TGP of 220W-250W, do we really need to plug in 2 x 8pin? Could it work with 1 x 6 pin + 1 x 8 pin?

Assuming the PSU can handle it, yes. The PCIe slot itself provides up to 75W, each 6-pin connector provides another 75W, and each 8-pin connector provides 150W. Slot power + 6-pin + 8-pin = 300W.

(The two extra pins in an 8-pin connector really only exist to communicate the fact that the cables are thick enough to supply 150W, they are not really used.)
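
A rough sketch of that arithmetic in Python, using only the per-source limits quoted above. The function names are made up for illustration, and this says nothing about whether a given card will actually start with a 6-pin plugged into an 8-pin socket.

```python
# In-spec power limits in watts, as quoted in the post above.
SLOT_W = 75
CONNECTOR_W = {"6-pin": 75, "8-pin": 150}

def power_budget(connectors):
    """Total in-spec power (W) for the slot plus the listed aux connectors."""
    return SLOT_W + sum(CONNECTOR_W[c] for c in connectors)

def headroom(connectors, tgp_watts):
    """Watts left over after the card's TGP; negative means out of spec."""
    return power_budget(connectors) - tgp_watts

print(power_budget(["6-pin", "8-pin"]))   # 300
print(power_budget(["8-pin", "8-pin"]))   # 375
print(headroom(["6-pin", "8-pin"], 250))  # 50
```

So a 250W TGP card would still have about 50W of in-spec margin on 6+8, on paper.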

Kaotik said:
Most likely the card will check that both are plugged in properly

Yeah, you'd probably have to use a 6-to-8 adapter.

Picao84 · Oct 30, 2020

Kaotik said:
Are people still actually using PSUs where PCIe leads have something else than 8 or 6+2 plugs?
Most likely the card will check that both are plugged in properly even if 8+6 would be enough for the power draw

Yeah, I've just checked and it's 2 x 6+2. My bad. *facepalm*

tsa1 · Oct 30, 2020

In general, all power connectors have to be plugged into the GPU even if it does not really need them (e.g. my V56 Nitro+ LE has 3x8 pin, it never goes over 260W at stock settings, but it won't work without the third 8 pin).

Kaotik · Oct 30, 2020

tsa1 said:
In general, all power connectors have to be plugged into the GPU even if it does not really need them (e.g. my V56 Nitro+ LE has 3x8 pin, it never goes over 260W at stock settings, but it won't work without the third 8 pin).

Unless you got paperclips! Paperclips can easily make your 6-pin connectors 8-pin compatible :yes:

tsa1 · Oct 30, 2020

Kaotik said:
Paperclips can easily make your 6-pin connectors 8-pin compatible

Well, you can also use those 2x Molex => 8-pin cables, but they are very shoddy in general (and PSUs which don't have the required connectors will probably die, as my 600W, 9-year-old FSP PSU did after a month with a heavily OC'd Vega), so personally I wouldn't go with that.

DmitryKo · Oct 30, 2020

andermans said:
The 256 MiB BAR has been exposed for a while already in Vulkan

Lurkmass said:
The Ryzen 5000/RX 6000 series (Zen 3/RDNA 2) enables the host to seamlessly stream data into any memory region on the device by removing this 256MB limit

Microsoft WDK documentation on GPUMMU states the 256 Mbyte limit is simply a default value set by device firmware to fit into 32-bit virtual address space, and Vulkan is implemented as a user-mode driver which has to work through the WDDM 2.0 kernel-mode driver (DXGK).
The PCIe standard supports BAR Size from 1 MB to 512 GB.

I don't think PCIe Resizable BAR is exclusive to RDNA architecture either; need to look into Linux driver code for a list of supporting hardware though.
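
To make the 1 MB to 512 GB range concrete, here is a small Python sketch. It assumes the Resizable BAR size field encodes a power of two, with encoding 0 meaning 1 MB and each step doubling the size; that is my recollection of the capability layout, so check it against the PCIe spec before relying on it. The function names are hypothetical.

```python
MAX_ENCODING = 19  # assumed upper bound, matching the 512 GB figure above

def rebar_size_bytes(encoding):
    """BAR size in bytes for a size encoding (assumed: 2**(encoding + 20))."""
    if not 0 <= encoding <= MAX_ENCODING:
        raise ValueError("encoding outside the assumed 0..19 range")
    return 1 << (encoding + 20)

def smallest_encoding_covering(vram_bytes):
    """Smallest encoding whose BAR size is at least vram_bytes."""
    for encoding in range(MAX_ENCODING + 1):
        if rebar_size_bytes(encoding) >= vram_bytes:
            return encoding
    raise ValueError("no encoding in the assumed range is large enough")

MIB = 1 << 20
GIB = 1 << 30
print(rebar_size_bytes(8) // MIB)            # 256
print(rebar_size_bytes(19) // GIB)           # 512
print(smallest_encoding_covering(16 * GIB))  # 14
```

Under that assumption the default 256 MB window is encoding 8, and a 16 GB card would need encoding 14 to expose all of its memory.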

vjPiedPiper said:
knowing that the host is PCI-E 4.0 and the GPU is PCI-E 4.0 ... might enable slightly different optimisations in the driver stack etc...

andermans said:
all non-driver allocated memory is CPU-cacheable and hence you need to enable cache-snooping for that memory on the GPU, which makes the GPU accesses to this memory slower. When you allocate using the driver you can choose for cacheable or non-cacheable. (see the 2 memory types for heap 1)

Yes, it's not about the bandwidth, it's about cache coherence.

If the GPU can use system memory just like its own local video memory (and vice versa, the CPU can use local video memory as if it was system memory), you have to either synchronise the GPU and CPU caches using some cache coherence protocol over the PCIe bus - preferably something more complex than bus snooping - or completely disable caching for this physical memory pool, with a detrimental effect on performance.

Deleted member 2197 · Oct 30, 2020

Erinyes said:
Yea it's not anything proprietary, just a slight performance increase similar to smartshift in laptops. And AMD is obviously trying to incentivize people to buy their own CPU+GPU combinations, nothing wrong with it. The competition are free to do something of their own.

From a development standpoint, would studios need to change to specific CPU+GPU combinations? Or is it a "modify and forget" type code change not requiring any testing or validation?

Svensk Viking · Oct 30, 2020

tsa1 said:
Well, you can also use those 2x Molex => 8-pin cables, but they are very shoddy in general (and PSUs which don't have the required connectors will probably die, as my 600W, 9-year-old FSP PSU did after a month with a heavily OC'd Vega), so personally I wouldn't go with that.

I would be hesitant to go with Molex for 8-pin, but I'd probably still try that rather than buying a new PSU.

That's the good part when you have old PCs that aren't worth a dime anymore, so you don't fear being risky with going against all recommendations. Back when I started putting together my own PCs 15 years ago, I too was always overpaying for the PSU and I didn't use adapters because everyone said it was so risky, but today I have used 10+ year old PSUs with 2xMolex to 6 pin (most power hungry was Core 2 Quad and HD 4870) without issues, so today I don't pay much mind towards my PSUs anymore.
And the few times my PSUs have died, the oldest one being a no-name brand from a Pentium 4 prebuilt, it didn't damage the rest of the hardware either.

Erinyes · Oct 30, 2020

pharma said:
From a development standpoint, would studios need to change to specific CPU+GPU combinations? Or is it a "modify and forget" type code change not requiring any testing or validation?

As per AMD, it should provide some gains without any optimization from the developer end, but can provide even bigger gains if specifically optimized for. From what we've seen so far, it doesn't seem to be something that can't be enabled on Intel & Nvidia, so it might be possible on those platforms in the future. We should find out more during the architecture deep dive around the launch.

pjbliverpool · Oct 30, 2020

DmitryKo said:
Yes, it's not about the bandwidth, it's about cache coherence.

If the GPU can use system memory just like its own local video memory (and vice versa, the CPU can use local video memory as if it was system memory), you have to either synchronise the GPU and CPU caches using some cache coherence protocol over the PCIe bus - preferably something more complex than bus snooping - or completely disable caching for this physical memory pool, with a detrimental effect on performance.

So is that what SAM is actually doing then? Allowing each device to see the other's memory pool as if it were its own and keeping the caches between both CPU and GPU coherent? So this becomes similar to a UMA?

andermans · Oct 30, 2020

pjbliverpool said:
So is that what SAM is actually doing then? Allowing each device to see the other's memory pool as if it were its own and keeping the caches between both CPU and GPU coherent? So this becomes similar to a UMA?

No, the caching stuff was available before this already since forever (pre-GCN at least). The SAM change is that the CPU can access 100% of the GPU memory directly by resizing the BAR.

As an example that can already be enabled on many X399 (threadripper) boards, though not under the Smart Access Memory name, and maybe AMD did some driver optimizations to make better use of it in Direct3D 9/10/11.
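
On Linux, one way to see how much of a card's memory the CPU can address is to read the BAR ranges from sysfs. A minimal Python sketch, assuming the `resource` file under `/sys/bus/pci/devices/<address>/` holds one `start end flags` hex triple per line. The sample text and the 8 GiB VRAM figure are made up for illustration, and the helper names are hypothetical.

```python
def parse_bar_sizes(resource_text):
    """Return the size in bytes of each region listed in a sysfs 'resource' file."""
    sizes = []
    for line in resource_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not a start/end/flags triple
        start, end, _flags = (int(field, 16) for field in fields)
        # An all-zero entry marks an unused region.
        sizes.append(0 if start == 0 and end == 0 else end - start + 1)
    return sizes

def cpu_visible_fraction(bar_bytes, vram_bytes):
    """Fraction of VRAM reachable through a BAR of the given size."""
    return min(bar_bytes, vram_bytes) / vram_bytes

# Made-up sample: a 256 MiB region, an unused region, and a 2 MiB region.
SAMPLE = (
    "0x00000000e0000000 0x00000000efffffff 0x000000000014220c\n"
    "0x0000000000000000 0x0000000000000000 0x0000000000000000\n"
    "0x00000000f0000000 0x00000000f01fffff 0x000000000014220c\n"
)

sizes = parse_bar_sizes(SAMPLE)
print(sizes)                                     # [268435456, 0, 2097152]
print(cpu_visible_fraction(sizes[0], 8 << 30))   # 0.03125
```

With the default 256 MiB window, only about 3% of a hypothetical 8 GiB card is directly visible to the CPU at once; a resized BAR covering all of VRAM would bring that to 1.0.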

LordEC911 · Oct 30, 2020

DmitryKo said:
Microsoft WDK documentation on GPUMMU states the 256 Mbyte limit is simply a default value set by device firmware to fit into 32-bit virtual address space, and Vulkan is implemented as a user-mode driver which has to work through the WDDM 2.0 kernel-mode driver (DXGK).
The PCIe standard supports BAR Size from 1 MB to 512 GB.

I don't think PCIe Resizable BAR is exclusive to RDNA architecture either; need to look into Linux driver code for a list of supporting hardware though.

That's interesting... So what is stopping you, in the near future, from using a 512GB NVMe drive and, through DirectStorage, "connecting it to the GPU" and having SAM reconfigure the BAR to include some of the NVMe drive?
I know DirectStorage won't work like that, but would the above scenario even be beneficial?

tsa1 · Oct 30, 2020

Above 4G decoding option is available on all AM4 motherboards, AFAIR, but I actually doubt that it's the only requirement for SAM. For some reason, xGMI links are present and it'd be strange if they are completely unused, considering that HSA was the 'idee fixe' for AMD not that long ago.

DmitryKo · Oct 30, 2020

pjbliverpool said:
Allowing each device to see the other's memory pool as if it were its own and keeping the caches between both CPU and GPU coherent? So this becomes similar to a UMA?

This is the impression I've got from the Tom's Hardware and Dark Side of Gaming articles, which basically say that "CPU and GPU gain full access to each other’s memory".

They also say it's similar to Raven Ridge APUs and Infinity Architecture 3, though the Financial Analyst Day 2020 presentations only mention enterprise-grade EPYC and CDNA (Radeon Instinct) chips having Infinity Architecture 3.

AMD's definition of Smart Access Memory in PR materials is different (and more ambiguous):

In conventional Windows-based PC systems, processors can only access a fraction of graphics memory (VRAM) at once, limiting system performance. With AMD Smart Access Memory, the data channel gets expanded to harness the full potential of GPU memory - removing the bottleneck to increase performance.

People interpreted it as support for Resizable BAR and PCIe 4.0 bandwidth - but AFAIK these two technologies are not really exclusive to RDNA2 (RX 6000) and Zen3 (Ryzen 5000).

andermans said:
No, the caching stuff was available before this already since forever (pre-GCN at least)

Only for system memory pools, but not for local video memory pools.

https://gpuopen.com/learn/vulkan-device-memory/
https://gpuopen.com/events/gdc-2018-presentations/
https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VkMemoryPropertyFlagBits.html
https://computergraphics.stackexchange.com/questions/7504/vulkan-how-does-host-coherence-work

DmitryKo · Oct 30, 2020

LordEC911 said:
So what is stopping you from ... using a 512GB NVME drive and ... having SAM reconfigure the BAR to include some of the NVME drive

NVMe is a block I/O protocol for disk devices - it uses LBA sector numbers to access disk data, which are remapped to actual flash memory addresses by the NVMe controller. It can use PCIe memory mapping for the optional Host Memory Buffer (HMB) feature in entry-level DRAM-less controllers, but flash memory is not visible to the host.
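
A toy Python sketch of why that matters for BAR mapping: the host only ever names logical blocks, and the controller decides where they live in flash. The 512-byte block size and the `ToyFlashTranslationLayer` class are assumptions for illustration only; real controllers are far more involved.

```python
LBA_SIZE = 512  # assumed logical block size in bytes

def lba_range(byte_offset, length, lba_size=LBA_SIZE):
    """Return (first_lba, block_count) covering a byte range on the disk."""
    first = byte_offset // lba_size
    last = (byte_offset + length - 1) // lba_size
    return first, last - first + 1

class ToyFlashTranslationLayer:
    """Hypothetical stand-in for the controller's LBA-to-flash remapping."""

    def __init__(self):
        self.mapping = {}   # LBA -> physical page, known only to the controller
        self.next_free = 0

    def write(self, lba):
        # Each write lands on a fresh page, so the physical location moves.
        self.mapping[lba] = self.next_free
        self.next_free += 1
        return self.mapping[lba]

    def lookup(self, lba):
        return self.mapping.get(lba)

print(lba_range(0, 4096))    # (0, 8)
print(lba_range(1000, 100))  # (1, 2)

ftl = ToyFlashTranslationLayer()
ftl.write(7)
ftl.write(7)                 # rewriting the same LBA moves it to another page
print(ftl.lookup(7))         # 1
```

Since the physical location behind an LBA can change on every write, there is no stable flash address that a BAR could expose to the host.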

Deleted member 2197 · Oct 30, 2020

tsa1 said:
Above 4G decoding option is available on all AM4 motherboards.

Interesting. I wondered what that toggle was for in the Intel X99 MB bios.
