OK, it’s probably true that the shader doesn’t need to inspect the contents of the node in order to schedule it. But that doesn’t seem to be a notable benefit of shader-based scheduling, given it’s also the case for Nvidia’s fixed-function approach.
The original comparison was with a hypothetical RT block that only returned intersection results without performing traversal, which would leave the SIMD having to issue explicit vector memory reads, to data the RT unit had already fetched and parsed, just to determine the next node addresses. AMD's method is at least less redundant than that.
AMD’s patent calls for storing traversal state in registers and the texture cache. It would seem the shader is responsible for managing the traversal stack for each ray, and that stack presumably lives in L0. I don’t see how you would avoid thrashing the cache if you try to do anything else alongside RT, unless of course you have an “infinite” amount of cache.
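To make that concrete, here's a rough sketch of what such a hybrid loop might look like, with plain C++ standing in for shader code. Everything in it (intersect_node, the node result layout, the 32-entry stack) is assumed for illustration, not taken from the patent: the fixed-function step evaluates one node and hands back child pointers, while the shader owns the per-ray stack that has to live somewhere in registers, LDS, or the cache.

```cpp
#include <cstdint>

// Hypothetical result of the fixed-function "intersect ray against node" step.
struct NodeResult {
    uint32_t child_ptrs[4];   // children the shader should push (box node)
    uint32_t num_children;    // 0 on a miss or at a leaf
};

// Stand-in for the RT unit: it fetches and parses the node itself, so the
// shader never has to inspect the node's contents to schedule the next step.
NodeResult intersect_node(uint32_t node_ptr) {
    (void)node_ptr;
    return NodeResult{{0, 0, 0, 0}, 0};   // placeholder: always reports a miss
}

void traverse_ray(uint32_t root_ptr) {
    // Per-ray traversal stack: this is the state that would sit in VGPRs/LDS
    // (or spill toward the cache) and compete with whatever else the CU runs.
    uint32_t stack[32];                   // depth chosen arbitrarily
    int sp = 0;
    stack[sp++] = root_ptr;

    while (sp > 0) {
        uint32_t node = stack[--sp];
        NodeResult r = intersect_node(node);        // fixed-function evaluation
        for (uint32_t i = 0; i < r.num_children; ++i)
            stack[sp++] = r.child_ptrs[i];          // shader-side scheduling
    }
}

int main() { traverse_ray(0); }
```

The point of the sketch is just where the state sits: the loop and the stack belong to the shader, so every ray in flight drags its own stack footprint along with it.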
AMD's patent doesn't clearly outline where the intermediate work between node evaluations resides. It highlights that the SIMD and CU have substantial storage available at no additional cost, versus the likely hardware footprint of providing sufficient storage on an independent unit.
AMD's comparison is between its hybrid method and a dedicated unit that could traverse a BVH to arbitrary depth without ever redoing traversal, i.e. one that never loses the full context of what has already been traversed.
Nvidia's scheme appears to have a traversal stack of finite depth that can lead to redundant node traversal, which makes it less expensive than what AMD was using as its baseline.
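As a toy illustration of where that redundancy comes from (generic short-stack behaviour, not Nvidia's actual scheme, and the 4-entry depth is made up): a bounded stack has to drop its oldest pending entry on overflow, and recovering that work later means restarting from the root and re-intersecting interior nodes that were already visited once.

```cpp
#include <cstdint>
#include <cstdio>

constexpr int kStackDepth = 4;   // assumed finite hardware depth

// Bounded traversal stack that silently drops its oldest entry when full.
struct ShortStack {
    uint32_t entries[kStackDepth];
    int      top = 0;
    bool     dropped = false;

    void push(uint32_t node) {
        if (top == kStackDepth) {                   // full: shift everything down,
            for (int i = 1; i < kStackDepth; ++i)
                entries[i - 1] = entries[i];
            --top;
            dropped = true;                         // ...losing the bottom entry
        }
        entries[top++] = node;
    }

    bool pop(uint32_t* node) {
        if (top == 0) return false;
        *node = entries[--top];
        return true;
    }
};

int main() {
    ShortStack st;
    for (uint32_t n = 0; n < 6; ++n)                // push deeper than the stack
        st.push(n);
    // Once 'dropped' is set, the pending subtree can only be recovered by
    // restarting from the root, which re-traverses already-visited nodes.
    printf("dropped pending work: %s\n", st.dropped ? "yes" : "no");
    return 0;
}
```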
Whether AMD's method uses registers, LDS, or possibly spills to memory isn't spelled out. Even if there were spills to memory, writing pointers and metadata from completed RT node evaluations out to something like a stack seems like it could be less disruptive than the SIMD re-gathering node data on its own.
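A back-of-the-envelope way to see why: a spilled stack entry is just a pointer plus a little metadata, while re-gathering means vector loads of the full node the RT unit had already fetched and parsed. The layouts below are invented for illustration; neither the patent nor any actual node format specifies them.

```cpp
#include <cstdint>

// Hypothetical record the shader might spill: the pointer and metadata handed
// back by a completed RT node evaluation.
struct SpilledStackEntry {
    uint32_t node_ptr;
    uint32_t flags;                // e.g. which children are still pending
};

// Hypothetical 4-wide box node the SIMD would have to re-read on its own if it
// re-gathered instead of spilling.
struct Bvh4Node {
    float    child_bounds[4][6];   // four child AABBs (min/max per axis)
    uint32_t child_ptrs[4];
    uint32_t misc[4];              // flags, leaf counts, padding, etc.
};

// Spilling ~8 bytes per deferred node vs. re-fetching a ~128-byte node is well
// over an order of magnitude less traffic.
static_assert(sizeof(SpilledStackEntry) == 8, "compact spill record");
static_assert(sizeof(Bvh4Node) > 8 * sizeof(SpilledStackEntry), "full node is much larger");

int main() { return 0; }
```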
Curious, aren't all ROPs typically tied to caches in past and current gens?
IIRC the difference with RDNA is that compute is now tied in with the L2 cache, whereas with GCN it went directly to the memory controller. But I think ROPs are unchanged.
ROPs were linked to memory channels until Vega, which made them clients of the L2.
RDNA makes them a client of the new intermediate L1 for at least some of their traffic.
GCN had a read-write L2, and whether compute used the cache depended more on what settings were used for the memory accesses. The choice would be based on the level of coherence needed for the data.
See, e.g., this older post by sebbbi:
https://forum.beyond3d.com/posts/1934106/
With respect to RDNA, it does look like they changed how the RBs access data, however.
Render back-ends have had relatively small per-RBE caches throughout the generations. There's evidence that the RBEs still have caches with RDNA, though I haven't seen specific capacities given.