AMD: Navi Speculation, Rumours and Discussion [2019-2020]

New slide from AMD: is the 5700 XT really faster than the RTX 2070? It's barely 17% faster than a 1070 in 3 major titles! At this point it would be lucky to actually match the 2070.

[image: 1ek6yim5w5431.png]
 
It's hard to interpret these results, since the CPUs are different.
This, plus the slide looks fake/strange? It's clearly not from the "AMD: Next Horizon Of Gaming Tech Day" slide deck, as it says "The Next Generation Of Compute" and features the old orange/black template. What the hell does the Multi-Thread graph represent, and why is it offset relative to the other? Alien Isolation? Hitman 1? Rise of the Tomb Raider? Why benchmark 3-4 year old games when the most recent genuine slide deck (Next Horizon) contains none of them? :neutral:

[image: lll.jpg]
 
I understand the logic, but it's not fake. It was leaked from R7 3700X promo material; the leaker is well known and has leaked many things before.



That doesn't mean it's not a fabrication. Comparing against 3 year old hardware in 3 year old games and producing a result which can only be described as laughable doesn't stand to reason.

Why doesn't the Multi-Thread column line up with the rest? And what's with "The most and fastest connectivity"?
 
Most of what you're asking is beyond what I was briefed on and is beyond my own expertise. But I'll answer what I can.
Yes. There are 2 shader engines.
I believe a lot of this is stylistic, but a lot of work has gone into improving their work (re)distribution. It's something I need to look more into.

A new feature called Priority Tunneling has been added. Notably, this is not context switching. But it does allow the AWS to go to the top of the execution pipeline and block any new work being issued, so that it can be drained and a compute workload started immediately thereafter.

Can AMD do 3 shader engines and 96 ROPs similar to Nvidia's biggest chips, or do they still have to go for 4 SEs and be content with 64 ROPs, since 128 would be too much? Any plans on doing a Vega vs. Polaris clockspeed jump with new Navi cards on the same process?

New slide from AMD: is the 5700 XT really faster than the RTX 2070? It's barely 17% faster than a 1070 in 3 major titles! At this point it would be lucky to actually match the 2070.

[image: 1ek6yim5w5431.png]

3 major titles from 3 years ago?

Looks like RDNA requires driver optimizations like GCN did at launch; Navi cards might end up looking even better by the time they release, at least compared to Pascal.
 
So do devs have to target a 32-wide wave, or is this something that gets decided automatically by the compiler or something?

And further, and this may sound silly, but as a layman who doesn't really understand it all that much: at what point does this sort of thing happen? Are shaders compiled and shipped as a binary, or are they compiled on start-up, or per frame, or what?
 
3 major titles from 3 years ago?
The rest of the titles don't have a stellar fps uplift over the 1070 either, except Warhammer 2 maybe.
Looks like RDNA requires driver optimizations like GCN did at launch,
Looks like it; most of these titles are CPU limited, a usual weak point in AMD's drivers.
That doesn't mean it's not a fabrication.
Once more IT IS NOT a fabrication. No amount of creativity could conjure 3 slides like this (including the box art). You will see once the launch happens.
 
A new set of LLVM changes references GFX1011 and GFX1012 variants of the ISA. These add support for the dot product instructions introduced with Vega 20 (hasdot1insts, hasdot2insts) as well as an additional set (hasdot5insts, hasdot6insts).
The general feature flags appear consistent with GFX1010, although GFX1011 is missing a bug for LDS usage during workgroup processing mode. That might be a bug fix in a newer derivative, or alternatively there's a difference in workgroup mode support.

https://github.com/llvm-mirror/llvm...00adaf0#diff-ad4812397731e1d4ff6992207b4d38fa
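Not from the diff itself, but to give a feel for what those feature flags gate, here's a minimal HIP sketch using one of clang's AMDGPU dot-product builtins. The builtin name (__builtin_amdgcn_sdot4) and its availability on GFX1011/GFX1012 are my assumptions, not something the commit states:

// Hedged sketch: packed 8-bit dot product via a clang AMDGPU builtin.
// __builtin_amdgcn_sdot4(a, b, acc, clamp) treats each int as four signed
// 8-bit lanes, multiplies lane-wise, sums the products and adds 'acc';
// 'clamp' must be a compile-time constant. Assumed legal only on targets
// exposing the dot-insts features this commit touches.
#include <hip/hip_runtime.h>

__global__ void dot4(const int* a, const int* b, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = __builtin_amdgcn_sdot4(a[i], b[i], 0, false);
}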
 
And further, and this may sound silly, but as a layman who doesn't really understand it all that much: at what point does this sort of thing happen? Are shaders compiled and shipped as a binary, or are they compiled on start-up, or per frame, or what?
It "depends". Usually, you ship them either as intermediate code (SPIR-V or D3D shader objects), or you ship them in source code. Except for GLSL source code passed to OpenGL API, that means there are in total two compilation passes, one to vendor independent, API specific intermediate representation, and then inside the driver once again into hardware specific assembly.

Since Vulkan / DX12, you can also ship a shader cache, which contains the native representation for a specific platform, which can save the second compilation step. In older APIs, that cache is an internal driver implementation detail.
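To illustrate with Vulkan (a sketch of my own, not anything from the driver): saving and restoring a VkPipelineCache is how that native representation gets persisted between runs; error handling omitted.

// Hedged sketch: persisting the driver's native pipeline binaries so the
// second launch can skip the IR -> ISA compilation step. 'device' is
// assumed to be an already-created VkDevice.
#include <vulkan/vulkan.h>
#include <fstream>
#include <iterator>
#include <vector>

VkPipelineCache loadPipelineCache(VkDevice device, const char* path) {
    std::ifstream file(path, std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());
    VkPipelineCacheCreateInfo info{VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO};
    info.initialDataSize = blob.size();               // zero on a first launch
    info.pInitialData = blob.empty() ? nullptr : blob.data();
    VkPipelineCache cache = VK_NULL_HANDLE;
    vkCreatePipelineCache(device, &info, nullptr, &cache);
    return cache;
}

void savePipelineCache(VkDevice device, VkPipelineCache cache, const char* path) {
    size_t size = 0;
    vkGetPipelineCacheData(device, cache, &size, nullptr);     // query size
    std::vector<char> blob(size);
    vkGetPipelineCacheData(device, cache, &size, blob.data()); // fetch data
    std::ofstream(path, std::ios::binary).write(blob.data(), size);
}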
so do devs have to target a 32 wide wave or is this something that gets decided automatically by the compiler or something?
It's not exposed explicitly. To start with, it only makes a difference when explicitly writing a compute shader with a specified workgroup size. For all non-compute work, it's an invisible implementation detail.
For compute, you explicitly choose the size of the workgroup (the total number of threads), but the wave size then used to provide the requested workgroup size is once again a driver-internal implementation detail.
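As a concrete example (my own sketch, not from any presentation), this compute shader declares a 64-thread workgroup; nothing in the source says wave32 or wave64, that mapping is left to the driver:

// Hedged sketch: the only size a developer declares is the workgroup size.
// Whether a Navi driver runs this as two wave32s or one wave64 is the
// internal implementation detail described above.
const char* kComputeShaderSrc = R"glsl(
#version 450
layout(local_size_x = 64, local_size_y = 1, local_size_z = 1) in;  // 64 threads
layout(std430, binding = 0) buffer Data { float values[]; };
void main() {
    values[gl_GlobalInvocationID.x] *= 2.0;
}
)glsl";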
 
So do devs have to target a 32-wide wave, or is this something that gets decided automatically by the compiler or something?

And further, and this may sound silly, but as a layman who doesn't really understand it all that much: at what point does this sort of thing happen? Are shaders compiled and shipped as a binary, or are they compiled on start-up, or per frame, or what?
Shaders are shipped as bytecode and compiled on the client for the installed hardware. This can take some time (my project takes 30 min for NV, but interestingly less than 1 min on AMD).
To avoid this, the compiled results are stored on disk so they are ready at the second launch of the app (shader cache). At the first launch, compiling in the background can cause stuttering and bad performance; bad initial code may be replaced with optimized code after some seconds. Important to know for benchmarks.
I don't know how this behavior varies between APIs, or if it's possible to ship precompiled shaders for most hardware.
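For reference, the client-side step described above looks roughly like this in Vulkan (my sketch, with my assumptions): creating the module from shipped SPIR-V is cheap, and the expensive compile to hardware ISA typically happens later, at pipeline creation, which is exactly what the on-disk cache hides on the second launch.

// Hedged sketch: turning shipped SPIR-V bytecode into a shader module.
#include <vulkan/vulkan.h>
#include <cstdint>
#include <vector>

VkShaderModule createModule(VkDevice device, const std::vector<uint32_t>& spirv) {
    VkShaderModuleCreateInfo info{VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO};
    info.codeSize = spirv.size() * sizeof(uint32_t);  // size in bytes
    info.pCode = spirv.data();
    VkShaderModule module = VK_NULL_HANDLE;
    vkCreateShaderModule(device, &info, nullptr, &module);  // cheap step
    return module;  // the real ISA compile usually happens at pipeline creation
}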

On console I guess shipping compiled shaders makes more sense, and likely the manual choice between 64 / 32 waves will also be possible for compute.
 
Once more IT IS NOT a fabrication. No amount of creativity could conjure 3 slides like this (including the box art).
[image: D89YkNjVUAAoUv9.jpg]

Edit: To make it short: the word "Ryzen" appears without the "TM", and the product logo floats unaligned with the company logo. It's not even the official Ryzen 7 logo used in the current presentations. Those slides look different. Whether they are edited, outdated, or made in haste, I wouldn't take such content as predictive.

Edit 2: Here is the background image, and here is an image of the box.
 
Random things I noticed on later skimming of the LLVM commit I linked to earlier:
While the dot product instructions do have a lot of commonality between GFX1011 and GFX1012, there are some differences, like GFX1012 showing extra mentions of atomic instructions that the other GFX10 variants lack. This may matter more in a compute scenario.

GFX1011 has a sparse set of entries for errors with some instructions and little documentation. I suppose it'd be fun to mention a few lines here:
image_bvh_intersect_ray v[4:7], v[9:24], s[4:7]
// GFX10: error:

image_bvh_intersect_ray v[4:7], v[9:16], s[4:7] a16
// GFX10: error:

image_bvh64_intersect_ray v[4:7], v[9:24], s[4:7]
// GFX10: error:

image_bvh64_intersect_ray v[4:7], v[9:24], s[4:7] a16
Since this is a skeleton of an error file, I am not sure what prompted their mention here at this time.
 
What's up with the funny text? That's a weird choice of font for a professional marketing setting. LOL. It looks like the Photoshop version of someone cutting out letters individually from a magazine and pasting a message together by hand. The inconsistent spacing between the words and letters really offends my sense of symmetry. It looks totally out of place in promotional material.
 
[image: D89YkNjVUAAoUv9.jpg]

Edit: To make it short: the word "Ryzen" appears without the "TM", and the product logo floats unaligned with the company logo. It's not even the official Ryzen 7 logo used in the current presentations. Those slides look different. Whether they are edited, outdated, or made in haste, I wouldn't take such content as predictive.

Edit 2: Here is the background image, and here is an image of the box.
Actually, I can confirm that this and the other slides posted by Komachi Ensaka are real and straight from AMD. They're from Travis Kirsch's and Don Woligroski's "Ryzen 3000 Series Deep Dive" presentation.

This, plus the slide looks fake/strange? It's clearly not from the "AMD: Next Horizon Of Gaming Tech Day" slide deck, as it says "The Next Generation Of Compute" and features the old orange/black template. What the hell does the Multi-Thread graph represent, and why is it offset relative to the other? Alien Isolation? Hitman 1? Rise of the Tomb Raider? Why benchmark 3-4 year old games when the most recent genuine slide deck (Next Horizon) contains none of them? :neutral:

[image: lll.jpg]

This is legit too, from Laura Smith's and Mithun Chandrasekhar's "Radeon 5700 Breakout" presentation.

edit: and the one posted by @DavidGraham is legit too; it's from the "Ryzen 3000 Series Deep Dive" presentation as well

edit2: read too quickly, @Ike Turner. You knew your slide was legit but were wondering about DavidGraham's. Oh well, no harm in double confirming yours.
 
Still haven't seen anything regarding actual power consumption in any of the documentation...

Can anyone make sense of this slide along with the endnotes?

[image: perfwatt.JPG]


It would appear that the performance/watt comparison was based on a Division 2 run at 2560x1440 Ultra settings, with a 40 CU Navi vs a "Vega 64" with 40 CUs enabled. So that 14% more performance doesn't seem great vs a 40 CU Vega 64, but the 23% less power is interesting if a Vega 56 is around 210 W.
Edit: the 14% increase in performance is exactly the same as the clock difference between Navi's gaming clockspeed and Vega 64's boost clockspeed...

Then the area comparison of the 14nm "Vega 10" is Vega 56 vs Navi. (I assume they used the 40 CU Navi for bigger numbers rather than the 36 CU Navi.)
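For what it's worth, taking the two endnote numbers at face value and assuming they come from the same run, the combined perf/watt uplift would be:

1.14 / (1 - 0.23) = 1.14 / 0.77 ≈ 1.48

i.e. roughly the ~1.5x perf/watt figure AMD has been quoting for RDNA.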
 
Still haven't seen anything regarding actual power consumption in any of the documentation...
Closest thing at this point would be the 180 W and 225 W TBP figures, which they did confirm. No hard numbers on actual consumption yet, AFAIK.
 
A brief look at the most notable changes inside Navi [CUs and caches], with a short explanation about backwards compatibility with GCN for console use.
 
Actually, I can confirm that this and the other slides posted by Komachi Ensaka are real and straight from AMD. They're from Travis Kirsch's and Don Woligroski's "Ryzen 3000 Series Deep Dive" presentation.



This is legit too, from Laura Smith's and Mithun Chandrasekhar's "Radeon 5700 Breakout" presentation.

edit: and the one posted by @DavidGraham is legit too; it's from the "Ryzen 3000 Series Deep Dive" presentation as well

edit2: read too quickly, @Ike Turner. You knew your slide was legit but were wondering about DavidGraham's. Oh well, no harm in double confirming yours.


Can you provide any context around the graphs in the second one? Are they showing a comparison of relative performance between the two systems listed, and if so, is there anything noteworthy about the tests?
 