AMD: Navi Speculation, Rumours and Discussion [2019-2020]

Status
Not open for further replies.
I'm curious why people are still insisting "Navi is midrange". According to similar rumors, Vega was supposed to be "high end", but it covers the whole market from top to bottom (top of AMD anyway).
 
Because all of the rumors and leaks said that? For example, that Navi 10 will be around Vega 56 performance, but cheaper and using less power. The only time high end is talked about, it's Navi 20, but it seems it will be released later, if it's a real thing. And Vega didn't cover the bottom; the RX 580/590 is still a thing, no?
I would be glad to see a high-end Navi fighting the Nvidia 2080 Ti (or whatever Nvidia has in stock at the time), but I've read nothing telling me that's the goal with Navi.
 
I'm curious why people are still insisting "Navi is midrange". According to similar rumors, Vega was supposed to be "high end", but it covers the whole market from top to bottom (top of AMD anyway).
The "Vega rumors" were correct - we received the high-end attempt by Greenland aka VEGA 10, right? Midrange VEGA 11 got canceled and VEGA 12 got terribly delayed.

Now the first release based on Navi is rumored to be a midrange chip labeled NAVI 10. The second rumored chip is called NAVI 20. It's rumored to be a high-end attempt coming a few quarters after the first chip. Sure, there might or might not be a whole Navi-based lineup. That's beyond the rumors' scope.

Given the need for an early introduction of AMD's "Next Gen" architecture, a high-end Navi in 2020 seems doubtful.
 
Given the need for an early introduction of AMD's "Next Gen" architecture, a high-end Navi in 2020 seems doubtful.

NAVI 20 could be what Vega 20 was to Vega 10 ( FP64, xGMI, ECC, etc.) Still expecting 4SE and 64 CU though...
 
When is the uarch after Navi rumoured to be?
Although I'm sure all timelines have changed a fair bit by now.
When was Navi originally due?
 
I'm curious why people are still insisting "Navi is midrange". According to similar rumors, Vega was supposed to be "high end", but it covers the whole market from top to bottom (top of AMD anyway).

All rumors point to the first release or two being more midrange, with high end coming next year. It appears Navi is still a monolithic die, and bigger designs are getting harder and more expensive to make (thus every chip designer talking about chiplets being the future). So a smaller die this year, and waiting till the process and tools are better before making a big die, makes sense. Just look at how much they're charging for the 7nm ML cards.

We also don't really know what architecture Navi might be considered. The trend has pointed to another GCN update, with Arcturus being a major redesign for 2021. But GCN obviously doesn't include any sort of dedicated RT cores. Of course, just how much support for raytracing normal Navi has, versus the semi-custom PS5 chip coming next year (and presumably the Xbox whatever coming then too), isn't known.

Obviously you don't need dedicated RT cores for DXR compatibility; in fact both non-RTX Turing and Volta have native DXR support. It's just some extra hardware instructions to do with thread spawning and other such things. But the performance deficit in current "RTX!!!!" titles is quite large without dedicated raytracing hardware. Though it is much smaller in raytraced games that weren't built with RTX in mind (Claybook/Kingdom Come: Deliverance/etc.).
 
I noticed this story concerning some commits surrounding changes for GFX10 https://www.phoronix.com/scan.php?page=news_item&px=Open-Source-Navi-GFX1010.
Subsequent checks of GitHub show additional changes are wending their way through the process.

An un-ordered list of things I ran across in my browsing:
Primarily from https://github.com/llvm-mirror/llvm...3380939#diff-ad4812397731e1d4ff6992207b4d38fa with some references to some recent commits.
*note that some areas seem more preliminary than others and could change

There are a few feature flags for GFX10 that are interesting:
HasNoSdstCMPX - a change in how the cmpx variant of the comparison instructions works. In prior ISAs it would write to both the execution mask and a vector condition code (an aliased register mapped to 2 scalar registers but also buffered in some way on the vector side). If I'm interpreting the ISA document correctly, this would mean that the non-x variant writes VCC only, and the x variant writes only to EXEC rather than both. This may be part of the last feature flag in this group.
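
To make that reading concrete, here is a toy Python model of the two compare variants. It is only a sketch of the interpretation above, not real hardware semantics: the 64-lane wavefront width, the function names, and the `writes_sdst` switch are all assumptions made up for illustration.

```python
# Toy model of the HasNoSdstCMPX interpretation; not real hardware behaviour.
WAVE_SIZE = 64  # assumed wavefront width


def v_cmp_lt(a, b, exec_mask):
    """Non-x compare: produce a per-lane result mask (VCC); EXEC is untouched."""
    vcc = 0
    for lane in range(WAVE_SIZE):
        lane_active = (exec_mask >> lane) & 1
        if lane_active and a[lane] < b[lane]:
            vcc |= 1 << lane
    return vcc


def v_cmpx_lt(a, b, exec_mask, vcc, writes_sdst):
    """x compare: always updates EXEC.

    writes_sdst=True models the older behaviour (EXEC and VCC both written);
    writes_sdst=False models the assumed GFX10 behaviour (EXEC only).
    Returns (new_exec, new_vcc).
    """
    result = v_cmp_lt(a, b, exec_mask)
    new_vcc = result if writes_sdst else vcc
    return result, new_vcc
```

With `writes_sdst=False` the old VCC value survives the compare, which is the practical difference a compiler would have to account for under this interpretation.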

HasVscnt - there's a separate counter for stores. This may go with one of the ordered memory flags, or could be related to some of the new instruction encodings. I didn't see where exactly this was used or how, although perhaps it's related to why GFX10 has quadrupled the count for lgkmcnt. There may be a differentiation between writes and reads to memory, and the writes may be linked more closely to the export/system message path.
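
A rough sketch of why a separate store counter could matter, assuming (this is a guess, not something the commits spell out) that each counter tracks its own in-order queue of outstanding operations. The `MemCounters` class and its methods are hypothetical names for illustration only.

```python
from collections import deque


class MemCounters:
    """Toy model of outstanding-memory-op counters (assumed in-order retirement)."""

    def __init__(self, split_stores):
        # split_stores=True models a separate store counter (vscnt);
        # False models loads and stores sharing one counter (vmcnt).
        self.split_stores = split_stores
        self.queues = {"vmcnt": deque(), "vscnt": deque()}

    def issue(self, kind, tag):
        use_vscnt = kind == "store" and self.split_stores
        self.queues["vscnt" if use_vscnt else "vmcnt"].append(tag)

    def wait(self, counter, n):
        """Retire the oldest ops until at most n remain; return what was waited on."""
        queue = self.queues[counter]
        retired = []
        while len(queue) > n:
            retired.append(queue.popleft())
        return retired
```

In this model, waiting for a load with a shared counter also drains every older store, while with the split counter the store can stay in flight. Whether the real hardware uses the counter that way is not something I can confirm from the commits.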

HasRegisterBanking - unclear on what this may entail. This may mean GCN's traditional lack of concern about what register addresses are used at the same time has changed, and that certain patterns may cause stalls. Whether this is in some way related to any of the patents mentioned before isn't clear. For many of them, the register file already had banks of some sort and the addressing logic and wavefront cadence meant accesses didn't conflict regardless of physical banks being present.
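
As an illustration of the kind of stall pattern this could imply, here is a minimal sketch of a bank-conflict check. Everything in it is assumed: the bank count of 4, the modulo mapping from register number to bank, and the rule that a bank serves one operand per cycle. None of that comes from the commits.

```python
NUM_BANKS = 4  # assumed bank count, for illustration only


def bank_of(vgpr):
    """Assumed mapping: register index modulo the number of banks."""
    return vgpr % NUM_BANKS


def conflict_stalls(src_vgprs):
    """Extra read cycles if each bank can serve one distinct register per cycle.

    Reading the same register twice is treated as a single read.
    """
    per_bank = {}
    for reg in set(src_vgprs):
        bank = bank_of(reg)
        per_bank[bank] = per_bank.get(bank, 0) + 1
    return max(per_bank.values(), default=1) - 1
```

Under these assumptions, sources v0, v1, v2 read without conflict, while v0, v4, v8 all land in the same bank and cost two extra cycles. A register allocator aware of banking would try to avoid the second pattern.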

HasVOP3Literal - GCN's ISA gave the option for a 32-bit literal after the shorter vector encodings, while the longer 3-operand VOP3 instructions did not. Presumably, GFX10 has altered the pattern for instruction+immediate so that the 64-bit VOP3 operations can have an immediate with a length I didn't see specified.
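
A small sketch of what that means for instruction size, using the encoding widths mentioned above (32-bit short vector encodings, 64-bit VOP3, 32-bit literal after the short forms). The VOP3 literal length is left as a caller-supplied parameter because I didn't see it specified; the function name and signature are hypothetical.

```python
# Base encoding widths in bits for the vector encodings discussed above.
BASE_BITS = {"VOP1": 32, "VOP2": 32, "VOPC": 32, "VOP3": 64}
SHORT_LITERAL_BITS = 32  # literal size after the short encodings


def inst_bits(encoding, has_literal, has_vop3_literal, vop3_literal_bits):
    """Total instruction size in bits.

    vop3_literal_bits is an assumption supplied by the caller; the real
    length was not visible in the commits.
    """
    base = BASE_BITS[encoding]
    if not has_literal:
        return base
    if encoding == "VOP3":
        if not has_vop3_literal:
            raise ValueError("VOP3 cannot carry a literal without HasVOP3Literal")
        return base + vop3_literal_bits
    return base + SHORT_LITERAL_BITS
```

If the VOP3 literal turns out to be 32 bits as well, a VOP3 instruction with a literal would be 96 bits, longer than anything in the older vector encodings.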

HasNoDataDepHazard - there's a string stating "Does not need SW waitstates". There are still wait counts, and probably even more of those. This may refer to the ISA doc section on required NOPs for various places where the pipeline doesn't check whether a result has been written back to a register or does not forward results immediately. This seems to indicate that many of these special cases are now being caught or have been made unnecessary. That could be from better interlocking of the pipeline or reduced forwarding latency in some places. That doesn't mean there aren't still wait states, as there are new architectural hazard flags for non-data dependencies that appear to map to some of the existing set. As for why, it might be from a revamped implementation of the pipeline, or could have been precipitated by some other concern like new instructions, or concern over the increasing size of the set of hazards. Another random thought is that the wait states needed, and their lengths, aren't quite in sync between Sea Islands--used by the consoles--and later GCN architectures. A backwards-compatible architecture might try to mold its latencies to match, or find a way to forward/stall where hazards could arise.
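
For anyone unfamiliar with software wait states, this is roughly what the compiler has to do when the hardware doesn't interlock. The sketch is heavily simplified: real hazards apply to specific instruction pairs with specific gaps, whereas this assumes one uniform `required_gap` for every read-after-write. The function and the tuple format are made up for illustration.

```python
def insert_nops(program, required_gap, hw_interlocks):
    """Pad a program with s_nop so consumers sit far enough behind producers.

    program: list of (name, dst_register_or_None, [source_registers]).
    required_gap: instructions that must separate a write from a dependent read
                  (a uniform value is assumed here purely for simplicity).
    hw_interlocks: True models hardware that handles the hazard itself.
    """
    if hw_interlocks:
        return [name for name, _, _ in program]

    out = []
    last_write = {}  # register -> position in `out` of its most recent writer
    for name, dst, srcs in program:
        for src in srcs:
            if src in last_write:
                gap = len(out) - last_write[src] - 1
                out.extend(["s_nop"] * max(0, required_gap - gap))
        if dst is not None:
            last_write[dst] = len(out)
        out.append(name)
    return out
```

With `hw_interlocks=True` the instruction stream comes back unchanged, which is what "Does not need SW waitstates" would amount to for data dependencies if the flag means what it appears to mean.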


A few omissions or reappearances of features are also interesting.
GFX10 marks the return of a flag indicating there's no SRAM ECC, which seems expected for a gaming architecture.
A minor reintroduction is FeatureMIMG_R128, which is a bit used for texture resource formats that Vega removed.

A potentially larger omission is the lack of FeatureGCN3Encoding for GFX10. I have seen discussion in various fora that Navi is a repudiation of Vega, and that it's a return to Polaris or something like that. However, the lack of the GCN3 encoding flag (and I reviewed some of the opcodes listed in later updates) makes it seem like a significant number of opcodes have been changed to match the console-generation instructions, if they were present at the time. That means encodings from before Polaris, Tonga, and Fiji. Architectural advances since Sea Islands appear to still be present, such as the various parallel and packed extensions and scalar memory operations. There are also references to Primitive Order Pixel Shading (POPS) in other scalar ISA commits that were in the Vega ISA doc, message types from Vega not supported by other GPUs, and some things like DLI instructions from Vega 20.*
*One possible caveat: I am not sure whether there's more to interpret from the decision to move the scalar operation flags and others into a separate sub-version 10.1 versus the overall GFX10 set. That may mean some variation of Navi could be missing one or more of these operations, and the lack of the scalar ops would be more like the consoles--though it might be more of a regression than some of the more niche flags.

So GFX10 appears to be a mix: it brings back some operations in a way that might align it with the consoles, while still having more recent or new features from GFX8 and GFX9. Some changes, like the HasNoSdstCMPX change, might be places where Navi deviates from both the console and PC space.



Other items:
There's some sort of NSA encoding for image (texture) instructions. It's mentioned alongside MIMG instructions in a bug flag for GFX10.
There's apparently an S_INST_PREFETCH instruction that will freeze a shader, which seems odd to document as an available instruction if it's that bugged.
There's a speed model for GFX10 that seems to point to some extra latency in the pipeline. The model doesn't divide cycles by 4 due to the cadence, although many of the high numbers would be more consistent with prior GCN if they were divided. There's a comment about an extra cycle for vector register reads that might be throwing off a clean division.
 
This open source activity finally confirms that GFX10/Navi really is going to launch some time this summer/fall. Not like those baseless March-launch speculations...
 
I think we have more questions than answers after seeing that.

Why would they use 256-bit GDDR6 for a 250-300W GPU?
Seems more likely that they will reduce pins before launch down to ~200W.
 
Navi is Navi, and the one in consoles definitely won't be the smaller one.
So 250W console GPU it is.
Errr, what? Navi is an architecture; there will be several different implementations of it, and one pulling x watts doesn't mean squat for the rest.
It remains to be seen whether consoles will use monolithic chips or chiplets, but either way their implementations of Navi have nothing to do with any desktop chip other than the architecture.
 