I think that it is clear that the first Navi products won't have the VRAM, bandwidth or compute power of a Radeon VII. To me, Navi has always sounded like what Polaris was to Fiji (and later Vega): aimed at the low/mid segment of the market (and consoles).
"[Navi] will be positioned below current Radeon 7 at least in terms of price point" - Lisa Su during the earnings call.
She doesn't directly say it will perform worse than the Radeon 7. The tone heavily points at it being less powerful, but it leaves room for speculation that it could be more powerful yet priced lower. Otherwise, she does say Q3 launch and that the pipeline is good.
I think that it is clear that the first Navi products won't have the VRAM, Bandwidth or compute power of a Radeon VII.
Well, obviously these aren't HPC parts, for sure.
6nm is barely entering risk production; by the time it goes volume, Navi will be no more.
Don’t really see how you figure that. Polaris was introduced to the market three years ago, and will likely be produced for a while yet.
If Navi is introduced this autumn, a low-cost mid-life refresh could very well be in the cards. But without knowledge of either AMD's product plans or the future competitive landscape, any speculation that is years out is on thin ice.
Arcturus is not an architecture, it’s a design instance (like Vega 20, e.g.).
Nah, it's compatible with tapeouts of 7nm. So assuming Navi and Zen 2 are on 7nm and not 7nm+ (any word on that?) then a switch to 6nm next year before Arcturus is an easy bet, at least for the GPUs/CPUs launching this year.
In fact that's probably why TSMC would make it at all. A "hey come use our sorta 7nm+ node without having to re-tapeout your entire chip!" Mild improvements like this have become increasingly common as the ability to shrink silicon has ground to a halt. How many versions of the last node did each foundry have? I know it's been at least 3 for TSMC and Samsung, and like 5 for Intel (trololol)
Going by the super-SIMD patent, operands are gathered from a bank over several clock cycles and stored in buffers ahead of the ALUs. A single row would collect each source register once per cycle, and then move down the ALU pipeline. That prevents a bank conflict occurring within a single FMA instruction, and a significant point of the patent was to utilize wasted register access cycles for instructions that didn't consume as many operands as a 3-operand FMA by allowing a simpler operation to borrow the unused cycles. At least within the scope of that method, the existing access method could permit conflict-free access.
You're right. Do$ is actually very similar to nVidia's operand collector, especially the LRF in the operand collector. It can solve a large part of the bank conflict problems, but they can't be completely avoided. So I think compilers still need to care about this.
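As a rough illustration of the bank-conflict point, here is a toy C++ model. Everything in it is an assumption made for illustration: the four-bank register file, the `reg % 4` bank mapping, and the names `bankOf`, `fmaReadCycles` and `spareReadSlots` are hypothetical and not taken from the patent or from any AMD or Nvidia documentation. It contrasts reading all three FMA sources at once (where same-bank sources serialize) with gathering one source per cycle into a buffer, which leaves spare read slots for instructions with fewer operands.

```cpp
// Toy model only: bank count and mapping are assumptions, not hardware facts.
constexpr unsigned kNumBanks = 4;      // assumed number of register banks
constexpr unsigned kGatherCycles = 3;  // one source per cycle for a 3-operand FMA

// Assumed mapping: consecutive registers rotate through the banks.
constexpr unsigned bankOf(unsigned reg) { return reg % kNumBanks; }

// Cycles to read three sources when every bank serves one read per cycle
// and all reads are attempted at once: the busiest bank sets the cost.
constexpr unsigned fmaReadCycles(unsigned a, unsigned b, unsigned c) {
  return (bankOf(a) == bankOf(b) && bankOf(b) == bankOf(c)) ? 3u
       : (bankOf(a) == bankOf(b) || bankOf(b) == bankOf(c) ||
          bankOf(a) == bankOf(c)) ? 2u
       : 1u;
}

// With buffered gathering (one source per cycle), an instruction with fewer
// than three sources leaves this many read slots for another operation.
constexpr unsigned spareReadSlots(unsigned numSources) {
  return numSources >= kGatherCycles ? 0u : kGatherCycles - numSources;
}
```

In this model `fmaReadCycles(0, 4, 8)` costs three cycles because all three registers land in bank 0, which is the kind of allocation a compiler would still want to avoid, while a two-operand instruction leaves one read slot free for a simpler operation to borrow.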
Otherwise, she does say Q3 launch and that the pipeline is good.
Regardless of what she says about the pipeline, Navi is already ridiculously late, and by Q3 it will be later still.
Nvidia had opened a flank with the RTX series, yet here's another lost opportunity from RTG.
what
My guess is they had an opportunity to launch a better card, performance/price-wise, in classic rasterisation, and they didn't?
What makes you think Navi would magically stop being better perf/$ if launched one month later?
Nvidia had opened a flank with the RTX series, yet here's another lost opportunity from RTG.
The RTX series allows them to sell such a weird product as the Radeon VII. This wouldn't be viable without the contribution of the current RTX line.
I'm not talking about 1 month, I'm talking about the fact that they had/have nothing against RTX.
Current RX offer is quite good when it comes to perf/$. If you mean they don't have dedicated RTRT hardware, that's no big issue in my opinion. At this moment RTX is just hype, much worse than VR was in the Maxwell/Fiji era (4 years ago).
I suppose this means a bigger Navi card, and probably new console chips, will be on 6nm next year.
It's entering risk Q1'20.
There's apparently an S_INST_PREFETCH instruction that will freeze a shader, which seems odd to document as an available instruction if it's that bugged.
// Pre-GFX10 target did not benefit from loop alignment
if (!ML || DisableLoopAlignment ||
    (getSubtarget()->getGeneration() < AMDGPUSubtarget::GFX10) ||
    getSubtarget()->hasInstFwdPrefetchBug())
  return PrefAlign;
// On GFX10 I$ is 4 x 64 bytes cache lines.
// By default prefetcher keeps one cache line behind and reads two ahead.
// We can modify it with S_INST_PREFETCH for larger loops to have two lines
// behind and one ahead.
// Therefore we can benefit from aligning loop headers if loop fits 192 bytes.
// If loop fits 64 bytes it always spans no more than two cache lines and
// does not need an alignment.
// Else if loop is less or equal 128 bytes we do not need to modify prefetch,
// Else if loop is less or equal 192 bytes we need two lines behind.
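The size thresholds in that comment can be read as a small decision table. The sketch below is only my reading of the quoted comment, not the actual LLVM code: the enum, its value names, and `planForLoop` are hypothetical, and the handling of loops larger than 192 bytes is an assumption based on the "if loop fits 192 bytes" wording.

```cpp
#include <cstdint>

// Hypothetical names; only the 64-byte line size and the 64/128/192-byte
// thresholds come from the quoted comment.
enum class LoopPrefetchPlan {
  NoAlignment,             // fits one line's worth of bytes: nothing to do
  AlignOnly,               // align the loop header, keep default prefetch
  AlignAndTwoLinesBehind,  // align and switch prefetch to two lines behind
  NoBenefit                // assumed: larger loops gain nothing from this
};

constexpr std::uint32_t kCacheLineBytes = 64;  // GFX10 I$ line size per the comment

// Maps a loop's code size in bytes to the action the comment describes.
constexpr LoopPrefetchPlan planForLoop(std::uint32_t loopBytes) {
  return loopBytes <= kCacheLineBytes       ? LoopPrefetchPlan::NoAlignment
       : loopBytes <= 2 * kCacheLineBytes   ? LoopPrefetchPlan::AlignOnly
       : loopBytes <= 3 * kCacheLineBytes   ? LoopPrefetchPlan::AlignAndTwoLinesBehind
       : LoopPrefetchPlan::NoBenefit;
}
```

Note that the early return in the quoted snippet skips all of this on parts where `hasInstFwdPrefetchBug()` is true, which fits the earlier remark about S_INST_PREFETCH freezing a shader.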