AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

It's like you're trying to compare 25 TFLOPS of 16-bit precision to Volta's 32-bit FLOPS to prove a point. You just don't do that; they're two totally different metrics with no correlation to each other.
Those are the FP16 rates that are correlated. If applications are making heavy use of FP16, it would be foolhardy to think the device with double the theoretical figures wouldn't be faster. And I was comparing to Pascal, but it may apply to Volta. As for current comparisons, Polaris looks to be doing just fine and pulling ahead in newer titles as expected.
 
Those are the FP16 rates that are correlated. If applications are making heavy use of FP16, it would be foolhardy to think the device with double the theoretical figures wouldn't be faster. And I was comparing to Pascal, but it may apply to Volta. As for current comparisons, Polaris looks to be doing just fine and pulling ahead in newer titles as expected.


Really, against a card that has 30% fewer TFLOPS and uses 30% less power (and which isn't even remotely close to the highest perf/watt card in the lineup)? You call that equal? And no, I haven't seen Polaris pull ahead in newer games; the games it does pull ahead in are only games sponsored by AMD. Coincidence? Not really, we see this all the time.

Yeah, just like the RX 580, which gets what, 5% more performance than the RX 480 but uses a boatload more power...

And no, those FP16 numbers don't correlate to Volta's FP32. Why not put Volta's FP16 TFLOPS against Vega's FP16 FLOPS? What, 120 TFLOPS of FP16 too much for Vega to handle? Do you think Volta will be able to use those 120 TFLOPS outside of tensor operations? I don't. So let's take Volta's 30 TFLOPS of 16-bit performance and compare it to Vega's 25 TFLOPS. Those are comparable, and Volta still has the lead. Then factor in CUDA-based software, which has far more optimized extensions than ROCm at this point. That leaves Vega in the dust for DL. CUDA is why Pascal can do what it does in DL, so until ROCm implements similar optimizations in its APIs and libraries, AMD will always be second fiddle. And at that point they need to get people (colleges and industry professionals) to start learning to code with ROCm; then they have a chance to do something, but that is many years down the road.

So why do you think AMD hasn't talked about their tensor-related FLOPS yet? Yeah, if they had a number higher than 25 TFLOPS I'm sure they would be shouting it out.
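For reference, a back-of-the-envelope sketch of where those headline figures come from (the unit counts are the published ones, but the clock speeds here are assumptions, so treat the results as approximate):

    # Rough theoretical throughput; clocks are assumed, not official specs.
    def tflops(units, clock_ghz, flops_per_clock):
        return units * clock_ghz * flops_per_clock / 1000.0

    vega10_fp32 = tflops(4096, 1.5, 2)     # FMA = 2 FLOPS/clock        -> ~12.3 TFLOPS
    vega10_fp16 = tflops(4096, 1.5, 4)     # packed FP16 doubles that   -> ~24.6 TFLOPS ("25")
    v100_fp32   = tflops(5120, 1.455, 2)   #                            -> ~14.9 TFLOPS
    v100_fp16   = tflops(5120, 1.455, 4)   # 2x the FP32 rate           -> ~29.8 TFLOPS ("30")
    v100_tensor = tflops(640, 1.455, 128)  # 640 tensor cores, 64 FMAs each -> ~119 TFLOPS ("120")

The 120 TFLOPS number only applies to the 4x4 matrix FMAs the tensor cores run, which is why it isn't directly comparable to general packed-FP16 throughput.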
 
It's a giant all-in-one device that mounts to a wall. Is it really a surprise a mobile part doesn't run as fast as a server part?


It comes back to what Raja stated: Vega FE will have the most compute performance of all Vegas.

That's not outside the box; it's how past devices worked, with slight modifications. What's strange is you assuming AMD would release two different new architectures at the same time, then dismissing the marketing materials calling out Volta, similar release dates, display support, memory models, etc.

Right, and AMD came up with this idea to take on the TPU market, where they have no DL presence to begin with, in a market that would be better suited to their architecture.

Businesses don't work on myth. Either you have it or you don't, and if you have it, can it be used out of the box? Without the software it's not usable, period; no business will use AMD products unless they have money to waste getting them functional enough to compete. Guess what, ROI just got thrown in the toilet and flushed repeatedly. Just imagine spending a year of dev time to get AMD products up to speed: what is the loss from the inception of such a project versus the cost of getting nV products that work out of the box? It's not just the cost of the development time; in that entire time they could have had things up and running on competitors' products bringing in money, instead of wasting that time. So double, possibly triple or quadruple?

So AMD totally ignoring Vega's tensor capabilities is a pretty big statement.
 
https://videocardz.com/newz/amd-releases-vega-gpu-die-shot

AMD-Vega-Die-Shot-1000x998.jpg
 
That iMac Pro has limited space, and thus limited cooling, and it has to house an 18-core Xeon, Vega and all the rest of the stuff and power it all with just a 500W PSU. Of course they have to cut somewhere.
 
Err, this is not a die shot but more like an artistic representation of the die, like Nvidia does?
Artistic representation on top of a die shot from the looks of it. Obscuring the actual logic underneath.

And no, those FP16 numbers don't correlate to Volta's FP32. Why not put Volta's FP16 TFLOPS against Vega's FP16 FLOPS? What, 120 TFLOPS of FP16 too much for Vega to handle? Do you think Volta will be able to use those 120 TFLOPS outside of tensor operations? I don't. So let's take Volta's 30 TFLOPS of 16-bit performance and compare it to Vega's 25 TFLOPS. Those are comparable, and Volta still has the lead. Then factor in CUDA-based software, which has far more optimized extensions than ROCm at this point. That leaves Vega in the dust for DL. CUDA is why Pascal can do what it does in DL, so until ROCm implements similar optimizations in its APIs and libraries, AMD will always be second fiddle. And at that point they need to get people (colleges and industry professionals) to start learning to code with ROCm; then they have a chance to do something, but that is many years down the road.

So why do you think AMD hasn't talked about their tensor-related FLOPS yet? Yeah, if they had a number higher than 25 TFLOPS I'm sure they would be shouting it out.
Nvidia didn't even list FP16 rates for Volta that I've seen, so the assumption is they don't support the packed math like prior architectures and Vega. Only Tensor ops with very limited functionality. Google's TPU only supported 5-6 instructions as I recall. So Vega's FP16 and Volta's FP32/16 rates would be comparable until that changes. The rest of your post is just more of your usual mental gymnastics. I'm not sure why you're so caught up on colleges needing to teach ROCm. The end result will likely be accelerating all applicable languages through LLVM, and it seems reasonable that languages other than CUDA will hold a larger combined marketshare for some time. The real difference is that AMD won't be reliant on their own programming language to make their products work.
 
Err, this is not a die shot but more like an artistic representation of the die, like Nvidia does?
Yeah obviously not a die shot, not sure why videocardz labelled it as such. Likely for search keyword capturing and clickbaiting.
 


The thing that I found interesting was VCZ pointing out that the "GPA022GA2656" coding on the memory stacks doesn't use typical SK Hynix coding. It's probably made up by the artist that did this drawing, but...

Recalling SK Hynix's dimensions, I can't help but investigate.

hbm2_mechanical.png


SK Hynix's HBM2 stacks are 7.75x11.87 mm, a 1.531x ratio of length to width. From what I understand, only the stack height is regulated by the HBM spec, so those dimensions (and their ratio) are probably unique.

A sloppy crop job yields:

258x162 px => 1.593x
x0isjhJ.png


279x179 px => 1.559x
LkWd3i2.png


I don't know which is more/less appropriate, so I did both.

There's an obvious potential for measurement error, but the latter instance would have to be off by several pixels (on both dimensions) to hit 1.531x.
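As a quick sanity check on that, a sketch using the crop pixel counts above (the crops themselves are obviously the main source of error):

    # How far off the cropped measurements would have to be to match
    # SK Hynix's 11.87 x 7.75 mm HBM2 footprint (ratio ~1.531).
    target = 11.87 / 7.75                  # ~1.5316

    for w, h in [(258, 162), (279, 179)]:
        ratio = w / h
        needed_h = w / target              # height (at measured width) that hits the target exactly
        print(f"{w}x{h}: ratio {ratio:.3f}, target needs ~{needed_h:.1f} px ({needed_h - h:+.1f} px)")

That gives roughly +6 px for the first crop and +3 px for the second (or some split between width and height), which is why the second crop is the borderline case.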

Probably not a bad time to start lining up stacks against the edges of the Vega 10 die, but I'm not sure how telling that would be since it's just an artist's rendition.
 
They are going with Intel because of Thunderbolt 3. These Xeons will have lower single thread performance than most desktop Ryzen or Sky-X CPUs, mainly due to lower clock speed.

Does TB3 really need CPU support though? Couldn't they just make a TB3 controller out of a number of PCIe lanes?
 
I've got a die shot of Navi right here:

2zfjpr9.jpg

I didn't know the people from the leaker (distinct from leakage) community posted here on b3d. You will become famous for this one!

On a more serious note, it looks like Vega 10 will be cut with the same ratios as Fiji (1/8th disabled).

How soon are these new macs releasing?
 
Does TB3 really need CPU support though? Couldn't they just make a TB3 controller out of a number of PCIe lanes?
No, you can't make a TB3 controller out of PCIe lanes since it's not PCIe, but yes, they could support TB3 on Ryzen if they used a separate Intel controller for it on the motherboard. Next year others can actually make their own controllers without royalties or licensing fees, too.

How soon are these new macs releasing?
December (probably due to the 18-core CPU option; the consumer version of said chip isn't necessarily coming even this year, as an ASUS rep posted on their ROG forums that the 18-core isn't coming 'til next year and later edited it to "later this year").
 
On a more serious note, it looks like Vega 10 will be cut with the same ratios as Fiji (1/8th disabled).

How soon are these new macs releasing?
iMac Pro will release in December this year.

Nvidia didn't even list FP16 rates for Volta that I've seen, so the assumption is they don't support the packed math like prior architectures and Vega.
Yes they did.
Same capabilities as GP100, so 30 TFLOPS for the Tesla V100
 