AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

The Tech Report conclusion is incorrect. Nvidia had a faster culling rate prior to the tiled rasterizer.
If you look at PCGamesHardware, the results are the same as TechReport's: at culled polygon output there is no difference between gp102 and gp104.
 

"AMD did not respond to a request for comment prior to the writing and publication of this article."

And this is the part where things start to get out of hand. This is not a pre-release leak someone broke an NDA on, where a company gets to play coy and do the "we don't comment on unreleased hardware" hand-wave. This is an actual, retail product your customers paid you money for. Once that happens, and they raise issues with what you sold them, you don't get to climb up into your tree fort and maintain radio silence. "Why does this do what it does, and when/whether is it going to do something different?" are perfectly legitimate questions for paying retail card owners to demand answers to; if you can afford high-concept glossy marketing videos about changing the world, you can probably afford to provide some, you know, customer service to your actual customers.
 
Wait - is that the benchmark where Vega FE scores 114-ish at GamersNexus? And downclocked to Fury X level it's at 97-ish, while Fury X is at 70-ish? Don't you think something's fishy here?
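For scale, taking those rough scores at face value (my arithmetic, not GamersNexus'):

$$\frac{97}{70} \approx 1.39, \qquad \frac{114}{70} \approx 1.63$$

i.e. even at Fury X clocks it would be ~39% ahead in that test, and ~63% ahead at stock.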
Hi, first post.
Vega FE in gaming workloads at Fury X clocks seems to perform almost exactly like the Fury X.
In SPECviewperf, GamersNexus showed great improvements in FPS even when downclocked to match the Fury X.

There are rumors going around reddit and elsewhere, based on statements NVIDIA made about HBM2 power consumption, that the HBM2 might actually be consuming large amounts of power. But it seems HBM2 power consumption is under 40 W, based on Vega FE's memory VRM - an OnSemi NTMFD4C86N, which is only going to do about 25-30 amps @ 1.2-1.3 V.
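Back-of-the-envelope from those VRM figures:

$$P = IV: \quad 25\,\mathrm{A} \times 1.2\,\mathrm{V} = 30\,\mathrm{W}, \qquad 30\,\mathrm{A} \times 1.3\,\mathrm{V} = 39\,\mathrm{W}$$

so the memory rail tops out just under 40 W either way.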

I am not a programmer, but I did get *rekt* on reddit after looking at the Linux source code (which obviously doesn't include a lot of raster/render code) and making observations on it, after Rys' comments about the "Fiji" meme.
To me, it seems there are some sort of translation tables(?) within the source code for GFX9, where a lot of functions(?) have been renamed while retaining similar functionality and adding new functionality.
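Purely to illustrate what I mean by a "translation table" - a made-up sketch, not actual amdgpu code; every name and offset below is invented:

// Hypothetical sketch only: a per-generation table mapping a common logical
// register name to a GFX9-specific offset, so carried-over (renamed) entries
// keep their old role while new entries expose new functionality.
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>

struct RegMapping {
    uint32_t offset;     // invented GFX9 register offset
    bool     newInGfx9;  // true if there is no GFX8 equivalent
};

static const std::unordered_map<std::string, RegMapping> kGfx9RegMap = {
    {"GFX_RASTER_CFG",  {0x00D4, false}},  // renamed carry-over from GFX8
    {"GFX_BINNER_CNTL", {0x00F4, true}},   // new-in-GFX9 functionality
};

// Resolve a logical register name to the generation-specific offset.
uint32_t gfx9RegOffset(const std::string& name) {
    return kGfx9RegMap.at(name).offset;
}

int main() {
    std::printf("GFX_BINNER_CNTL @ 0x%04X\n",
                (unsigned)gfx9RegOffset("GFX_BINNER_CNTL"));
}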
 
Hi, first post.
Vega FE in gaming workloads at Fury X clocks seems to perform almost exactly like the Fury X.
In SPECviewperf, GamersNexus showed great improvements in FPS even when downclocked to match the Fury X.
Welcome!
And even that Maya score gets handily beaten by a meager GTX 1080, as the THG benchmark Tottentranz referred to shows. Hence: fishy.
 
Graphics and compute preemption as reported to DirectX is currently the same for Vega and Polaris: primitive/DMA-buffer boundary.
Note that this can change (and has changed) with driver revisions.
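For anyone who wants to check their own box: a minimal sketch of how that value surfaces through DXGI 1.2. It just prints whatever granularity the installed driver reports (the enum values correspond to "primitive boundary", "DMA-buffer boundary", etc.); error handling kept to a minimum.

// Query the preemption granularities each adapter's driver reports.
#include <dxgi1_2.h>
#include <wrl/client.h>
#include <cstdio>
#pragma comment(lib, "dxgi.lib")

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory)))) return 1;

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        ComPtr<IDXGIAdapter2> adapter2;
        if (SUCCEEDED(adapter.As(&adapter2))) {
            DXGI_ADAPTER_DESC2 desc = {};
            adapter2->GetDesc2(&desc);
            // e.g. DXGI_GRAPHICS_PREEMPTION_PRIMITIVE_BOUNDARY (1) and
            // DXGI_COMPUTE_PREEMPTION_DMA_BUFFER_BOUNDARY (0) for
            // Vega/Polaris per the post above.
            wprintf(L"%s: graphics=%d compute=%d\n", desc.Description,
                    (int)desc.GraphicsPreemptionGranularity,
                    (int)desc.ComputePreemptionGranularity);
        }
        adapter.Reset();
    }
    return 0;
}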
 
Is this where they said it wouldn't be available that week? Because, if so, he was talking about Computex.
lol, I think it might have been. I thought I checked my reference, but apparently I didn't. Here's hoping for RX Vega ordering after SIGGRAPH! I wonder if that means reviewers will be getting their cards before or during SIGGRAPH?
 
Ever since they've had distributed setup, to be exact (with the "PolyMorph Engine", starting with Fermi). (The tiled rasterizer wouldn't help with that in any case.)
FWIW, gp102 is a bit of an anomaly, as it shows no scaling over gp104 in the culled-polygon throughput test (which I think is what TechReport must have been using). The theoretical culled throughput is nominally just 1/3 tri per clock per SM, which suggests it hits another limit on gp102.
I suspect a global primitive distributor at the front of the pipe didn't scale. This likely fetches indices and forms primitives. Also, Nvidia has claimed 1/2 a tri per SM for some parts, so apparently it can be 1/3 or 1/2.
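To put rough numbers on that (taking a full GP104 at 20 SMs against a 28-SM GP102 part, and the nominal 1/3 tri per clock per SM):

$$\frac{20}{3} \approx 6.7\ \mathrm{tri/clk} \quad \mathrm{vs} \quad \frac{28}{3} \approx 9.3\ \mathrm{tri/clk}$$

That's a theoretical 1.4x gap, so if the measured culled rates come out essentially equal, something upstream of the SMs has to be the limiter.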

If you look at PCGamesHardware, the results are the same as TechReport's: at culled polygon output there is no difference between gp102 and gp104.
I wasn't commenting on the performance results. Only the conclusion about the tiled rasterizer being relevant.
 
I suspect a global primitive distributor at the front of the pipe didn't scale. This likely fetches indices and forms primitives. Also, Nvidia has claimed 1/2 a tri per SM for some parts, so apparently it can be 1/3 or 1/2.
Do you remember which ones had 2 cycles per VTF? I was only aware of one VTF every 3 cycles per SM.
 
Do you remember which ones had 2 cycles per VTF? I was only aware of one VTF every 3 cycles per SM.
I remember Kepler being 2-cycle and first-gen Maxwell (750 Ti) being 3-cycle. It seems to me that 2nd-gen Maxwell went back to 2-cycle. Without locking the clocks it's tough to know the clock rate in synthetics, and thus tough to estimate how many operations are performed per clock.
 
I remember Kepler being 2-cycle and first-gen Maxwell (750 Ti) being 3-cycle. It seems to me that 2nd-gen Maxwell went back to 2-cycle. Without locking the clocks it's tough to know the clock rate in synthetics, and thus tough to estimate how many operations are performed per clock.
I always thought Fermi was 1 tri every 4 cycles, Kepler 1 tri every 2, and Maxwell/Pascal 1 tri every 3. That said, I got that from what Damien wrote, and indeed starting with 2nd-gen Maxwell the chips seem to exceed that rate. The theoretical rate wasn't mentioned in the GTX 980 article, but there was no hint it would be different from first-gen Maxwell (OK, the marketing actually says Maxwell 1 is PolyMorph Engine 2.0, same as Kepler, which doesn't make much sense, whereas gm2xx is PolyMorph Engine 3.0, but I wouldn't give those marketing terms much credibility).
In any case, whatever the rate is, the important thing is really the near-perfect scaling with SM count (usually, with gp102 and to some extent gm200 being the exceptions).
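Put generally, the expected culled rate is just

$$T \approx N_{\mathrm{SM}} \times r \times f$$

where r is the per-SM rate (1/2 or 1/3 tri per clock) and f the actual clock. For example, a GTX 980 (16 SMs) at 1/3 tri/clk and an assumed ~1.2 GHz boost would land around 16/3 × 1.2 GHz ≈ 6.4 Gtris/s; the clock is exactly the part that's hard to pin down without locking it, as said above.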
 
I can't; chunks of the framework it uses are licensed, and the license I have doesn't let me do that. I'll start a new thread for it nearer the time.

Roger. I've been trying to find it for a few years, ever since I saw it used on TechReport's site, but could never find it anywhere, so now at least I know why :). Hope to see it in the future! :)
 

The card throttles on both temperature and power. At 1600 MHz it quickly reaches 375 W power consumption and throttles down to a lower clock to reduce power, which probably means it would need even more than that to sustain 1600 MHz for longer periods. Increasing the clocks also gives diminishing returns.

375 W at 1600? Holy...

I wouldn't pay much attention to the power draw. As he said, the voltage wasn't working properly, and because he had to put it at 50% to get the clocks to stick properly, his stock settings are faster and use way less power.

Stock (1440?): 6701 @ 235 W

Manual OC (1400): 6650 @ 346 W

That's 111 W more for a lower clock and a lower score. So clearly his OC settings are causing the massive power draw, not the card's normal operation.
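Running points-per-watt on those two results makes the same point:

$$\frac{6701}{235\,\mathrm{W}} \approx 28.5\ \mathrm{pts/W} \qquad \mathrm{vs} \qquad \frac{6650}{346\,\mathrm{W}} \approx 19.2\ \mathrm{pts/W}$$

The broken voltage control costs roughly a third of the card's efficiency at those settings.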
 