Abysmal? At worst it lost just a tad, and even then probably mostly due to lower memory bandwidth (no, they didn't compensate for that); at best it beat Fury X silly. I'm not sure how that's "abysmal".
> You can't cull generically before tessellation. You don't have any triangles until after DS.

You could if you established an approximation of an object and quickly transformed that, culling entire patches, strips, etc. at a coarse level if they were nowhere close to being in the scene. The application should do that, but better to assume otherwise, as that would be a huge win.
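As a rough sketch of the coarse, application-side culling being described here: test a cheap bounding-sphere approximation of each patch against the frustum planes and only submit survivors for tessellation. The sphere test, plane values, and sizes below are illustrative assumptions, nothing Vega-specific.

```cpp
// Illustrative sketch only (not AMD's pipeline): coarse, application-side culling of
// tessellation patches against the view frustum using a cheap bounding-sphere
// approximation, before any domain-shader work produces real triangles.
#include <array>
#include <cstdio>
#include <vector>

struct Vec4 { float x, y, z, w; };    // plane: ax + by + cz + d, stored as (a,b,c,d)
struct Sphere { float x, y, z, r; };  // world-space bounding sphere of one patch

// Signed distance from sphere centre to plane; below -r means fully outside.
static bool outsidePlane(const Sphere& s, const Vec4& p) {
    float d = p.x * s.x + p.y * s.y + p.z * s.z + p.w;
    return d < -s.r;
}

// Keep a patch only if its bounding sphere is not completely outside any frustum plane.
static std::vector<int> cullPatches(const std::vector<Sphere>& patches,
                                    const std::array<Vec4, 6>& frustum) {
    std::vector<int> visible;
    for (int i = 0; i < static_cast<int>(patches.size()); ++i) {
        bool culled = false;
        for (const Vec4& plane : frustum) {
            if (outsidePlane(patches[i], plane)) { culled = true; break; }
        }
        if (!culled) visible.push_back(i);  // only these patches get tessellated
    }
    return visible;
}

int main() {
    // Hypothetical frustum: an axis-aligned box from -10..10 on each axis for simplicity.
    std::array<Vec4, 6> frustum = {{
        { 1, 0, 0, 10 }, { -1, 0, 0, 10 },
        { 0, 1, 0, 10 }, { 0, -1, 0, 10 },
        { 0, 0, 1, 10 }, { 0, 0, -1, 10 },
    }};
    std::vector<Sphere> patches = { {0, 0, 0, 1}, {50, 0, 0, 1}, {9, 9, 9, 2} };
    for (int i : cullPatches(patches, frustum)) std::printf("patch %d survives\n", i);
}
```

Anything that fails this cheap test never generates tessellation work at all, which is the "huge win" referred to above.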
> It would have been cool if Vega were an amazing product that forced NV to bring GP100 to us. Ryzen vs Intel style apocalyptic stuff. Maybe if it had been a bit faster than GP102.

Always the x2 with Infinity! Would actually leave Vega in its comfort zone. Besides, why even bother with all the PCIe lanes on Threadripper if you don't intend to fill them with GPUs? Or 2P Epyc with a stupid configuration.
Anyone know of anyone that tested RX Vega vs Fury X at same clock speed? I know Gamersnexus did the test with Vega FE (and those results were abysmal).
Computerbase did: 6% improvement. They also tested with HBCC on and off; no improvement on average:
https://www.computerbase.de/2017-08/radeon-rx-vega-64-56-test/7/
The card looks to be bandwidth starved; most people are recommending to leave the core alone and just go for the memory overclocking. The MSAA performance is also bad, which seems to be the reason why AMD is ahead in a game at one site while behind in another.
> Some people are getting 19xx on core overclocks, not sure if a bug or not.
> edit - Almost 2Ghz,

From FE overclocking, those were a fluke, as the card starts inserting NOPs and otherwise idling. So clocks go higher and performance decreases. Have to actually test something to confirm the overclock is productive. Pascal does something similar.
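A minimal sketch of that "actually test something" advice: run the same repeatable workload at each clock step and flag steps where the reported clock rises but measured throughput stops scaling. All numbers below are invented for illustration, and the threshold is an arbitrary assumption.

```cpp
// Sketch of verifying an overclock is productive: step the core clock, run the same
// fixed workload each time, and flag steps where the reported clock rises but the
// measured throughput barely moves (e.g. the card quietly idling / inserting NOPs).
#include <cstdio>
#include <vector>

struct Sample { int coreMhz; double fps; };  // one benchmark run at a given clock

static void checkScaling(const std::vector<Sample>& runs) {
    for (size_t i = 1; i < runs.size(); ++i) {
        double clockGain = double(runs[i].coreMhz) / runs[i - 1].coreMhz - 1.0;
        double perfGain  = runs[i].fps / runs[i - 1].fps - 1.0;
        bool productive  = perfGain > 0.25 * clockGain;  // arbitrary sanity threshold
        std::printf("%4d -> %4d MHz: clock %+5.1f%%, perf %+5.1f%%  %s\n",
                    runs[i - 1].coreMhz, runs[i].coreMhz,
                    clockGain * 100.0, perfGain * 100.0,
                    productive ? "" : "<- suspicious, clock likely not real");
    }
}

int main() {
    // Made-up numbers purely for illustration: scaling collapses past 1700 MHz.
    checkScaling({ {1500, 90.0}, {1600, 95.5}, {1700, 100.8}, {1900, 101.0} });
}
```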
> It might reflect a change in the burden it places on other parts of the GPU besides the memory bus, like the tiling method, L2, or perhaps how optimistic the primitive shader can be for culling.

I'm guessing it messed up the tile size calculation: a setting that could be off even without AA adding samples, and so spilled to RAM.
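To make the spill argument concrete, a back-of-envelope sketch with made-up tile dimensions and on-chip budget (not Vega's real numbers): if the binning logic sizes tiles assuming one sample per pixel, MSAA multiplies the per-pixel footprint and the chosen tile no longer fits on chip.

```cpp
// Back-of-envelope illustration (numbers are invented, not Vega's real tile or cache
// sizes): a tile sized for 1 sample per pixel overflows the on-chip budget once MSAA
// multiplies the per-pixel footprint, so the work spills to DRAM instead.
#include <cstdio>

int main() {
    const int tileW = 32, tileH = 32;              // hypothetical tile dimensions
    const int bytesPerSample = 4 /*color*/ + 4 /*depth*/;
    const int onChipBudget = 16 * 1024;            // hypothetical per-tile on-chip budget

    for (int samples : {1, 2, 4, 8}) {
        int bytes = tileW * tileH * bytesPerSample * samples;
        std::printf("%dx MSAA: tile needs %6d bytes -> %s\n",
                    samples, bytes, bytes <= onChipBudget ? "fits" : "spills to memory");
    }
}
```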
> From FE overclocking those were a fluke as the card starts inserting NOPs and otherwise idling. So clocks go higher and performance decreases. Have to actually test something to confirm the overclock is productive. Pascal does something similar.

Source?
> Some people are getting 19xx on core overclocks, not sure if a bug or not.
> edit - Almost 2Ghz,
> http://www.3dmark.com/fs/13372488

LN2? TDP = 91w
> Are we sure that it's bandwidth starved? Because the GN Vega 56 review suggests otherwise. They get a 12% increase from the power limit increase. Then they overclock HBM by nearly 20%, which results in a pathetic 3.6% improvement. That's not memory bandwidth bound.

If the card is running at the limit of performance due to power-related throttling, then increased bandwidth should make nearly no difference to performance.
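For what it's worth, a quick sanity check on the GN figures quoted above, treating "sensitivity" simply as performance gain divided by clock gain (the interpretation threshold is my own rough assumption, not anything from the review):

```cpp
// Rough reasoning check using the figures quoted in the post above.
#include <cstdio>

int main() {
    const double hbmOverclock = 0.20;   // ~20% memory overclock
    const double perfGain     = 0.036;  // ~3.6% performance gain reported
    const double sensitivity  = perfGain / hbmOverclock;
    // ~0.18: performance follows memory clock at roughly a fifth of the rate. A card
    // that was truly bandwidth bound would be expected to scale much closer to 1.0,
    // unless something else (like a power limit) is capping it first.
    std::printf("bandwidth sensitivity ~= %.2f\n", sensitivity);
}
```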
> What I'm saying is even a vertex shader gets turned into a primitive shader by the driver, therefore primitive shaders are almost always used. Unless you find a way to skip it.

In Vega, running the primitive shader is not required. You can still run a VS like with previous architectures.
> Does Vega still have the primitive discard accelerator introduced in Polaris, or would the primitive shaders (in theory) make that obsolete?

They coexist, though a primitive shader can make the culling further down the pipeline unnecessary, depending on whether the algorithms match exactly.
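For anyone following along, this is the sort of per-triangle test either stage could perform before rasterization. It is a generic textbook-style sketch, not AMD's actual discard logic or primitive shader code.

```cpp
// Generic per-triangle culling tests of the kind either a fixed-function discard stage
// or a driver-generated "primitive shader" could run before rasterization. Purely
// illustrative; not AMD's actual algorithm.
#include <cstdio>

struct V2 { float x, y; };  // vertex already projected to NDC space, for simplicity

// Twice the signed area; zero means degenerate, sign gives winding (back-face).
static float signedArea2(V2 a, V2 b, V2 c) {
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

static bool cullTriangle(V2 a, V2 b, V2 c) {
    float area2 = signedArea2(a, b, c);
    if (area2 == 0.0f) return true;   // zero-area / degenerate
    if (area2 < 0.0f)  return true;   // back-facing (counter-clockwise = front here)
    // Trivially outside one edge of the [-1,1] viewport on the same side -> cull.
    auto allOutside = [&](auto pred) { return pred(a) && pred(b) && pred(c); };
    if (allOutside([](V2 v) { return v.x < -1.0f; })) return true;
    if (allOutside([](V2 v) { return v.x >  1.0f; })) return true;
    if (allOutside([](V2 v) { return v.y < -1.0f; })) return true;
    if (allOutside([](V2 v) { return v.y >  1.0f; })) return true;
    return false;                     // survives; send it on to the rasterizer
}

int main() {
    std::printf("%d\n", cullTriangle({0, 0}, {1, 0}, {0, 1}));  // front-facing, visible -> 0
    std::printf("%d\n", cullTriangle({0, 0}, {0, 1}, {1, 0}));  // back-facing -> 1
    std::printf("%d\n", cullTriangle({2, 2}, {3, 2}, {2, 3}));  // fully right of viewport -> 1
}
```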
I am aware of Rys' tweet, but I also see the footnotes in the architecture whitepaper and Computerbase's assertion.
I'm taking that to mean AMD ultimately makes everything a primitive shader for the purpose of driver-side optimizations, even if it is only performing the standard pipeline functions. Therefore primitive shaders are always enabled. I don't recall Mantor mentioning primitive shaders specifically in that interview, only laying out an optimization (deferred attribute interpolation) that would only work with a primitive shader. So I'm deducing primitive shaders are enabled, only usable by AMD currently, and, given the shape of everything, not optimal. I realize I'm hedging a lot there, but that's what's been presented that I've seen. Like I said above, it's not clear where the stages begin and end, as they don't follow the standard pipeline structure. In the Linux driver they were creating giant monolithic shaders spanning many steps, so the primitive shader and DSBR may very well be the same shader.
> DSBR, according to the whitepaper, works with the Energy benchmark. So there's at least that.

I would be the last one to argue that, in the case of Vega FE's launch driver, and with a very high probability also for the currently available drivers for RX Vega.
> welp the 56 is cheaper than i can find a 1070 for and is faster […]

To be fair, tell us where you could find a Vega 56 already. I want one too!
> Source?

Buildzoid on Reddit is where I saw it, but a few other accounts as linked above. He took an FE straight to 1900 MHz with no performance impact. Only have my phone atm so can't find it.
> I am aware of Rys' tweet, but I also see the footnotes in the architecture whitepaper and Computerbase's assertion.

I realize that, but there seems to be a distinction between which optimizations are enabled. If the most basic primitive shader is a simple merger of the first two stages, then primitive shaders are "enabled". Those shaders with the 17 primitives per clock, improved culling, deferred attributes, etc. may not be. Hence we see conflicting results based on that definition. Being enabled for only one test is a rather weak example of enabled, but it does meet the criteria. So one person gets a yes, another a no, but "no" in regards to increased primitive rate or other abilities.
> From FE overclocking those were a fluke as the card starts inserting NOPs and otherwise idling. So clocks go higher and performance decreases. Have to actually test something to confirm the overclock is productive. Pascal does something similar.

This is interesting. I have never heard of anything like this happening on NVIDIA cards (or any other cards for that matter), though I do remember reading that GDDR5 had a feature like that.
> Are we sure that it's bandwidth starved?

ComputerBase's testing on overclocking memory alone, core clock alone, or both seems to indicate that overclocking HBM2 alone produces significant gains across 17 games.
> They use the 150% Power Target for OC, so you'd have to select the RX Vega 64 Max as baseline.
> Based on that, two titles score +6 and +7% respectively (Titanfall 2 and Watch Dogs 2), the rest 1-5%.

Fair enough, but it is still gaining more from memory overclocking than core overclocking. Do you not think that Vega is bandwidth starved then?
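To illustrate why the baseline choice matters here: if the memory-overclocked run also uses the 150% power target, its gain has to be measured against the power-uncapped "Max" result, otherwise the power headroom gets credited to the HBM. All fps values below are invented; only the bookkeeping is the point.

```cpp
// Illustration of the baseline argument above with made-up numbers.
#include <cstdio>

int main() {
    const double stockFps = 100.0;  // stock power target
    const double maxFps   = 106.0;  // 150% power target, memory at stock ("64 Max")
    const double memOcFps = 110.0;  // 150% power target + HBM overclock
    std::printf("vs stock : +%.1f%% (power headroom and memory gain mixed together)\n",
                (memOcFps / stockFps - 1.0) * 100.0);
    std::printf("vs Max   : +%.1f%% (memory overclock alone)\n",
                (memOcFps / maxFps - 1.0) * 100.0);
}
```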
> Fair enough, but it is still gaining more from memory overclocking than core overclocking. Do you not think that Vega is bandwidth starved then?

Given the fun I had with using and trying to apply Wattman settings and have the card actually behave accordingly, I would reserve my personal judgement, if I may, until that tool works as intended for Vega.