AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Well, if it's not what you meant I'd like to know, since I've never heard of performance problems with translucent geometry on deferred.

I'm not a professional programmer, and I don't work in the games industry, so I could easily be wrong. My general understanding was that handling transparency was one of the tradeoffs of the deferred approach. Maybe it's not a performance issue. You can correct me.
 
You couldn't have a rendering technique in a game that can't cope with transparencies... It wouldn't work. Besides, I heard it said at the time that DOOM's renderer was like, semi-deferred. Pseudo? Half deferred? :LOL:

Something like that anyway.
 
I'm not a professional programmer, and I don't work in the games industry, so I could easily be wrong. My general understanding was that handling transparency was one of the tradeoffs of the deferred approach. Maybe it's not a performance issue. You can correct me.
It's a performance vs. accuracy trade-off: forward is faster, while deferred needs to hold many overlapping samples to compute transparency accurately.
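To illustrate why forward transparency is order-sensitive in the first place: a minimal sketch of the standard alpha "over" operator, compositing back-to-front over a background (single channel, made-up values, not any particular engine's code):

```python
# Sketch: "over" alpha blending as used in a forward transparency pass.
# Each surface is (color, alpha); surfaces must be composited back-to-front,
# which is why forward renderers sort transparent geometry by depth.

def blend_over(dst, src_color, src_alpha):
    """Standard alpha 'over' operator: src over dst."""
    return src_alpha * src_color + (1.0 - src_alpha) * dst

def composite(background, surfaces_back_to_front):
    color = background
    for c, a in surfaces_back_to_front:
        color = blend_over(color, c, a)
    return color

bg = 0.0  # black background, one channel for simplicity
order_a = composite(bg, [(1.0, 0.5), (0.2, 0.5)])
order_b = composite(bg, [(0.2, 0.5), (1.0, 0.5)])
print(order_a, order_b)  # the two orders give different results
```

Swapping the surface order changes the final colour, which is exactly the ordering problem that deferred pipelines (and OIT schemes) have to work around.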
 
I'm not a professional programmer, and I don't work in the games industry, so I could easily be wrong. My general understanding was that handling transparency was one of the tradeoffs of the deferred approach. Maybe it's not a performance issue. You can correct me.
I just experiment a little myself and have stuck to forward rendering, which is why I'm interested in what you heard; I've only read about deferred techniques and was hoping to learn something new.

and deferred required to hold many overlapping samples to compute accurately.
What technique is this? I'm only familiar with a separate pass for alpha on a separate buffer.
 
What technique is this? I'm only familiar with a separate pass for alpha on a separate buffer.
Roughly Intel's PixelSync, where the x closest samples are stored and subsequent samples are compressed into the bottom one. That's what's happening with order-independent transparency. It's more of a blending technique than a forward/deferred one.
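A toy sketch of that idea, keeping the K nearest fragments per pixel and merging overflow into the farthest slot (this is my own simplification of the k-buffer concept, not Intel's actual PixelSync API, and K and the fragment values are arbitrary):

```python
# Per-pixel k-buffer sketch: keep the K nearest fragments exactly; when a
# new fragment arrives and the buffer is full, merge the two farthest
# fragments with the 'over' operator so the total never exceeds K.

K = 4

def insert_fragment(frags, depth, color, alpha):
    """frags: list of (depth, color, alpha), kept sorted near-to-far."""
    frags.append((depth, color, alpha))
    frags.sort(key=lambda f: f[0])
    if len(frags) > K:
        # Merge the two farthest fragments (nearer of the pair over farther).
        (d1, c1, a1), (d2, c2, a2) = frags[-2], frags[-1]
        merged_a = a1 + (1 - a1) * a2
        merged_c = (a1 * c1 + (1 - a1) * a2 * c2) / merged_a if merged_a else 0.0
        frags[-2:] = [(d1, merged_c, merged_a)]
    return frags

frags = []
for d, c, a in [(5.0, 0.8, 0.5), (1.0, 0.1, 0.5), (3.0, 0.4, 0.5),
                (4.0, 0.6, 0.5), (2.0, 0.2, 0.5)]:
    insert_fragment(frags, d, c, a)
print(len(frags))  # capped at K; nearest fragments stay exact
```

The nearest fragments stay exact while the tail is approximated, which is the accuracy trade-off mentioned above.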
 
What technique is this? I'm only familiar with a separate pass for alpha on a separate buffer.
One of the rare cases where I have heard of a deferred path being used for transparent surfaces is Inferred Lighting. (Red Faction: Armageddon, Saints Row 3)
http://www.gdcvault.com/play/1014525/Lighting-the-Apocalypse-Rendering-Techniques

Most games just use a forward pass for transparent surfaces. (Forward+ is becoming quite common as it allows proper lighting.)
For particle lighting, a separate lighting cache/texture pass is becoming quite common as well. (Doom)
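For a rough idea of the Forward+ part: its core step is building a per-screen-tile list of lights before the forward shading pass. Here's a toy 2D version that tests light bounding circles against screen tiles (real implementations cull against per-tile view frustums in a compute shader; tile size and lights here are made up):

```python
# Simplified sketch of Forward+ tiled light culling: the screen is divided
# into tiles, and each tile gets a list of lights whose screen-space
# bounding circle overlaps it. The forward pass then only evaluates the
# lights in its tile's list instead of every light in the scene.

TILE = 16  # tile size in pixels

def cull_lights(screen_w, screen_h, lights):
    """lights: list of (x, y, radius) in screen space.
    Returns {(tile_x, tile_y): [light indices]}."""
    tiles = {}
    for tx in range(0, screen_w, TILE):
        for ty in range(0, screen_h, TILE):
            hits = []
            for i, (lx, ly, r) in enumerate(lights):
                # Closest point on the tile rectangle to the light centre.
                cx = min(max(lx, tx), tx + TILE)
                cy = min(max(ly, ty), ty + TILE)
                if (lx - cx) ** 2 + (ly - cy) ** 2 <= r * r:
                    hits.append(i)
            tiles[(tx // TILE, ty // TILE)] = hits
    return tiles

tiles = cull_lights(64, 64, [(8, 8, 10), (60, 60, 5)])
print(tiles[(0, 0)], tiles[(3, 3)])  # each corner tile sees only its light
```

Because the light lists are built before shading, transparent surfaces can read the same lists as opaque ones, which is why Forward+ "allows proper lighting" of transparencies.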
 
So HBCC is not only useful when VRAM is saturated; it seems to be a better way to manage memory in general, whether VRAM is full or not?
 
Not sure what can really be made of that demo.
As an example, one real-world game actually performed worse with 12GB HBCC (4GB from system RAM) and required 16GB HBCC (8GB from system RAM) to gain 2-8% fps on average over normal operation, depending on resolution: https://www.computerbase.de/2017-10...diagramm-radeon-rx-vega-64-mit-hbcc-2560-1440
I think the demos could be comparable to what was seen with async compute: in theory and in some demos the gains were impressive, but in practice on PC it works out at around 4% to 10% at best, and again unfortunately not consistently. Still, a minimum of 5% and at times 10% is not something to dismiss, as it all helps.
In fact, some games give a counter-intuitive result, with HBCC performance gains weaker at 4K than at 1080p: http://www.pcgameshardware.de/Mitte...erde-Schatten-des-Krieges-Benchmarks-1240756/
Even if some of this can be rectified (whether by driver updates or game patches), it does unfortunately mean inconsistent behaviour.

For general relevance, and possibly just as important for HBCC: how many gaming-focused buyers go for 32GB of 3200MHz+ DDR4 memory? It is more of a headache when going with Ryzen, as it really needs high-clock, tight-timing memory, and that comes at quite a painful price.
I guess it is possible to get by with 16GB of DDR4, but setting aside 8GB of that for HBCC reduces system memory to a threshold that may cause problems (depending on what is running on the PC, memory leaks, reserved memory, etc.).

I think HBCC is a good concept and every little bit of performance helps, but for real-world gains and practicality I do wonder whether a PC will then need a minimum of 24GB, as there is no perfect system environment when it comes to a general user's domestic gaming PC.
 
One of the rare cases where I have heard of a deferred path being used for transparent surfaces is Inferred Lighting. (Red Faction: Armageddon, Saints Row 3)
http://www.gdcvault.com/play/1014525/Lighting-the-Apocalypse-Rendering-Techniques

Most games just use a forward pass for transparent surfaces. (Forward+ is becoming quite common as it allows proper lighting.)
For particle lighting, a separate lighting cache/texture pass is becoming quite common as well. (Doom)

Maybe a good place to find this is the Forward+ presentation from AMD, or on GPUOpen https://gpuopen.com/ ... but I don't have much time to look into it right now.
 
HBCC gains were usually seen in the Superposition 1080p Extreme benchmark, in the range of 10-15%, and much less in the 4K benchmark. I tested it yesterday with the latest driver, and now I get the HBCC-on results even with it off.

So it's doing something that helps frame rate even when you're nowhere near being VRAM-limited.
 
Comparing a 16GB Vega FE and an 8GB RX with various combinations of HBCC may help confirm whether the perceived capacity of video memory can influence the outcome.

HBCC could help smooth out rough spots where the resource allocation strategy is not optimized, or is conservative with respect to memory capacity. Increasing what the software thinks it has available may allow for more optimistic allocation, and HBCC could paper over a certain number of internal misses.

Part of the trade-off for what is allocated is the heavier synchronization of a high-level transaction for pulling in resources. HBCC's capacity may lengthen the time between the points where the software is forced to incur that overhead. Potentially, the hardware's cache-like management can take shortcuts thanks to assumptions it can make that requests from a higher-level abstraction might not, as long as its automatic behaviour is correct often enough.

I am not clear on whether HBCC can allow the GPU to be more aggressive about indicating to software that a resource has been loaded, effectively pretending a barrier is sufficiently close to being satisfied and counting on the paging functionality to cover actual misses. That could be a separate consideration from the capacity the software thinks it has, although it might incur some of the heavier overheads of those higher-level transactions unless the GPU/driver are able to intercept parts of the process.
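As a toy model of the cache-style management being discussed (my own illustration, not AMD's actual implementation; page size and capacities are arbitrary): treat local VRAM as an LRU cache of pages over a larger virtual pool, where a touch to a non-resident page evicts the least recently used page and counts as a page-in from system RAM.

```python
from collections import OrderedDict

# Toy model of HBCC-style memory management: VRAM holds a fixed number of
# resident pages; touching a non-resident page is a miss that is serviced
# by paging it in from system memory and evicting the LRU page.

class PageCache:
    def __init__(self, resident_pages):
        self.capacity = resident_pages
        self.resident = OrderedDict()  # page id -> None, in LRU order
        self.page_ins = 0

    def touch(self, page):
        if page in self.resident:
            self.resident.move_to_end(page)    # hit: refresh LRU position
            return True
        self.page_ins += 1                     # miss: fetch from system RAM
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)  # evict least recently used
        self.resident[page] = None
        return False

cache = PageCache(resident_pages=4)
for page in [0, 1, 2, 3, 0, 1, 4, 0]:
    cache.touch(page)
print(cache.page_ins)  # misses incurred over the access pattern
```

The point of the model: the software sees the full virtual pool (here, any page id), while the hardware decides residency automatically, and the cost is concentrated in the misses.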
 
OK, first benchmarks are up. Wolfenstein 2 supports FP16, deferred rendering and GPU culling; the last two do practically nothing but hurt performance on both AMD and NVIDIA GPUs. The game also has support for async compute, which benefits both Pascal and Vega but is unavailable on Maxwell.
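For context on the FP16 support mentioned above: half precision trades accuracy for throughput, with roughly 3 decimal digits of precision and a maximum finite value of 65504. A quick way to see the rounding is to round-trip values through IEEE 754 binary16 using Python's `struct` `'e'` format:

```python
import struct

# Round-trip a float through IEEE 754 half precision (binary16) to see
# what FP16 shader math actually works with: ~11 significant bits and a
# max finite value of 65504.

def to_fp16(x):
    """Round a Python float to the nearest representable FP16 value."""
    return struct.unpack('e', struct.pack('e', x))[0]

print(to_fp16(0.1))                         # no longer exactly 0.1
print(to_fp16(65504.0))                     # largest finite FP16 value
print(to_fp16(2048.0) == to_fp16(2049.0))   # integers above 2048 collide
```

That loss is usually acceptable for colour and lighting math, which is why renderers can use FP16 for extra throughput on hardware like Vega's rapid packed math.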

As for performance, NVIDIA GPUs hold the upper hand in the taxing scenes: the GTX 1080 is 20% faster than Vega 64, while the 1070 is 10% faster. In less taxing scenes, the Vega cards pick up some gains and Vega 64 becomes ~10% faster than the GTX 1080. This discrepancy in performance confused ComputerBase, and there is no solid explanation for it yet. It's also worth noting that NVIDIA has not yet released a game-ready driver, which could be the source of the discrepancy.

https://www.computerbase.de/2017-10...nstein-2-3840-2160-anspruchsvolle-testsequenz

EDIT:
PCGH numbers are up, Vega 64 = GTX 1080 according to their test sequence.
http://www.pcgameshardware.de/Wolfe...us-im-Technik-Test-Benchmarks-Vulkan-1242138/
 