Grid 2 has exclusive Haswell GPU features

nAo · Jul 22, 2013

Davros said:
Does OIT make MSAA more expensive?

OIT is done by just adding some code to your shader so it interacts with MSAA the same way any other shader does, there is nothing special about it.
It makes more sense to say that MSAA makes shading more expensive since it increases the number of fragments that contribute to a given pixel.

nAo · Jul 22, 2013

Frontino said:
Do Intel HD Graphics EUs increase in bit-width between generations?
Because cpu-world.com http://www.cpu-world.com/info/Intel/Features_of_integrated_Intel_HD_graphics_units.html shows that HD 4000 units can execute 8 threads; does that mean they are 256 bits wide? Why isn't Intel clearer about its GPU specs?

From IVB to HSW the number of threads per EU went from 8 to 7; see the developer guide:
http://download-software.intel.com/...tion_Core_Graphics_Developers_Guide_Final.pdf

Since IVB each EU has two 4-wide SIMD pipelines and can execute up to 16 floating point operations per clock (one 4-wide multiply-add per pipeline).

Frontino · Jul 23, 2013

Thanks for the link.
I'm not getting the 320 and 640 flops/cycle, though. With 7 threads they should be 280 and 560, no?

Pete · Jul 23, 2013

20 EUs * (16 FLOPS / EU) = 320 FLOPS
40 * 16 = 640

FLOPS are based on available hardware (ALUs) per clock. Threads are executed one at a time, not simultaneously. Each EU keeps multiple threads in flight to have one to swap to when the current one stalls (think hyper-threading on Intel CPUs).
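
The arithmetic above can be written out as a minimal sketch. The 16 FLOPs per EU per clock figure is the one quoted in this thread for IVB/HSW; the function and constant names are made up for illustration. Note that the number of threads in flight never enters the product, since it only affects latency hiding.

```python
# Peak FLOPs per clock depends on ALU width, not on how many threads an EU keeps in flight.
FLOPS_PER_EU_PER_CLOCK = 16  # figure quoted in the thread for IVB/HSW EUs


def peak_flops_per_clock(num_eus, flops_per_eu=FLOPS_PER_EU_PER_CLOCK):
    """Return theoretical peak floating point operations per clock."""
    return num_eus * flops_per_eu


print(peak_flops_per_clock(20))  # 320
print(peak_flops_per_clock(40))  # 640
```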

Frontino · Jul 23, 2013

Were the Sandy Bridge 6 EU models 16 flops/cycle too?

Pete · Jul 23, 2013

Nope, judging from this "Peak Theoretical GPU Performance" table: 12 for Intel HD 2000/3000 vs 16 for 4x00/5x00.

Andrew Lauritzen · Jul 24, 2013

If I remember correctly, in Sandy Bridge the second pipeline could do mul or add, but not mad... hence the 8+4 = 12/clock. In Ivy Bridge and beyond both can do full 4-wide mad.
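
As a rough sketch of that breakdown (based only on the recollection above, with a multiply-add counted as 2 FLOPs per lane and a plain mul or add as 1; all names are hypothetical):

```python
# Per-EU FLOPs per clock, given what each 4-wide pipeline can issue.
LANES = 4
FLOPS_PER_LANE = {"mad": 2, "mul_or_add": 1}


def eu_flops_per_clock(pipeline_ops):
    """Sum the per-clock FLOPs over the pipelines of one EU."""
    return sum(LANES * FLOPS_PER_LANE[op] for op in pipeline_ops)


sandy_bridge = eu_flops_per_clock(["mad", "mul_or_add"])  # 8 + 4
ivy_bridge = eu_flops_per_clock(["mad", "mad"])           # 8 + 8
print(sandy_bridge, ivy_bridge)  # 12 16
```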

Rakehell · Jul 26, 2013

Davros said:
Just a heads up
Grid 2 supports avx, and avx exclusive features seem to be advanced blending and smoke shadows

So that's how they were able to get "comparable framerates" to the GT 650M.

Andrew Lauritzen · Jul 27, 2013

Rakehell said:
So that's how they were able to get "comparable framerates" to the GT 650M.

Read any other post in the thread maybe... although I do enjoy the odd demonstration of fanboy-level confirmation bias

Rakehell · Jul 30, 2013

Andrew Lauritzen said:
Read any other post in the thread maybe... although I do enjoy the odd demonstration of fanboy-level confirmation bias

Saying Intel GPUs are slow makes one a fanboy? You're on the defense tonight.

Andrew Lauritzen · Jul 30, 2013

Rakehell said:
Saying Intel GPUs are slow makes one a fanboy? You're on the defense tonight.

Except that's not what you said at all... you replied to the original post in the thread that was completely wrong (AVX? lol) and implied there was some sort of special performance optimization going on only on Intel which is blatantly untrue.

In reality, I'm pretty sure you already have the position that "Intel GPUs are slow" and you just look around for anything that confirms that position and ignore anything else. In this case it was just particularly funny since the information you found was incorrect/unrelated (and didn't even make any sense) but you didn't even read the very next post before assuming it was true since it fell in line with what you want to think.

If you want to learn, please feel free to read the thread and ask any questions. If you're just going to mess up a good technical discussion with ignorant comments, there's always neogaf

But maybe I misunderstood your first post here. Happy to be convinced by your next few that I'm wrong about you and you have something interesting to contribute

Paran · Jul 30, 2013

From Research to Production, How AVSM and AOIT made their way into games: http://software.intel.com/sites/default/files/From-Research-to-Production-final.pdf

AOIT Sample: http://software.intel.com/en-us/blo...ency-approximation-with-pixel-synchronization
AVSM Sample: http://software.intel.com/en-us/blogs/2013/03/27/adaptive-volumetric-shadow-maps

AOIT seems to work mainly on foliage in Grid 2, and in the sample tool only on foliage. Chain-link fences as well in Grid 2. Does AOIT theoretically work on all kinds of transparent textures, or is there a limitation?

Andrew Lauritzen · Jul 30, 2013

Paran said:
AOIT seems to work mainly on foliage in Grid 2, and in the sample tool only on foliage. Chain-link fences as well in Grid 2. Does AOIT theoretically work on all kinds of transparent textures, or is there a limitation?

Yes, it works on anything you want to blend. In Grid 2, foliage and chain-link fences just happen to be the things they would typically render with alpha test/alpha-to-coverage, and AOIT provides a much nicer image than those.

Games tend to be designed to minimize blending *because* of the OIT problem. Going forward I imagine techniques like AOIT will allow artists to use more blended things than they have been able to in the past. The response from everyone so far has been that they really enthusiastically want/need these features and would like to see them on other platforms too (specifically the new consoles), so I think we'll gradually see other hardware support them too.

Paran · Jul 30, 2013

This would be ideal for deferred lighting games without proper MSAA support. OIT would help with all the flickering foliage and other transparency stuff, and for polygon smoothing some PP-AA would do it, preferably PP-AA with high detail preservation like SMAA. Or 2xSSAA combined with OIT, although this wouldn't be useful for integrated graphics. A shame that Nvidia and AMD don't support OIT. Microsoft should make this mandatory in a future DirectX revision.

3dcgi · Aug 3, 2013

Paran said:
This would be ideal for deferred lighting games without proper MSAA support. OIT would help with all the flickering foliage and other transparency stuff, and for polygon smoothing some PP-AA would do it, preferably PP-AA with high detail preservation like SMAA. Or 2xSSAA combined with OIT, although this wouldn't be useful for integrated graphics. A shame that Nvidia and AMD don't support OIT. Microsoft should make this mandatory in a future DirectX revision.

Any DX11 part supports OIT. They just don't support it in exactly the same way.

Paran · Aug 3, 2013

3dcgi said:
Any DX11 part supports OIT. They just don't support it in exactly the same way.

Sure but nobody does it over DX11.

nAo · Aug 3, 2013

Paran said:
This would be ideal for deferred lighting games without proper MSAA support. OIT would help with all the flickering foliage and other transparency stuff, and for polygon smoothing some PP-AA would do it, preferably PP-AA with high detail preservation like SMAA. Or 2xSSAA combined with OIT, although this wouldn't be useful for integrated graphics. A shame that Nvidia and AMD don't support OIT. Microsoft should make this mandatory in a future DirectX revision.

That's not quite correct.
Any DX11 GPU allows you to "record" all fragments that contribute to a pixel into a variable-size data structure (e.g. a list). Once you have such data you can do pretty much whatever you want with it, including sorting it and compositing it for OIT.

The main drawback of these methods is that the more transparent stuff you render, the more memory you need, so it's hard to determine how much memory to allocate in advance.
Too much and you waste it, too little and parts of your transparent geometry won't appear on the screen. Also sorting a lot of fragments per pixel can be inefficient and generate not-so-predictable & stable performance.
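
The record-then-sort approach described above can be illustrated with a small CPU-side sketch. This is not shader code and not any vendor's implementation; the class and method names are hypothetical, and larger depth is assumed to mean farther from the camera. It shows both properties mentioned: the result does not depend on submission order, and storage grows with every fragment recorded.

```python
from collections import defaultdict


class FragmentLists:
    """Per-pixel, variable-size fragment storage (a CPU stand-in for a GPU list)."""

    def __init__(self):
        self.lists = defaultdict(list)

    def record(self, x, y, depth, rgb, alpha):
        # Every transparent fragment is kept, so memory use is unbounded.
        self.lists[(x, y)].append((depth, rgb, alpha))

    def fragment_count(self):
        return sum(len(frags) for frags in self.lists.values())

    def resolve(self, x, y, background):
        # Sort back to front, then blend each fragment over the running color.
        color = background
        frags = sorted(self.lists.get((x, y), []), key=lambda f: f[0], reverse=True)
        for _depth, rgb, alpha in frags:
            color = tuple(alpha * c + (1.0 - alpha) * b for c, b in zip(rgb, color))
        return color


red = (1.0, (1.0, 0.0, 0.0), 0.5)   # nearer fragment
blue = (2.0, (0.0, 0.0, 1.0), 0.5)  # farther fragment

a, b = FragmentLists(), FragmentLists()
for frag in (red, blue):
    a.record(0, 0, *frag)
for frag in (blue, red):
    b.record(0, 0, *frag)

black = (0.0, 0.0, 0.0)
print(a.resolve(0, 0, black) == b.resolve(0, 0, black))  # True
```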

Alternative methods based on pixel synchronization (we developed one, but I am sure ISVs will come up with many others) allow you to compute an approximate OIT solution as you render the transparent geometry into a fixed-size memory buffer, which makes the algorithm use a known amount of memory (e.g. 16 bytes per pixel) and also provides predictable/stable performance. Changing the amount of memory you allocate for each pixel makes it possible to trade off image quality for performance (i.e. higher quality -> lower performance).
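
To make the memory argument concrete, here is a back-of-the-envelope sketch. The 16 bytes per pixel is the example figure from the post; the 1920x1080 resolution, the 12-byte list node and the layer counts are assumptions chosen purely for illustration.

```python
def fixed_budget_bytes(width, height, bytes_per_pixel=16):
    """Memory for a fixed-size per-pixel buffer: known before rendering starts."""
    return width * height * bytes_per_pixel


def fragment_list_bytes(width, height, layers_per_pixel, bytes_per_node=12):
    """Memory for per-pixel lists: depends on how much transparency gets drawn.

    bytes_per_node is an assumed size (e.g. packed color, depth, next index).
    """
    return width * height * layers_per_pixel * bytes_per_node


W, H = 1920, 1080  # assumed resolution
print(fixed_budget_bytes(W, H))      # 33177600
print(fragment_list_bytes(W, H, 2))  # 49766400
print(fragment_list_bytes(W, H, 8))  # 199065600
```

The fixed budget stays the same however many transparent layers a scene has, while the list-based figure scales with scene content that is only known after rendering.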

Paran · Aug 3, 2013

nAo said:
That's not quite correct.
Any DX11 GPU allows you to "record" all fragments that contribute to a pixel into a variable-size data structure (e.g. a list). Once you have such data you can do pretty much whatever you want with it, including sorting it and compositing it for OIT.

I didn't say it wouldn't be possible on DX11; I was under the impression that it wouldn't make sense for efficiency/performance reasons, which is why no dev went this route.

Andrew Lauritzen · Aug 4, 2013

3dcgi said:
Any DX11 part supports OIT. They just don't support it in exactly the same way.

Yeah but that's like saying any GPU supports ray tracing by rendering 1x1 viewports

3dcgi · Aug 4, 2013

Andrew Lauritzen said:
Yeah but that's like saying any GPU supports ray tracing by rendering 1x1 viewports

Obviously that's hyperbole as there are interactive demos with OIT and ray tracing. At least one workstation app implemented OIT as well. Just noting it for others that don't know.

It would have been interesting if GRID 2 supported a vanilla DX11 method so we could see the performance difference. Even if it's significantly slower (and I'm not convinced as to the level of significance yet) high end cards can brute force it.
