Sir Eric Demers on AMD R600

Status
Not open for further replies.
Well, why use VLIW then? I can see how that could be good for GPGPU, but for games? Doesn't it make the architecture very complex to optimize for?
 
Well, why use VLIW then? I can see how that could be good for GPGPU, but for games? Doesn't it make the architecture very complex to optimize for?
In all fairness, the vast majority of GPU architectures out there are VLIW; G8x is the exception.
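To illustrate the utilization problem being debated here: a VLIW machine only runs fast when the compiler can fill every slot of a bundle with independent operations. The toy Python sketch below (the instruction format and the greedy policy are invented for illustration; this is not AMD's actual shader compiler) packs consecutive independent ops into 5-wide bundles, loosely modeled on R600's 5-way units. A dependent chain of scalar ops gets one op per bundle, i.e. 20% utilization, which is exactly the "hard to optimize for" worry.

```python
def pack_bundles(ops, width=5):
    """Greedy VLIW bundle packing (toy model, RAW hazards only).

    ops: list of (dest, srcs) tuples in program order.
    An op joins the current bundle unless the bundle is full or the op
    reads a value produced inside that same bundle; a real scheduler
    would also handle WAW hazards, register ports, etc.
    """
    bundles, current, written = [], [], set()
    for dest, srcs in ops:
        if len(current) == width or any(s in written for s in srcs):
            bundles.append(current)       # close the bundle, start fresh
            current, written = [], set()
        current.append((dest, srcs))
        written.add(dest)
    if current:
        bundles.append(current)
    return bundles

# A dependent chain serializes: one op per bundle, 4 of 5 slots wasted.
chain = [("t1", ("a",)), ("t2", ("t1",)), ("t3", ("t2",))]
# Independent ops (e.g. a vec3 multiply) pack into a single bundle.
indep = [("x", ("a",)), ("y", ("b",)), ("z", ("c",))]
```

Here `pack_bundles(chain)` yields three one-op bundles while `pack_bundles(indep)` yields a single full-ish bundle, which is the whole scheduling game in miniature.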
 
Far Cry uses FP16 blending; I'm not sure about FP16 filtering. R5xx doesn't support FP16 filtering, but it runs Far Cry with HDR enabled without any special patch (which would, e.g., do FP16 filtering through a shader). I'm not 100% sure, but every article stated that FP16 blending was required - not a single mention of FP16 filtering.
Far Cry does take advantage of FP16 filtering if it's available. When it's not available, it does the filtering in the shaders.
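The fallback mentioned above - doing the filtering in the shaders when the hardware can't filter FP16 - boils down to four point samples plus a weighted blend per lookup. A minimal single-channel sketch in Python (the real thing would be pixel-shader code; texel addressing is simplified to clamp-to-edge, and the helper names are made up):

```python
import math

def bilinear_sample(tex, u, v):
    """Emulate bilinear filtering with four point samples and a blend.

    tex: 2D list of floats (one channel of an FP16 texture).
    u, v: normalized texture coordinates in [0, 1].
    This is what a shader must do per lookup when the hardware
    can only point-sample the format.
    """
    h, w = len(tex), len(tex[0])
    # Map to texel space, centered on texel centers.
    x, y = u * w - 0.5, v * h - 0.5
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    fx, fy = x - x0, y - y0

    def fetch(ix, iy):  # point sample with clamp-to-edge addressing
        return tex[max(0, min(h - 1, iy))][max(0, min(w - 1, ix))]

    # Blend the four nearest texels by the fractional offsets.
    top = fetch(x0, y0) * (1 - fx) + fetch(x0 + 1, y0) * fx
    bot = fetch(x0, y0 + 1) * (1 - fx) + fetch(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + fot * fy if False else top * (1 - fy) + bot * fy
```

Four fetches and three lerps per channel instead of one filtered fetch is why shader-based FP16 filtering costs real performance.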
 
http://www.anandtech.com/video/showdoc.aspx?i=3029&p=7


This article says that even the HD 2900 XT doesn't have the hardware to do well in DirectX 10, and also that geometry shaders don't perform well on the GPU.

True?


Well, I thought that the R600 was a lot more powerful than the G80 in GS...

It's a lot more powerful doing amplification, because what the G80 is doing ATM in that scenario is quite... umm... horrible. Whether that is merely a driver bug or a feature of the G80 is unknown ATM. There are more uses for the GS than amplification. And one must also factor in how much GS work is in a given scene... because ATI can be 1000x faster at it, but in a scene where only 1% of the workload is GS work, and the other 99% is slower on the R600, the advantage is useless.
 
In all fairness, the vast majority of GPU architectures out there are VLIW; G8x is the exception.


GF6 and up haven't used VLIW for NV, and so far NV's cards seem to have had better utilization of their ALUs since they dropped it. This could be due to the way shaders were written, but I don't think so, since the R420 should have had a much more pronounced advantage if that were the case.
 
ATI's chips, including R4x0, have a notable advantage in Oblivion that I've always been curious about. NV40 really bit the dust hard in that game, for some reason. G70 wasn't hugely better, either.
 
It's a lot more powerful doing amplification, because what the G80 is doing ATM in that scenario is quite... umm... horrible. Whether that is merely a driver bug or a feature of the G80 is unknown ATM. There are more uses for the GS than amplification. And one must also factor in how much GS work is in a given scene... because ATI can be 1000x faster at it, but in a scene where only 1% of the workload is GS work, and the other 99% is slower on the R600, the advantage is useless.

But real DX10 games will use the GS intensively, right? So doesn't that mean the R600 could keep up with those games a lot better than the G80?


PS: I know that real DX10 games will take a long time to arrive.
 
ATI's chips, including R4x0, have a notable advantage in Oblivion that I've always been curious about. NV40 really bit the dust hard in that game, for some reason. G70 wasn't hugely better, either.


I thought that was more due to the different HDR modes; I don't have any of those older cards to run tests on, though, lol.
 
But real DX10 games will use the GS intensively, right? So doesn't that mean the R600 could keep up with those games a lot better than the G80?


PS: I know that real DX10 games will take a long time to arrive.

GS will obviously get used within DX10, but I'd wager that geometry isn't going to account for more than 10% of the full GPU load in a given scene. So the point still stands -- you can be 1000% faster at a certain task, but if that task is only a small portion of the entire workload, it's not going to get you very far.
 
GS will obviously get used within DX10, but I'd wager that geometry isn't going to account for more than 10% of the full GPU load in a given scene. So the point still stands -- you can be 1000% faster at a certain task, but if that task is only a small portion of the entire workload, it's not going to get you very far.


I said that based on what Humus said here:

The R600 spanks the G80 seriously in the geometry shader. The GS is essentially useless on G80 as performance drops exponentially as you increase workload. We've measured R600 to be up to 50 times faster in some cases.


The 50x is the worst-case scenario for the G80. The G80 is sort of saved by the cap on GS output. If you could output more than 1024 scalars, chances are the gap would be even bigger. Essentially, the deal is that if you do things very DX9-style, the G80 can keep up, but for DX10-style rendering - like if you output more than just the input primitive - performance starts to drop off at an amazing rate. You don't need much amplification for it to become really bad. In real-world cases it might not be 50x, but you'll probably see much larger deltas than you're used to. You'd see maybe a 2-3x gap in dynamic branching in the previous generation in the best case, but for the GS you'll probably find real-world cases where the delta is more like 5-10x.
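For context on that 1024-scalar figure: D3D10 caps a GS invocation's output at 1024 scalar components, so the maximum amplification factor depends on how fat each output vertex is. A back-of-the-envelope sketch (the limit is from the spec; the vertex layouts below are made-up examples):

```python
# D3D10 limits a geometry shader invocation to 1024 scalar components
# of output, so amplification is bounded by output vertex size.
GS_OUTPUT_LIMIT = 1024  # scalars per invocation

def max_amplification(scalars_per_vertex):
    """Max vertices one GS invocation can emit for a given layout."""
    return GS_OUTPUT_LIMIT // scalars_per_vertex

# Hypothetical layouts:
# position only (float4)                    -> 4 scalars  -> 256 vertices
# position + normal + 2 UVs (4+3+2+2 = 11)  -> 11 scalars -> 93 vertices
```

So even at the spec's ceiling the amplification per invocation is modest, which is part of why the G80's drop-off shows up with so little amplification.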
 
Bear in mind that Humus, while being a very, very nice guy, happens to work for ATI, so his projections are made in the context of how ATI sees the future. Just like Sir Eric will always say that the ALU load will hit sky-high levels while the texturing load will stall or grow only slightly. I certainly appreciate and respect both of them, but their predictions are made according to the mindset that dominates ATI, and they don't necessarily translate 1:1 into the real world.

Again: it's fairly irrelevant to look at these aspects in isolation. Outside of tech demos, it'll always be a melange. Geometry amplification will not be used extensively for quite a while, because even if we have relative numbers between IHVs, we have next to nothing in terms of absolute performance numbers for amplification, and I have a hunch that they're not that extraordinary for first-gen hardware. Also, it's unrealistic to expect a huge GS load in near-future titles, simply because you can't really downgrade GS functionality to DX9, and devs have to actually sell software, not write tech demos with the newest stuff in. :) I do disagree with Anand (or whoever wrote that review) making assumptions about GS/overall DX10 handling by both IHVs from this limited dataset, though. This seems to be a very painful birth for both sides.
 
I said that based on what Humus said here:
Yes, and I'm not contesting what Humus wrote. The HD 2900's geometry shading capabilities are far superior to the G8x line's.

That's not the point.

The point is, if the GS work only constitutes 5% of the entire frame render time, then it could be a billion times faster than the competition but the overall performance advantage would be no more than a scant few percent. Get what I mean?

Here, let me help. Let's break down the percentage of time needed for each "function" in a hypothetical frame render in a game (these numbers are complete BS, but I want to make sure you're understanding what I'm saying):

3% Geometry Shader
12% Vertex Shader
49% Pixel Shader
18% Texturing
18% Z and culling

There. So, if your card spends 3% of its time working on the geometry shader, no matter HOW FAST that geometry shader might be, it's not making up for the other 97% of the frame. Get it now?
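The arithmetic behind this argument is just Amdahl's law. Using the (admittedly made-up) percentages above:

```python
# Amdahl's law applied to the hypothetical frame breakdown from the
# post above: speeding up one stage only helps in proportion to the
# fraction of frame time that stage occupies.
workload = {            # fraction of frame time per stage (made up)
    "geometry_shader": 0.03,
    "vertex_shader":   0.12,
    "pixel_shader":    0.49,
    "texturing":       0.18,
    "z_and_culling":   0.18,
}

def overall_speedup(fractions, stage, stage_speedup):
    """Overall frame speedup when one stage gets faster and the
    rest of the frame is unchanged."""
    f = fractions[stage]
    new_time = (1 - f) + f / stage_speedup
    return 1 / new_time

# A 10x faster GS:   1 / (0.97 + 0.003)   ~= 1.028, about 2.8% faster.
# A 1000x faster GS: 1 / (0.97 + 0.00003) ~= 1.031, still only ~3.1%.
```

Even an infinitely fast GS tops out at 1/0.97, roughly a 3.1% faster frame, which is the "scant few percent" the post is getting at.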
 
Even if the majority of developers are willing to exploit the GS's full potential at some point, the fact that one of the two major IHVs' products has a rather questionable-performing DX10 [GS] path - and that this particular IHV happens to hold a significant market-share lead - means it wouldn't be odd if next to no one bets on heavy use of GS amplification, just from plain marketing considerations... :rolleyes:
 
Even if the majority of developers are willing to exploit the GS's full potential at some point, the fact that one of the two major IHVs' products has a rather questionable-performing DX10 [GS] path - and that this particular IHV happens to hold a significant market-share lead - means it wouldn't be odd if next to no one bets on heavy use of GS amplification, just from plain marketing considerations... :rolleyes:

Well, quite true. If only one vendor has acceptable GS performance, how much time will developers spend writing such code? I mean, if you write a ton of GS stuff into your app and 85% of the hardware it runs on will run it at unacceptable speeds, then why even bother?

By the time GS usage catches on because it's widely available at performance-acceptable speeds from multiple vendors, the HD 2900 will be "old".
 
I mean, if you write a ton of GS stuff into your app and 85% of the hardware it runs on will run it at unacceptable speeds, then why even bother?
You're a bad prophet, aren't you? :LOL:

[Image: Steam hardware survey results chart]


This is the most recent survey from the Steam site; although it captures just a fraction of the total user base, it should be representative enough of the trends.

Well, either way, this picture gives a bad name to the whole DX10 market craze, though... :cool:
 