On Ati's "optimization"

Pete said:
Tagrineth said:
Nobody's even suggested testing cat3.5 and/or cat3.6 on old versions of 3DMark03 to test that the optimisation really is out of the driver. 8)
If you meant 3.2 and 3.3, I have. :)

3.2 and 3.3 should both have the optimisation, as should 3.4.
 
Woo-hoo!!!

I am SO glad that ATi came out with that statement, and on the same day as the accusation/inference/general questioning! I was really worried about it when I first read about ATi having questionable results with gametest 4...but having them come out and acknowledge it without skirting it too much is about as good a response as I think any company could make!

Thanks ATi, thank you very much. :)

OT, but how exactly is nVidia claiming best performance in Doom3 when there ain't any benchmark available for it yet to anyone but them? It don't seem exactly right, nor verifiable. :( (Johnny-boy, do you SEE the trouble your sell-out is gonna cause?!?!!? :( )
 
...a program which costs hundreds of thousands of dollars to participate in...

This appears to be one of the main planks of Nvidia's corporate spin on the 3DMark03 fiasco & has been much bandied about by them. It is, of course, utter hogwash... As a publicly listed entity with their revenue/profit stream, & hence, contribution to the IRS, even $0.2M would be a trivial & legitimate business deduction. They spend as much on an afternoon's stripper (s)extravaganza; the whole line is tantamount to Nvidia crying poor...;)
 
whatever firm they hired to do their PR is making them look worse by the minute......LOL ATi handled it with class and nVidia told us to kiss their a$$...But like many others here I am saying No thanks nVidia, I'll pass.
 
Kudos to Ati for the response. Would like to see the same thing from Nv.

As for the fee, Nv was a strategic member, wasn't it? A $5,000 minimum doesn't mean that $100K+ is impossible :)
 
DaveBaumann said:
ATI's official statement:

The 1.9% performance gain comes from optimization of the two DX9 shaders (water and sky) in Game Test 4. We render the scene exactly as intended by Futuremark, in full-precision floating point. Our shaders are mathematically and functionally identical to Futuremark's and there are no visual artifacts; we simply shuffle instructions to take advantage of our architecture. These are exactly the sort of optimizations that work in games to improve frame rates without reducing image quality and as such, are a realistic approach to a benchmark intended to measure in-game performance. However, we recognize that these can be used by some people to call into question the legitimacy of benchmark results, and so we are removing them from our driver as soon as is physically possible. We expect them to be gone by the next release of CATALYST.

I'm very pleased with ATI's stance on the issue.

If I were on the nVidia board, however, I would be hiring a new PR department ASAP. Their statement commits - from a professional view - not one but two big fundamental no-no's of damage control:

1) They go out of their way not to address the issues in any professional manner, thus missing the opportunity to take at least some control of where the focus will go from here. They don't even hand out some of that joker smoke-screen info that would at least buy time, or might even confuse people's understanding of the substance of the charge.

2) They attack the messenger head on, saying that Futuremark is really the one guilty of fiddling and of doing the damage to nVidia.

I don't know who the hell made nVidia's statement, but I find it very hard to believe it was anybody with any professional insight into PR matters. We are looking at high-school-grade PR quality here, and if this were the "defence" for anybody running for public office, he/she might as well go out back and shoot themselves. (Yes, it's really that bad.)
 
mczak said:
Xmas said:
mczak said:
Yes, but software design also says you don't gain much optimizing at the lowest level (i.e. reordering instructions).
And CPU design shows that this is phenomenally wrong. ;)
Optimizing inner loops very often results in huge performance improvements, and a shader is nothing but a single "inner loop".
I wouldn't call it phenomenally wrong, but the gains can indeed be considerable (it won't help a selection sort beat a quicksort, though). The potential for optimizing low-level shader code should, however, be smaller than for CPUs, I suppose (as you probably can't do much more than re-ordering). But maybe I'm wrong again - it's a bit too late over here and I definitely need some sleep :?

mczak-

If I understand Xmas' point, he's alluding to the fact that the reason small-scale reordering of assembly code doesn't buy you much on a modern general-purpose CPU (specifically a current x86 or RISC CPU) is that essentially optimal reordering is done on the CPU itself, since they all have out-of-order execution. Of course this isn't true at all with an in-order VLIW-style microprocessor, where code scheduling is extremely important. (This goes for Itanium, although IPF is a bit more flexible than pure VLIW.)

Anyways, current shader pipelines are way too simple to qualify as any sort of CPU style, RISC, CISC or VLIW; but the ability of R3xx to do vec3 + scalar in parallel does present an optimizing opportunity similar to that of a VLIW (albeit much simpler).
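
To make that concrete, here's a toy sketch of that kind of co-issue scheduling (purely illustrative Python - the instruction set and the greedy pairing scheme are made up for the example, not anything from ATI's actual driver):

```python
# Toy model: pair an independent scalar op with each vec3 op, R3xx-style.
# Each instruction: (text, unit, registers read, registers written)
shader = [
    ("mul r0.xyz, v0, c0", "vec3",   {"v0", "c0"}, {"r0"}),
    ("rsq r1.w, v1.w",     "scalar", {"v1"},       {"r1"}),
    ("add r2.xyz, r0, c1", "vec3",   {"r0", "c1"}, {"r2"}),
    ("mul r3.w, r1.w, c2", "scalar", {"r1", "c2"}, {"r3"}),
]

def independent(a, b):
    # No RAW, WAR or WAW hazard between the two instructions.
    return not (a[3] & (b[2] | b[3]) or b[3] & a[2])

def schedule(instrs):
    # Greedy: take the next instruction, then look ahead for the first
    # hazard-free instruction that uses the *other* unit and co-issue it.
    slots, pending = [], list(instrs)
    while pending:
        first, partner = pending.pop(0), None
        for i, cand in enumerate(pending):
            # cand may only be pulled forward if it has no hazard with
            # `first` *or* with anything it would jump over.
            if (cand[1] != first[1]
                    and independent(first, cand)
                    and all(independent(skip, cand) for skip in pending[:i])):
                partner = pending.pop(i)
                break
        slots.append((first, partner))
    return slots

for n, (a, b) in enumerate(schedule(shader)):
    print(f"slot {n}: {a[0]:22}" + (f" || {b[0]}" if b else ""))
```

As written, the four instructions issue in two slots instead of four - same math, same results, just reordered. That's the whole trick ATI is describing.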

I for one am a bit surprised that ATI's compiler isn't able to catch such opportunities and reschedule for them in the general case; but perhaps the GT4 shader code is "difficult" in some way which fools the compiler in its current state. Or maybe they figured that for now they're better off special-casing every single shader: either through devrel in the case of a game, or through driver cheats in 3dMark.
 
NVidia's maths for the $200K is a composite number if you read their statement carefully: the license fee, plus the cost of employees' time to "improve our drivers and optimise how we handle their shaders, to fix this mess FutureMark got us into."

Rather than the more honest "we have to trade performance badly for image quality: if a DirectX 9 test asks for FP24, we have to go way over it to FP32, making our performance suck even more" :)

ATi made a brilliant reply - whereas nVidia's was classless (borrowing from 3dGPU):

NVidia - The way you're meant to be played :rolleyes:
 
Dio said:
mczak said:
hard time to think this is feasible. In general, that's a much more difficult problem than an optimizing on-the-fly HLSL compiler.
Re-analysing assembly to generate new code isn't significantly harder than doing it from the high-level version.

Depends. If, for instance, you're receiving a Taylor-series expansion of a sine function, it can be quite hard to bring it back to a SIN hardware opcode. High-level semantics can be quite useful for the compiler.
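
A quick illustration of the problem (hypothetical Python standing in for shader code - the constants are just the standard Taylor/Horner coefficients, nothing vendor-specific): once sin(x) has been lowered to a polynomial, the instruction stream is a chain of MULs and MADs with "magic" constants, and nothing in it says "sine" anymore.

```python
import math

def sin_lowered(x):
    # What the driver might receive: sin(x) already expanded to
    # x - x^3/3! + x^5/5! - x^7/7!, written Horner-style as one MUL
    # followed by a few MADs. Recovering a single SIN opcode from this
    # means pattern-matching the exact constants and expression shape.
    x2 = x * x                                                    # MUL
    return x * (1 + x2 * (-1/6 + x2 * (1/120 + x2 * (-1/5040))))  # MADs

x = 0.5
print(sin_lowered(x), math.sin(x))  # ~0.479426 both - but only on a limited range
```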
 
DaveBaumann said:
ATI's official statement:

The 1.9% performance gain comes from optimization of the two DX9 shaders (water and sky) in Game Test 4. We render the scene exactly as intended by Futuremark, in full-precision floating point. Our shaders are mathematically and functionally identical to Futuremark's and there are no visual artifacts; we simply shuffle instructions to take advantage of our architecture. These are exactly the sort of optimizations that work in games to improve frame rates without reducing image quality and as such, are a realistic approach to a benchmark intended to measure in-game performance. However, we recognize that these can be used by some people to call into question the legitimacy of benchmark results, and so we are removing them from our driver as soon as is physically possible. We expect them to be gone by the next release of CATALYST.

This has all become an insane witch hunt. Re-arranging instructions seems a perfectly valid thing to do and I'd be disappointed if there wasn't some form of it in the driver.
 
Humus said:
Dio said:
mczak said:
hard time to think this is feasible. In general, that's a much more difficult problem than an optimizing on-the-fly HLSL compiler.
Re-analysing assembly to generate new code isn't significantly harder than doing it from the high-level version.
Depends. If, for instance, you're receiving a Taylor-series expansion of a sine function, it can be quite hard to bring it back to a SIN hardware opcode. High-level semantics can be quite useful for the compiler.
OK, OK, you got me there, and there are things like that in existing code.

It's less of an issue with DX9 class input than DX8 class input because the shader language is more capable and some of the edge cases in the spec have been settled in a way that is better for optimisers.
 
Simon F said:
This has all become an insane witch hunt. Re-arranging instructions seems a perfectly valid thing to do and I'd be disappointed if there wasn't some form of it in the driver.

Agreed. The question is whether the driver identifies opportunities to rearrange instructions based on general-case rules, or whether it can only look for one particular shader in 3DMark03 and replace it with a hand-coded alternative. Apparently a trivial change to the shader is enough to prevent ATI's driver from making the optimization, which makes it extremely likely this is just a search-and-replace.
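
For what it's worth, the mechanism people suspect would only take a few lines. A minimal sketch (hypothetical Python with invented names - nobody outside ATI has seen the actual code): hash the incoming shader bytes and swap in a hand-scheduled replacement on an exact match. Any one-byte edit to the shader misses the hash and silently falls back to the generic path - exactly the behaviour observed when the GT4 shaders were trivially altered.

```python
import hashlib

# Hypothetical shader texts; stand-ins for 3DMark03's GT4 water shader
# and a hand-scheduled equivalent.
ORIGINAL = b"ORIGINAL_GT4_WATER_SHADER"
HAND_TUNED = {hashlib.md5(ORIGINAL).hexdigest(): b"HAND_SCHEDULED_WATER_SHADER"}

def compile_shader(source: bytes) -> bytes:
    key = hashlib.md5(source).hexdigest()
    if key in HAND_TUNED:
        return HAND_TUNED[key]   # exact match: substitute the special case
    return source                # everything else: generic compilation path

print(compile_shader(ORIGINAL))         # hand-tuned version kicks in
print(compile_shader(ORIGINAL + b" "))  # one byte different -> no match
```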
 
IMHO

Nvidia is cheating, ATi too.
These are the facts, like it or not. Sure, NV is more "guilty", but I just can't understand why most people are "proud" of ATi's answer and bash NV. Excuse me, but on another, more pro-NV forum I see exactly the opposite - bashing FM for optimising 3DMark03 for ATi cards, and bashing ATi for cheating.
IMHO no one in this case is "pure and clean".
Nvidia tries to cheat in order to make NV3x look better in DX9-intensive benches.
FM wants money (pay us and we'll optimise...) - which is OK - they have families, and no one HAS TO pay; everyone has a choice.
ATi... tries to stay out of the spotlight; probably paid more to FM :)
 
DaveBaumann said:
Hardware membership does not have to cost hundreds of thousands.

How do you know that the hundreds of thousands would be ONLY for membership?

Sure, NV is more "guilty", but I just can't understand why most people are "proud" of ATi's answer and bash NV.

AFAIK people are proud of ATi because they answered the question quickly and never dodged the questions or tried to mislead the public.
 