5700, 5600, 9600 'Pure' DX9 Performances

DemoCoder said:
DaveBaumann said:
Despite the improved PS2.0 performance they are still quite significantly behind ATI's "Pure" DX9 performance, and in the ShaderMark tests the new MS 2_a compiler target for the FX series isn't making much of a difference - this likely suggests that the compiler optimiser now in the 52.16 drivers is already getting close to the performance of the HLSL-compiled code in the first place, if not their optimal performance for shader assembly reordering. Despite the 5700 being a new chip, entirely designed and built after DX9 was finalised, they haven't altered the FX architecture at all to improve some of the missing areas - still no float buffer support and still no MRTs, etc.


I don't think you can draw that conclusion from the data. I've been consistently arguing in these forums over the past year for improved compiler technology, and that the instruction scheduling issues are non-trivial. It seems a majority of people subscribed to the view just months ago that there were no improvements left in NV3x drivers to increase PS2.0 speed that didn't involve hacks to IQ.
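For reference, the "2_a compiler" comparison above boils down to compiling the same HLSL source against the generic ps_2_0 target and against the FX-tuned ps_2_a target with Microsoft's fxc, then diffing the assembly listings - a minimal sketch, with hypothetical file and entry-point names:

  fxc /T ps_2_0 /E main /Fc shader_ps20.asm shader.hlsl
  fxc /T ps_2_a /E main /Fc shader_ps2a.asm shader.hlsl

The ps_2_a listing typically differs in instruction order and register usage; the point above is that the 52.16 driver's own optimiser already appears to reorder the plain ps_2_0 stream to a similar effect, which is why the 2_a output gains little.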

Sorry, but who cares what they'll show, let's say, 12 MONTHS later? :oops: Nobody gives a flying frog what NV MAYBE can do about the NV3x's shader performance issue almost one year after its release.

EDIT: months fixed + grammar
 
T2k said:
Sorry, but who cares what they'll show, let's say, 12 MONTHS later? :oops: Nobody gives a flying frog what NV MAYBE can do about the NV3x's shader performance issue almost one year after its release.

EDIT: months fixed + grammar

Sounds rather narrow-minded to me. Especially since there's no real (imo) DX9 game out on the market yet.
 
Bjorn said:
T2k said:
Sorry, but who cares what they'll show, let's say, 12 MONTHS later? :oops: Nobody gives a flying frog what NV MAYBE can do about the NV3x's shader performance issue almost one year after its release.

EDIT: months fixed + grammar

Sounds rather narrow-minded to me. Especially since there's no real (imo) DX9 game out on the market yet.

Sounds funny to me. :rolleyes: TR:AoD, Halo and other stuff vs. performance on the GFFX.
 
T2k said:
Sorry, but who cares what they'll show, let's say, 12 MONTHS later? :oops: Nobody gives a flying frog what NV MAYBE can do about the NV3x's shader performance issue almost one year after its release.

I'd imagine those who bought an FX card would disagree with you.
 
PaulS said:
T2k said:
Sorry, but who cares what they'll show, let's say, 12 MONTHS later? :oops: Nobody gives a flying frog what NV MAYBE can do about the NV3x's shader performance issue almost one year after its release.

I'd imagine those who bought an FX card would disagree with you.

Disagree? That's my point! They need drivers from the beginning, at PRESENT, not in the future!
 
T2k said:
Disagree? That's my point! They need drivers from the beginning, at PRESENT, not in the future!

That wasn't your point. Your point was that we should ignore all performance enhancements Nvidia manages to achieve because the NV3x architecture is one year old.

And the 5700 doesn't perform that badly in either Halo or TR:AoD (the same can't be said for the 5600, although it's OK in Halo).
Although neither of those is a DX9 game, imo.
 
WaltC said:
It certainly appears to me, though, that the ps2.0 shader performance improvement in the 5700 has much less to do with compiler tuning and much more to do with ripping out the integer units in the chip, if that is indeed the case.

It doesn't appear that way to me, since there are also large improvements on the 5600/5900 going from Det45 to Det52. What's more, these performance improvements are across the board - not just in benchmarks, but in a whole boatload of games (Halo, FFXI, etc.).

I'm waiting for a comprehensive set of comparisons (maybe Dave can run the tests with Det45 vs Det52), but it looks to me like it is legitimate. And it makes sense if you read their Unified Compiler Whitepaper where they show before and after instruction schedules.


I don't think anyone would suggest that improving compilers isn't a worthwhile endeavor, and had we discussed this two weeks after the launch of nV30 I'd have agreed there's a lot of room for improvement.

Well, I have been discussing it since the NV3x launch - like a broken record, I might add - yet after many, many shader-benchmark tests people were coming to the conclusion that there was nothing more to be done because of the NV3x architecture. (And of course, with the usual disclaimer that a technical discussion of NV3x architecture and optimization opportunities does not constitute a "defense" of NVidia or an anti-ATI position, for the benefit of fanboys.)


The problem with nVidia's approach to compiler optimization is that it has been entirely lopsided, held out as a panacea that will produce the desired results "given time," and more or less used as a marketing ploy to try and explain away very large performance deficits of nV3x relative to its competition, deficits pertinent to advanced API functionality and much less dependent on traditional factors like bandwidth, TMU's, etc.

I've never seen anyone hold out compiler optimization as an explanation for functionality missing from the API. But since optimization did produce good results at last, I'd say the folks championing it (Nvidia) were correct.

Don't try to turn this into an ATI vs NVidia thread - it's not. Even if NV3x shaders ran faster than ATI's, it doesn't make up for the AA, gamma, MRT, and other missing or inadequate features.

I'm not interested in the Det52's from the aspect of "well, I might want to buy a 5700 now". I am interested in them from the aspect of "What went wrong?"

It appears NVidia's problems are the result of trying to design a shader pipeline that is too flexible (resource sharing, extremely long program lengths, predicates, ddx/ddy, unlimited dependent textures, complex instruction set, pure stencil-fill mode, two TMUs, etc.). They spent transistors on complexity, which in itself led to poorer performance, and in doing so, the added complexity made it much harder for the drivers to translate DX9 instructions efficiently. Both these factors led to really crummy performance; now it appears the latter issue has been resolved, but we're still left with HW that is not up to snuff.
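To make one item on that list concrete, here's a hand-written two-instruction ps_2_0 sketch (not from any real shader) of a dependent texture read, where the result of one fetch feeds the coordinates of the next; NV3x allows arbitrarily deep chains of these, while plain ps_2_0 hardware like R3x0 stops at four levels:

  texld r0, t0, s0    // fetch a perturbation value from the first map
  texld r1, r0, s1    // dependent read: that value becomes the coordinates for the next fetch

Every such wrinkle the hardware supports is another case the driver's translator has to handle well.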


To illustrate the point consider how little ATi has talked about compiler optimization in the last year, and yet it's certain they're in no less need of optimized compilers than is nVidia or anybody else.

Not really. ATI's architecture seems much more straightforward and tailor-made for DX9 input. You don't have register limitations to deal with. You don't have multi-precision. You have clear rules for how to use the separate vector and scalar units. They still have to do translation and scheduling, but the issues aren't as complex. If you listen to Richard Huddy explain how to hand-craft shaders, you'll see that it's much simpler. Hand-crafting for NV3x is more difficult.
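As a hand-made illustration of the vector/scalar point (not taken from Huddy's material): on R3x0, an operation that only touches the colour components and an independent operation that only touches the fourth component are candidates to issue in the same cycle on the separate units, so a hand-crafted shader keeps the .xyz and .w work cleanly separable:

  mul r0.xyz, r0, r1    // colour modulate - runs on the vector (RGB) unit
  rcp r2.w, r3.w        // unrelated scalar work - can pair on the scalar unit

NV3x has no such simple rule of thumb, which is part of why hand-crafting (and automatic scheduling) for it is harder.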



This all seems very lopsided to me and it certainly appears as if you might be expecting compilers to work miracles. I think "decent improvement" is a reasonable expectation, but I also think that each successive attempt at squeezing performance out of optimization will be a matter of greatly diminishing returns.

That depends on what the performance bottlenecks are and what the driver is and is not doing currently. According to the Unified Compiler paper, they weren't doing much "pairing up" of instructions at all in previous Dets, which left functional units sitting idle. Moreover, they weren't re-allocating registers to balance them against other bottlenecks, which also leads to very bad results.
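A hand-made sketch of what the register side of that means (not taken from the whitepaper; declarations omitted): the same four texture fetches and three multiplies, first scheduled naively and then reordered so each result is consumed as soon as possible. On NV3x the number of temporaries live at once directly limits how many pixels the chip can keep in flight, so the second ordering should run noticeably better even though the instruction count is identical:

  // naive order: all four fetches up front, four temporaries live at once
  texld r0, t0, s0
  texld r1, t1, s1
  texld r2, t2, s2
  texld r3, t3, s3
  mul   r0, r0, r1
  mul   r0, r0, r2
  mul   r0, r0, r3

  // reordered: consume each result immediately, only two temporaries live
  texld r0, t0, s0
  texld r1, t1, s1
  mul   r0, r0, r1
  texld r1, t2, s2
  mul   r0, r0, r1
  texld r1, t3, s3
  mul   r0, r0, r1

Real scheduling also has to weigh texture fetch latency against register pressure, which is part of why this is non-trivial to do in a driver.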

Until David Kirk explains the NV3x architecture in full to us, we have no clue what other hidden bottlenecks are present. Expect miracles? No. Possible hefty improvements (10-20%)? It's possible.


I'm looking forward to the end of the nV3x story, myself...:) My sincerest prayer is that it is not merely continued with nV4x.

Well, if they can make their design work in conjunction with compilers on the NV4x, more power to them. Their overall approach to the NV3x shader pipeline is not necessarily wrong, since PS3.0 demands more flexible pipelines anyway. They need to fix the issues they currently have (allow more simultaneous live registers, add more full FP32 units, etc.).


The difference in this case is that nVidia's using the concept of compilers (old as the hills and twice as dusty) as a PR tool to try and frame the issue of its performance deficit in such a way as to have it appear less critical than it actually is. I'll bet you that if it was ATi behind, instead of nVidia, that you'd have heard scarcely a peep about compilers out of nVidia all year long.

People on this BBS were predicting issues with compilers with regard to NV3x way before NVidia started talking about it. I don't even recall NVidia mentioning, a year ago, that compilers would fix their PS2.0 problems. Almost all NVidia statements were saying "use PS1.4 instead, use lower precisions, use Cg, etc." They weren't saying "just you wait, we are going to deliver an optimizing compiler that will give a huge boost to PS2.0".
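For what it's worth, the "lower precision" advice amounts to this sort of thing at the HLSL level (a made-up fragment; the sampler and constant names are hypothetical): declaring intermediates as half lets the compiler mark those instructions as partial precision, which the FX can run on its cheaper FP16 path with half the register cost:

  sampler2D baseMap;     // hypothetical texture
  half4 lightColor;      // hypothetical constant

  half4 main(float2 uv : TEXCOORD0) : COLOR
  {
      half4 base = tex2D(baseMap, uv);         // texture fetch
      half3 lit  = base.rgb * lightColor.rgb;  // FP16 is plenty for colour math
      return half4(lit, base.a);               // emitted with partial-precision modifiers
  }

That works, but it pushes the problem onto developers, which is a different thing from the driver scheduling whatever full-precision PS2.0 code it is given.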

You are using the label PR as if it means "untrue" or "phony". There are plenty of bad NVidia PR statements, but this time, with respect to compilers, they are absolutely right.


The crux of the matter is that it is not for lack of a good compiler that nV3x suffers in comparison to R3x0, but to many more things which are more important and fundamental than the compiler but which nVidia can do nothing about at the present time. The compiler is what they can change presently and so that is what they talk about.

Well, the crux of the matter is, they were suffering for two reasons, and one half of the suffering has been eliminated. Software was a huge issue for the NV3x because of its architecture, and it is likely to be a large issue for all PS3.0 cards.


As far as I know, that has always been the case with 3d chip development. Nothing new to see here in that general regard.

It's far worse now due to the flexibility of modern cards. Not only do they have to make the fixed functions run fast, but they now have shaders which can utilize card resources in any order, rather than the fixed order that the old state-based pipeline implied.

(sheesh, my posts are getting like demalion now. Help, it's infectious!)
 
Bjorn said:
T2k said:
Disagree? That's my point! They need drivers from the beginning, at PRESENT, not in the future!

That wasn't your point. Your point was that we should ignore all performance enhancements Nvidia manages to achieve because the NV3x architecture is one year old.

No. That's your misinterpretation. Same as the following one:

And the 5700 doesn't perform that badly in either Halo or TR:AoD (the same can't be said for the 5600, although it's OK in Halo).
Although neither of those is a DX9 game, imo.

Aren't they? Interesting.
So, enlighten me: what makes a game DX9 'qualified' if the presence of PS2.0 does not?
:rolleyes:
 
DemoCoder said:
Until David Kirk explains the NV3x architecture in full to us, we have no clue what other hidden bottlenecks are present.

C'mon, DC, you know as well as I do: he's never gonna do it.

DemoCoder said:
Expect miracles? No. Possible hefty improvements (10-20%)? It's possible.

Do you think it's hefty when ATi sometimes has an 80-100% advantage?
 
T2k said:
C'mon, DC, you know as well as I do: he's never gonna do it.
Well, there is a rumor they are going to do it soon. Since it helps developers, and since NV3x is almost at the end of its lifespan, why not?

DemoCoder said:
Expect miracles? No. Possible hefty improvements (10-20%)? It's possible.

Do you think it's hefty when ATi sometimes has an 80-100% advantage?

Yes, a 20% performance improvement is "hefty". Why are you trying to turn an academic thread on the nature of optimizing shaders into another ATI vs NVidia argument? I've already told you that this isn't about advocating NVidia cards, since the NV3x lacks too many other features.

This thread is about the nature of optimizing shaders and potentially future "complex" PS3.0 architectures, and "what went wrong" when they were designing the NV3x.

The relevance is that perhaps now they have learned a valuable lesson, gained valuable experience for PS3.0. Some of their experience in writing an optimizer for the NV3x will transfer over when they start to address a PS3.0 capable design.

Stop turning every discussion into a referendum on which video card is the most valuable to purchase.
 
DemoCoder said:
T2k said:
C'mon, DC, you know as well as I do: he's never gonna do it.
Well, there is a rumor they are going to do it soon. Since it helps developers, and since NV3x is almost at the end of its lifespan, why not?

OK, let's hope so.
:rolleyes:

DemoCoder said:
Expect miracles? No. Possible hefty improvements (10-20%)? It's possible.

Do you think it's hefty when ATi sometimes has an 80-100% advantage?

Yes, a 20% performance improvement is "hefty". Why are you trying to turn an academic thread on the nature of optimizing shaders into another ATI vs NVidia argument?

? :oops: Did you sleep enough? Who the hell wants to do that? I've pointed out why using 'hefty' is ridiculous on this subject.

BTW, 50-70% could be HEFTY, dear DC.

I've already told you that this isn't about advocating NVidia cards, since the NV3x lacks too many other features.

Did you check my nick? Are you writing to the right person? :oops:
:?:
I never said you were advocating anything of the sort... ehh.

This thread is about the nature of optimizing shaders and potentially future "complex" PS3.0 architectures, and "what went wrong" when they were designing the NV3x.

The relevance is that perhaps now they have learned a valuable lesson, gained valuable experience for PS3.0. Some of their experience in writing an optimizer for the NV3x will transfer over when they start to address a PS3.0 capable design.

Time will tell but I doubt it.

Stop turning every discussion into a referendum on which video card is the most valuable to purchase.

Pretty arrogant and absolutely NOT true on top of it, my friend.
Stop smoking that crap that makes you hallucinate things I never said and never did.

You've used 'hefty', which is 100% RIDICULOUS in this case, I think.
Even if you think you're soooo academic :rolleyes: :rolleyes: - it's still rather just blind and funny.
That's all, my friend.
[/console off]
 
Ah, thanks, I knew there was a reason why I wasn't coming here more often... The signal/noise ratio...

But thanks to everyone who posted feedback about the fp targets! :)
 
A 20% increase is a large improvement (and I pulled this number outta my ass; it looks like it is much larger). If I had a 20% return on my stock portfolio, I'd be exuberant. If my car were tuned up and gained 20% more performance, I'd think it was pretty remarkable. If I had a C compiler which could make my 2GHz CPU run like a 2.4GHz CPU, I'd think it was a pretty good deal.

It's really absurd to get into some BS semantic discussion over the true definition of "hefty". What would you call the Det52 improvements? Minuscule?


As to whether Nvidia has learned their lesson with regard to the need for good compilers and the need to fix their NV3x architecture, you think they haven't? So in your mind, when they release the NV4x, it will be a shallow rehash of NV3x pixel shader pipeline, same problems, and they will ship a non-optimizing driver for a whole year, to boot?
 
DemoCoder said:
A 20% increase is a large improvement (and I pulled this number outta my ass; it looks like it is much larger). If I had a 20% return on my stock portfolio, I'd be exuberant. If my car were tuned up and gained 20% more performance, I'd think it was pretty remarkable. If I had a C compiler which could make my 2GHz CPU run like a 2.4GHz CPU, I'd think it was a pretty good deal.

Aha. And wth does your compiler or your portfolio have to do with this? :rolleyes:
FYI: I said, in this case, on this subject.

It's really absurd to get into some BS semantic discussion over the true definition of "hefty". What would you call the Det52 improvements? Minuscule?

You know, sometimes people use quotes from each other. Then it becomes some sort of 'truth', 'cause XY said it over at B3D.
If you consider yourself a professional (as you are, I think), it's probably better to do a semantic check before posting.
Since I'm obviously 'less pro' compared to you, I shouldn't have to do that - and yet it happened the other way around.

That's all, I'm cool.

As to whether Nvidia has learned their lesson with regard to the need for good compilers and the need to fix their NV3x architecture, you think they haven't?

No, I never said that. I think it does make sense for the whole NEXT generation, the NV4x line and up, but nothing for the NV3x a year after its release.

So in your mind, when they release the NV4x, it will be a shallow rehash of NV3x pixel shader pipeline, same problems, and they will ship a non-optimizing driver for a whole year, to boot?

See my last words... ;)

PS: Peace, DC. You drew your sword for... nothing. :)

EDIT: bad grammar
 
T2k said:
As to whether Nvidia has learned their lesson with regard to the need for good compilers and the need to fix their NV3x architecture, you think they haven't?

No, I never said that. I think it does make sense for the whole NEXT generation, the NV4x line and up, but nothing for the NV3x a year after its release.

No one's putting words in your mouth. I said perhaps they learned a valuable lesson in working the bugs out of the NV3x and its driver, which might help future products. You said "I doubt it".
 
DemoCoder said:
T2k said:
As to whether Nvidia has learned their lesson with regard to the need for good compilers and the need to fix their NV3x architecture, you think they haven't?

No, I never said that. I think it does make sense for the whole NEXT generation, the NV4x line and up, but nothing for the NV3x a year after its release.

No one's putting words in your mouth. I said perhaps they learned a valuable lesson in working the bugs out of the NV3x and its driver, which might help future products. You said "I doubt it".

Actually, you wrote PS3.0 compiler and I (mis)interpreted that as being about the NV3x line, due to its architectural nature. Sorry about that.
 
20% is a hefty gain. I'd love to go from 50 to 60fps with merely a driver update.
 