Deathlike2
Bob said: You may want to initialize 'b', in case its previous value was inf or nan.
Also, have you tried sequences like this on NV4x/G7x? You might be surprised...
Care to elaborate on it? Inquiring minds want to know.
Bob said: You may want to initialize 'b', in case its previous value was inf or nan.
True, if these values are supported at all.
Bob said: Also, have you tried sequences like this on NV4x/G7x? You might be surprised...
I would expect it to use predication, so there's neither performance gain nor loss. (Edit: a predicated texture read could actually result in substantial savings.)
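To make the pattern under discussion concrete, here is a hedged reconstruction in plain C standing in for shader code. Only the names DumbMul and incrediblyComplexFunc come from the thread (they appear in later posts in this excerpt); the signatures and bodies are assumptions for illustration.

```c
/* Hedged reconstruction of the pattern being debated; not the actual
 * shader from the thread. */
float incrediblyComplexFunc(float x) {
    /* stand-in body; imagine 50+ cycles of math and texture reads here */
    return x * x + 1.0f;
}

/* Straight-line version: incrediblyComplexFunc always executes. */
float DumbMul(float a, float b) {
    return a * incrediblyComplexFunc(b);
}

/* Branched version: the expensive call (and its texture reads) is
 * skipped when a == 0. Note Bob's caveat: returning 0 outright is only
 * equivalent if incrediblyComplexFunc can never return inf or nan,
 * since under IEEE 754, 0 * inf and 0 * nan are nan, not 0. */
float DumbMul_branched(float a, float b) {
    if (a != 0.0f)
        return a * incrediblyComplexFunc(b);
    return 0.0f;
}
```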
Deathlike2 said: Care to elaborate on it? Inquiring minds want to know.
Under true IEEE 754, Zero*Inf = NaN and Zero*NaN = NaN. Yes, you can get a result that is not Zero by multiplying Zero by some of the special numbers.
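A minimal host-side check of the IEEE 754 behaviour arjan describes, assuming a C99 compiler; GPU hardware of that era did not necessarily honour these rules, which is exactly the thread's point.

```c
#include <math.h>
#include <stdio.h>

int main(void) {
    float zero = 0.0f;
    printf("0 * inf = %f\n", zero * INFINITY); /* prints nan */
    printf("0 * nan = %f\n", zero * NAN);      /* prints nan */
    return 0;
}
```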
arjan de lumens said: Under true IEEE 754, Zero*Inf = NaN and Zero*NaN = NaN. Yes, you can get a result that is not Zero by multiplying Zero by some of the special numbers.
Pete said: I thought Bob works for NV, in which case I'd say he means NV40/G70 would benefit from such a conditional, too.
Well, if a is often zero in continuous areas of pixels, then NV40/G70 would benefit, too. Sweet! There's a conditional in my reply, too.
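A sketch of why spatially coherent zeros matter: SIMD GPUs of the NV40/G70 era evaluate branches over batches of pixels, and the branch body is skipped only when every pixel in the batch agrees. The batch size below is an assumption for illustration, not the actual hardware granularity.

```c
#include <stdbool.h>

#define BATCH_SIZE 64  /* illustrative; real granularity varied by GPU */

bool batch_skips_body(const float a[BATCH_SIZE]) {
    for (int i = 0; i < BATCH_SIZE; i++)
        if (a[i] != 0.0f)
            return false;  /* divergent batch: body runs (predicated) for all pixels */
    return true;           /* coherent batch: incrediblyComplexFunc skipped entirely */
}
```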
Ailuros said: Was NV4x a quantum leap in performance over NV3x because of added units and/or SM3.0 support, or because the latter didn't inherit the first's weaknesses? If I draw a parallel between R2xx and R3xx, I don't think the answer is that much different in the end.
Xmas said: [...] Although you could argue that if incrediblyComplexFunc takes 50+ cycles, 4 or so cycles for if/endif are actually not that significant any more.
Can the compiler, while compiling "DumbMul", find out the cycle-cost of incrediblyComplexFunc?
It could be that NVidia did extensive empirical shader analysis and ended up with a complexity threshold for a mul operand: beyond it, the probability of skipping the branch multiplied by the cost of the branch body exceeds the cost of the flow control instructions, so real branching wins. Below that threshold, they use predication to at least save on texture bandwidth.
Or they could do shader replacement if P(a==0) is known to be high...
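A hedged sketch of the heuristic just described. The 4-cycle if/endif figure is taken from Xmas's comment above; everything else (the names, the enum, the exact decision form) is an assumption for illustration, not NVIDIA's actual compiler logic.

```c
typedef enum { USE_PREDICATION, USE_BRANCH } MulStrategy;

MulStrategy choose_mul_strategy(float p_skip, float body_cycles) {
    const float flow_control_cycles = 4.0f;  /* if/endif cost cited in the thread */
    /* Branch when the expected saving from skipping the body exceeds the
     * fixed flow-control overhead; otherwise predicate, which still saves
     * texture bandwidth on pixels where the predicate is false. */
    if (p_skip * body_cycles > flow_control_cycles)
        return USE_BRANCH;
    return USE_PREDICATION;
}
```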
geo said: Umm, I dunno. There are weaknesses and weaknesses. Is a part "weak" because it is designed to be, or weak because it didn't work out so well? My understanding is that ATI made a business decision to go for leadership at the top end starting with R300. Shouldn't a part be judged on how well it meets --or doesn't-- the goals of its designers?
Ailuros said: ***edit: in what sense "leadership", anyway? ATI has for ages been the biggest player besides Intel (which I don't like counting, because calling an IGP a graphics accelerator is a bit of an oxymoron).
geo said: This is what I'm pointing at, a business decision to change their model from "competitive mainstream" to "big dog ready to tussle over any steak with the aim of winning":
Dave Orton said: However, the big thing that I feel, and I came in with this conviction, was that watching us trying to target the mainstream and win was a dying model, and so we said we are going to have to arc up. R300 was really the first part where we really opened up the thinking to what you can do to hit performance and hit schedule while relaxing on die size to an extent, and I think it helped ATI get back in the game.
And, no, I don't think NV planned for NV30 to be a disaster. I'm saying I don't think ATI planned for R200 to be the hottest cock on the walk. This is why the delta R200-->R300 isn't as indicative as your average NV generational delta, as NV has always wanted/planned to be Big Man on Campus.
Ailuros said: ATI most definitely did NOT design R200 to lose the battle either. Quite frankly, one of the reasons I don't like these kinds of theories is that way too many people connect the change in ATI's success with the ArtX acquisition. I refuse to believe to any extent that ArtX were the absolute masters who came to rescue ATI's day. Where there was a will there was a way, and yes, of course, in all fairness, the addition of more engineering talent and thus resources definitely helped the final outcome.
Ailuros said: I refuse to believe to any extent that ArtX were the absolute masters who came to rescue ATI's day.
Jawed said: Can the compiler, while compiling "DumbMul", find out the cycle-cost of incrediblyComplexFunc?
Sure it can. At least the one in the driver can; the one from Microsoft can't, as it doesn't know cycle costs.
I suspect not, as I imagine that shader compilation is always performed "piecemeal", where every function exists in a universe that consists solely of itself. Context-free compilation, as it were.
Xmas said: Sure it can. At least the one in the driver can; the one from Microsoft can't, as it doesn't know cycle costs.
Well, I imagine there's lots of fun to be had in testing that. I wonder if Tridam has poked around...
SugarCoat said: Except with Flipper, which was, after all, ATI's introduction into consoles. Nintendo seems very satisfied with their partnership with ATI, as does Microsoft.
The last desktop product led by Marlborough (whose office name is "ATI Research") was R200, and while they were the lead on Xenos,....