David Kirk finally admits weak DX9 vs DX8 performance - GeFX

Dio said:
So at Editor's Day, 24-bit float isn't good enough, but for Dr. Kirk, more than 16-bit is overkill???
The claim is that 24-bit FP isn't enough for many non-color operations (texture addressing, for one), but is overkill for many color operations.
 
Chalnoth said:
Dio said:
So at Editor's Day, 24-bit float isn't good enough, but for Dr. Kirk, more than 16-bit is overkill???
The claim is that 24-bit FP isn't enough for many non-color operations (texture addressing, for one), but is overkill for many color operations.

And that's misleading too. FP24 is fine for all texture addressing (it's more than enough for 2kx2k, which really only requires about 15b of mantissa). Where FP24 might not be enough is if you write unstable shader code (i.e. if error amplification occurs). But, if you write such code, FP32 is not enough either. All it buys you is a few more instructions. I've yet to see any real shaders that show problems in FP24. Yes, I could write one, but I could write one that kills FP32 too. We've converted over shaders that are thousands of lines long, from Renderman (TM), and all of those are beautiful. Given that so far we haven't hit anything that overpowers FP24, I think it's safe to say that MS and ATI made a good decision.
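To put rough numbers on that claim, here's a quick back-of-the-envelope sketch in Python. It just rounds a coordinate to a given mantissa width (10 bits for FP16, 16 for FP24 assuming the commonly quoted R300 layout, 23 for FP32) and checks the addressing error on a 2k texture; exponent range and exact hardware rounding are ignored, so treat it as an illustration, not a hardware model.

import math

def quantize(x, mantissa_bits):
    """Round x to a float that stores only mantissa_bits mantissa bits
    (exponent range and denormals ignored for this sketch)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                 # x = m * 2**e, |m| between 0.5 and 1
    scale = 2.0 ** (mantissa_bits + 1)   # +1 covers the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)

# A texture coordinate aimed between texels 1234 and 1235 of a 2k texture;
# bilinear filtering needs the fractional part to survive.
u = 1234.6 / 2048.0
for name, bits in (("FP16", 10), ("FP24", 16), ("FP32", 23)):
    uq = quantize(u, bits)
    print(f"{name}: u*2048 = {uq * 2048.0:.5f}"
          f" (error = {abs(uq - u) * 2048.0:.5f} texels)")

FP16 lands about 0.4 texels off (the bilinear weights are garbage), while FP24 is already within about 1/160 of a texel - consistent with "about 15b of mantissa" being enough for 2kx2k addressing.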
 
sireric said:
Given that so far we haven't hit anything that overpowers FP24, I think it's safe to say that MS and ATI made a good decision.

I fully agree. The FP24 decision most certainly was an excellent one - I'd personally prefer an FP24/FP48 mix with very fast FP24 and very slow FP48 over the FP16/FP32 mix NVIDIA is using.

I think the ability to do FP48, even if it takes many cycles, maybe as many as 4-5, would be an excellent technical and marketing advantage: workstation customers couldn't complain about NV having some kind of precision advantage, and marketing could claim "my **** is bigger than yours".

FP32 and FP64 or FP16/FP32/FP64 would be overkill IMO though.
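For what it's worth, a multi-cycle wide format is an old trick in software too: "float-float" arithmetic keeps one high-precision number as an unevaluated sum of two native-precision values, at a cost of several native operations per add or multiply - right in that 4-5x ballpark. A minimal sketch of the error-free addition step, using Python's double merely as a stand-in for whatever the native format would be:

def two_sum(a, b):
    # Knuth's error-free transformation: returns (s, e) such that
    # s + e == a + b exactly, where s is the rounded native-format sum
    # and e is the rounding error. Six native adds, no branches.
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e

# The low word keeps bits the native format throws away:
s, e = two_sum(1.0, 1e-20)
print(s)   # 1.0    -- the tiny addend is lost in a single native float...
print(e)   # 1e-20  -- ...but recovered exactly in the second word

A full float-float package also needs an error-free multiply (Dekker splitting, or a fused multiply-add if the hardware has one), which is where the per-operation cycle count really climbs.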


Uttar
 
sireric said:
Chalnoth said:
Dio said:
So at Editor's Day, 24-bit float isn't good enough, but for Dr. Kirk, more than 16-bit is overkill???
The claim is that 24-bit FP isn't enough for many non-color operations (texture addressing, for one), but is overkill for many color operations.

And that's misleading too. FP24 is fine for all texture addressing (it's more than enough for 2kx2k, which really only requires about 15b of mantissa). Where FP24 might not be enough is if you write unstable shader code (i.e. if error amplification occurs). But, if you write such code, FP32 is not enough either. All it buys you is a few more instructions. I've yet to see any real shaders that show problems in FP24. Yes, I could write one, but I could write one that kills FP32 too. We've converted over shaders that are thousands of lines long, from Renderman (TM), and all of those are beautiful. Given that so far we haven't hit anything that overpowers FP24, I think it's safe to say that MS and ATI made a good decision.

Question for you ATI guys:

What kind of data type is 24 bits wide, though (i.e., how do you declare a 24-bit variable)? Wouldn't it be more efficient to store things in full 32 bits simply because of the architecture of the CPU/OS? Or, since we are dealing with the 3D hardware, do all 'float' types get converted down to 24 bits?

-M
 
I'm definitely not one of the ATI guys, but the idea generally is that the CPU never has to touch those values directly, so it does not matter...
Unless you're saving a float image to disc or some such thing.
 
Mr. Blue said:
Question for you ATI guys:

What kind of data type is 24 bits wide, though (i.e., how do you declare a 24-bit variable)? Wouldn't it be more efficient to store things in full 32 bits simply because of the architecture of the CPU/OS? Or, since we are dealing with the 3D hardware, do all 'float' types get converted down to 24 bits?

-M

The CPU/OS has nothing to do with the shaders. You write your shaders, declaring temps and interpolants -- You just declare them as you would declare any variable in any language. There's no specification as to the width anywhere (except the _pp option). I don't really understand your question -- There's no conversion from 32b to 24b for "float" types. For that matter, there's no "float" type.
 
sireric said:
The CPU/OS has nothing to do with the shaders. You write your shaders, declaring temps and interpolants -- You just declare them as you would declare any variable in any language. There's no specification as to the width anywhere (except the _pp option). I don't really understand your question -- There's no conversion from 32b to 24b for "float" types. For that matter, there's no "float" type.

When I'm writing a shader, I declare variables as I normally would in C. However, these variables get stored in registers, correct? If that's the case, where does FP24 come into play?

-M
 
Well, whenever writing to its own video memory, the R3xx always does write 32-bit floats (when a high precision FP target is selected).
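And that conversion is lossless: any FP24 value (16 stored mantissa bits plus the implicit leading 1, assuming the usual layout) fits exactly within FP32's 24 significant bits. A quick Python check, packing through a real 32-bit float:

import math, struct

# A value with exactly 17 significant bits, i.e. something an FP24
# register (16 stored mantissa bits + implicit 1) could hold:
u24 = math.ldexp(79015, -17)   # 79015 / 2**17, about 0.6028

# Round-trip through an actual 32-bit float:
u32 = struct.unpack("<f", struct.pack("<f", u24))[0]
print(u32 == u24)              # True -- widening FP24 to FP32 loses nothing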
 
Anyway, the simple question that needs to be answered to decide whether or not FP24 support was really a good idea is how transistor-efficient it is. We do know that a simple multiplier will take up about twice as much space in FP32 as in FP24 (multiplier area grows roughly with the square of the operand width), and most everything else will scale linearly with the increase in precision, but what does that mean in terms of absolute transistor counts? What percentage of the entire core is used by the multipliers?

We just don't have this information now, and there are far, far too many differences between the NV3x and R3xx to single out the choice for FP24 vs. FP32 as the reason for the performance disparity.
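That factor of two for the multipliers, at least, can be sanity-checked from the mantissa widths alone - a back-of-the-envelope calculation in Python:

# Significant bits = stored mantissa bits + the implicit leading 1.
fp24_bits = 16 + 1   # 17
fp32_bits = 23 + 1   # 24

# An array multiplier's area grows roughly with the square of the
# operand width; adders, registers and wiring grow roughly linearly.
print((fp32_bits / fp24_bits) ** 2)   # ~1.99 -> FP32 multiplier ~2x FP24's
print(fp32_bits / fp24_bits)          # ~1.41 -> the linear parts, ~40% bigger

But as said above, without knowing what fraction of the core the multipliers occupy, the ratio alone settles nothing.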
 
Re: David Kirk finally admits weak DX9 vs DX8 performance -

g__day said:
http://www.guru3d.com/article/article/91/2

"Through a great talk given by Chief Technology Scientist, David Kirk, NVIDIA basically claims that if 16-bit precision is good enough for Pixar and Industrial Light and Magic, for use in their movie effects, it's good enough for NVIDIA. There's not much use for 32-bit precision PS 2.0 at this time due to limitations in fabrication and cost, and most notably, games don't require it all that often. The design trade-off is that they made the GeForce FX optimized for mostly FP16. It can do FP32, when needed, but it won't perform very well. Mr. Kirk showed a slide illustrating that with the GeForce FX architecture, its DX9 components have roughly half the processing power of its DX8 components (16 GFLOPS as opposed to 32 GFLOPS, respectively). I know I'm simplifying, but he did point it out, very carefully. "

* * *

Finally they state publicly what the best sites figured out over 8 months ago :). I applaud them for coming clean - even though David Kirk didn't reflect on how they launched the NV FX series as the Dawn of 32-bit processing because 16-bit wasn't enough - and now they're publicly saying it is not realistically available and no one uses it...

except ATi can do it (FP32) on 0.15 micron technology for 9 out of 11 stages of their graphics pipeline, as Dave has informed us :).

I agree that it is time nVidia began to recognize the longer-term benefits of making veracity a policy within the company's PR department. It's quite a luxury in PR never having to worry about whether the next thing you say will contradict the last thing you said, and never having to keep a written record of the yarns you've spun so as to keep your stories straight. Honesty can indeed be a paying policy for the company smart enough to recognize its value. If you are both honest and smart, there is a beneficial way to handle everything. Unfortunately, that combination of virtues seems rather rare at nVidia, where most of its spokesmen seem to have one but lack the other.

I would like to see them do better than a 50-50 mix of veracity and fiction, however...:) While I agree that this certainly seems like progress, I'm going to wait and see if this develops into a trend, or whether even this much veracity is merely an anomaly tailored to a specific audience which Kirk felt was 50% less likely to be fooled than his other audiences.

Quite truthfully, the most honest and straightforward remarks I have ever seen nVidia make publicly are the remarks they make in sit-down meetings with financial analysts representing large institutional brokerages. I recall last year when nVidia was telling the Internet press nV30 would ship in Jan or Feb, they told Merrill Lynch analysts at the same time that it would be April or May, most likely. The legal penalty for misrepresentation to Internet technology journalists is, of course, not the same as it is for misrepresenting critical information to the financial community...:)

To that end, the two things I remember most from nVidia's '02 nV30 launch are as follows:

(1) 96 bits of color precision in the fp pipeline is "not enough" for professional work

(2) It isn't time yet for a 256-bit wide memory bus, and nVidia will decide when the time arrives (meanwhile, I suppose, he was suggesting that we should maybe simply ignore ATi's because it was early?)

Anyway, you can see how his nV30 statement directly contradicts what he is paraphrased as saying here relating to nV3x fp16 and his imagined scenario in which ILM rendering artists are using GFFX 5950's running the Forcenators ("You may not pick it, but we force it on you anyway") at fp16 to generate the upcoming Terminator 4 special effects.

Oh, he wasn't actually implying that ILM used nVidia hardware and software to do its professional rendering work? Hmm...I wonder why. Might it be because the hardware and software used by ILM in its work are so far removed, in function and price, from anything nVidia sells that any such comparison would be ridiculous?

And gosh, I wonder why fp32 was the gold standard for professional rendering work "used throughout Hollywood" last year at the "Dawn of Cinematic Computing" debut of nV30, according to Kirk and everybody else at nVidia during the debut and after - long after. What's happened? Wow, now Kirk's letting us all know that *everything has changed*: suddenly fp32 is really "too much", and so is fp24, and even ILM knows that nVidia was right all along about what it *never* said last year about fp16.

Heh...:) I cannot believe the man can say the things he says with a straight face. And I got all of this out of a single paragraph! But, he sure makes life colorful....:D
 
Re: David Kirk finally admits weak DX9 vs DX8 performance -

WaltC said:
Heh...:) I cannot believe the man can say the things he says with a straight face. And I got all of this out of a single paragraph! But, he sure makes life colorful....:D

I've got a better quote from the same good ole David Kirk.
It dates from before the NV30 launch, originally shown at the following URL:
http://www.nvidia.com/content/areyouready/story.html

The whole page is full of marketing BS, but the following quote is particularly telling:
"[...] I don't think I have to even say it, but of course, this new GPU from NVIDIA will be the fastest GPU available at any price," he says. How much faster? "Hmmm, I won't let any numbers drop yet, but let me just say that I feel that no one will be disappointed."

The man's a genius, trust me on that one ;)


Uttar

P.S.: ULE has "taped-out" - okay, so I know it isn't the best term for an editorial, but expect to have it online in the next 4-5 days, maximum, if everything goes as expected. Some more rereading and editing is still required.
 
Re: David Kirk finally admits weak DX9 vs DX8 performance -

Uttar said:
P.S.: ULE has "taped-out" - okay, so I know it isn't the best term for an editorial, but expect to have it online in the next 4-5 days, maximum, if everything goes as expected. Some more rereading and editing is still required.

So what's it gonna look like? Any sample text? What's its basic layout?
 
Re: David Kirk finally admits weak DX9 vs DX8 performance -

nonamer said:
Uttar said:
P.S.: ULE has "taped-out" - okay, so I know it isn't the best term for an editorial, but expect to have it online in the next 4-5 days, maximum, if everything goes as expected. Some more rereading and editing is still required.

So what's it gonna look like? Any sample text? What's its basic layout?

25000 characters, excluding spaces.
5000 words.
10.5 pages.
8 parts/paragraphs/chapters/whatever.

As for the content, I'll just say it's NOT technical.


Uttar
 
DaveBaumann said:
5000 words.

Phhht! My 5700 preview is already twice that!! ;)

:p
Yeah, I know, it's no HUGE thing. But hopefully people will learn more from it than from your FX5700 so-called "preview" ;)
Actually, I'm not too sure about that, but eh ;)

I could have written about a LOT of other stuff that I had actually considered talking about in it. But two things prevented me from doing so:
1. It would have made my main points sound minor, while they are really hundreds of times more important than the rest IMO.
2. It would have reduced overall quality, made editing take even longer, and resulted in a crappy thing getting out in a month or so.

Editing slowed down the process a lot, as Dilbert should have taught you all. But then again, it was really worth it! :)


Uttar
 
Tim Sweeney, Epic Games
For games shipping in 2003-2004, the multiple precision modes make some sense. They were a stopgap to allow apps to scale from DirectX8 to DirectX9 while dealing with precision limits.

Long-term (looking out 12+ months), everything's got to be 32-bit IEEE floating point. With the third generation Unreal technology, we expect to require 32-bit IEEE everywhere, and any hardware that doesn't support that will either suffer major quality loss or won't work at all.


Tim and the good Doctor should get on the same page ;)

http://www.beyond3d.com/interviews/ps_precision/
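Sweeney's "major quality loss" is easy to quantify with a simple mantissa-width model (range limits ignored): the gap between adjacent representable values grows with magnitude, and a 10-bit mantissa gets coarse fast for anything position-like. A rough Python sketch:

import math

def spacing(x, mantissa_bits):
    # Gap between adjacent floats with the given mantissa width, at x.
    _, e = math.frexp(x)
    return math.ldexp(1.0, e - 1 - mantissa_bits)

# Gap between representable values near a world coordinate of 50,000 units:
for name, bits in (("FP16", 10), ("FP24", 16), ("FP32", 23)):
    print(f"{name}: spacing at 50000 = {spacing(50000.0, bits)}")

FP16 can only step in increments of 32 units out there (and tops out entirely just past 65504), FP24 steps by 0.5, and FP32 by about 0.004 - which is the gist of the "suffer major quality loss or won't work at all" line.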
 
Re: David Kirk finally admits weak DX9 vs DX8 performance -

Uttar said:
I've got a better quote from the same good ole David Kirk.
It dates from before the NV30 launch, originally shown at the following URL:
http://www.nvidia.com/content/areyouready/story.html

The whole page is full of marketing BS, but the following quote is particularly telling:
"[...] I don't think I have to even say it, but of course, this new GPU from NVIDIA will be the fastest GPU available at any price," he says. How much faster? "Hmmm, I won't let any numbers drop yet, but let me just say that I feel that no one will be disappointed."

The man's a genius, trust me on that one ;)

Uttar

This is particularly amusing in light of the recent developer event Nvidia held:

Some folks in the industry ask: "Is the games industry ready for this level of advancement?" Kirk says they are. "The movie industry has taught us that 128-bit precision and incredible levels of programmability are absolutely necessary for cinematic-quality special effects. Half measures like 96-bit precision are inadequate for many critical functions, such as texture addressing, geometric calculations like reflection and shadows, and so on," says Kirk.
 
You guys make it sound like Microsoft settled on 24 bits before any line of HDL was written; I have a hard time believing that ... Want to mention some timeframes for when ATI "and" Microsoft settled on 24-bit?
 