FP16 and market support

Bouncing Zabaglione Bros. · Dec 31, 2003

radar1200gs said:
As I said before, you should see what nVidia's FP32/FP16 is truly capable of with NV40.

(yawn) "Wait for the next drivers/revision/hardware - then you'll see how my dad will beat your dad". We've only been hearing that from the Nvidia fan boys for the last 18 months. I guess that's one thing that hasn't changed.

radar1200gs said:
Shame about how NV3x ended up, but in the end nVidia were correct to just get on with NV40 rather than waste resources and effort trying to fix the unfixable. In the end NV3x is extremely competitive in everything bar DX9 and DX9 isn't exactly setting the gaming world on fire at present.

WTF? They made a whole range of cards and sold them to consumers. They launched, and then cancelled NV30 after making a few thousand chips. How is that not "wasting resources"? What justification have Nvidia got for cheating consumers and conning them into buying an "unfixable" card?

radar1200gs said:
In the end NV3x is extremely competitive in everything bar DX9 and DX9 isn't exactly setting the gaming world on fire at present.

Rubbish. NV3x can't even match the previous generation of Nvidia cards, let alone the current competition of ATI cards.

As for DX9, that's advancing fast, and even in DX8, it takes a top of the range card to get good results. I'm sure the people who just spent $500 on a top of the range DX9 card from Nvidia will be pleased to hear that it's "unfixable" and will need to be thrown away in a few months to get reasonable performance in DX9.

Jeeez, if there's one thing I hate, it's crossing wits with those that are unarmed.

radar1200gs · Dec 31, 2003

jimbob0i0 said:
radar1200gs said:

If the rumors that R420/R423 will support a higher precision are true and _PP support comes along with that higher precision (which I believe it will, since for R300 to easily support _PP it would have had to be FP12 which is below the minimum microsoft specify) then I will LMAO.

WTF????

Just where on earth have you pulled that one from?

The ATi guys have already said that if they go to FP32 it will be the same 'layout' as R300 - as in one precision through the pipeline, not multiple.... heck they already said that *if* MS had said FP32 was the minimum precision in DX9 that the R300 could have handled that with 'relatively minor' alterations.... die size would have gone up and hence the yield per wafer would decrease causing cost of production to rise slightly... but they could have/would have done it.... multiple precisions never were in the picture.

OKAY - given a hypothetical situation however where R4xx is a multiple precision chip why oh why would you suggest that they would use FP12 anyway? Wouldn't FP24/32 make more sense....???? Or *even* FP16/32 like NV????

BESIDES R300 has bugger all to do with the future of multiple precisions anyway it was, is and will remain FP24 and there is no way to change that - it is IN THE HARDWARE! Any and all _pp hints in place are ignored and everything is run at full precision.

Excuse me while I fall off my chair laughing at you.

The simplest way to add _PP support short of actually adding dedicated registers is to split (or combine) your current registers. The simplest and most effective way to do that is to split them in half. ATi could probably have gotten FP16 out of their FP24 registers but the complexity would likely have mostly negated the benefits.

I never said R300 had anything to do with multiple precisions - we all know it doesn't - I'm postulating the most likely reasons (complexity, thus transistor budget) that it doesn't/didn't.

DemoCoder · Dec 31, 2003

I think the most likely multiprecision path of the future will be FX16/FP32, since PS3.0 adds integer registers, and OpenGL 2.0 specs integers in GLSLang, and since it makes sense for loops and other operations.

jimbob0i0 · Dec 31, 2003

radar1200gs said:
The simplest way to add _PP support short of actually adding dedicated registers is to split (or combine) your current registers. The simplest and most effective way to do that is to split them in half. ATi could probably have gotten FP16 out of their FP24 registers but the complexity would likely have mostly negated the benefits.

I never said R300 had anything to do with multiple precisions - we all know it doesn't - I'm postulating the most likely reasons (complexity, thus transistor budget) that it doesn't/didn't.

So what has that to do with R4xx and FP12 as *you* brought up? The ATi guys have previously clarified the reasons for a single precision FP24 path - I am sure you are capable of viewing old posts yourself and that they do not want to go over the same old ground again.

Summarily (correct me if I am wrong Dave) the original DX9 spec that was worked to was single precision - just undecided on FP24/32 for a while. Designs were worked out until talks with IHVs/MS/devs decided on FP24 as the minimum precision. Thus R300 became FP24. AFTER this fact NV revealed their multiple precision setup and *lobbied* for this to be supported. DX9.0b (with some bug fixes as well) along with the PS2_0_x targets introduced the _pp hint to support this.

I fail to see how your previous posts correspond with this and how you can suggest the multiple precision stuff you have been doing for the past 20-odd pages.

nAo · Dec 31, 2003

Like Sony did with the PS2 VU0/VU1 almost 4 years ago (FP32 on vec4 registers and FX16 on scalar registers)
Except texture fetches I believe PS2 VS are quite capable of VS3.0 features..

ciao,
Marco

radar1200gs · Dec 31, 2003

jimbob0i0 said:
radar1200gs said:

The simplest way to add _PP support short of actually adding dedicated registers is to split (or combine) your current registers. The simplest and most effective way to do that is to split them in half. ATi could probably have gotten FP16 out of their FP24 registers but the complexity would likely have mostly negated the benefits.

I never said R300 had anything to do with multiple precisions - we all know it doesn't - I'm postulating the most likely reasons (complexity, thus transistor budget) that it doesn't/didn't.

So what has that to do with R4xx and FP12 as *you* brought up? The ATi guys have previously clarified the reasons for a single precision FP24 path - I am sure you are capable of viewing old posts yourself and that they do not want to go over the same old ground again.

Summarily (correct me if I am wrong Dave) the original DX9 spec that was worked to was single precision - just undecided on FP24/32 for a while. Designs were worked out until talks with IHVs/MS/devs decided on FP24 as the minimum precision. Thus R300 became FP24. AFTER this fact NV revealed their multiple precision setup and *lobbied* for this to be supported. DX9.0b (with some bug fixes as well) along with the PS2_0_x targets introduced the _pp hint to support this.

I fail to see how your previous posts correspond with this and how you can suggest the multiple precision stuff you have been doing for the past 20-odd pages.

Are you so thick that you need a picture drawn?

R300's pixel shader registers are 24 bits wide. 24/2 = 12

As I said above, they could attempt to get 16 bits out, but the complexity would not be worth it.

The minimum for partial precision is FP16, therefore FP12 plainly is inadequate.
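
The register arithmetic in this post can be sketched in a few lines of Python. This is only an illustration: the FP16 and FP32 layouts follow IEEE 754, while the FP24 split (1 sign, 7 exponent, 16 mantissa bits) is the commonly cited R300 layout and is an assumption here, as are all function names.

```python
# Bit layouts as (sign, exponent, mantissa) bit counts.
# FP16 and FP32 follow IEEE 754; the FP24 split is the commonly
# cited R300 layout and is an assumption, not taken from the thread.
FORMATS = {
    "FP16": (1, 5, 10),
    "FP24": (1, 7, 16),
    "FP32": (1, 8, 23),
}


def total_bits(name):
    """Total register width of a format in bits."""
    return sum(FORMATS[name])


def relative_step(name):
    """Spacing between adjacent representable values near 1.0."""
    return 2.0 ** -FORMATS[name][2]


def half_register_is_enough(register_bits, minimum_bits=16):
    """True if half a register meets the partial-precision minimum (FP16)."""
    return register_bits // 2 >= minimum_bits


if __name__ == "__main__":
    for name in FORMATS:
        print(name, total_bits(name), relative_step(name))
    # Halving a 24-bit register leaves 12 bits, below the 16-bit minimum.
    print(half_register_is_enough(total_bits("FP24")))
    # Halving a 32-bit register leaves exactly 16 bits.
    print(half_register_is_enough(total_bits("FP32")))
```

Under these assumptions, splitting a 32-bit register yields two FP16-sized halves, while splitting a 24-bit register yields two 12-bit halves, which is the point being argued.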

KimB · Dec 31, 2003

nAo said:
Like Sony did with the PS2 VU0/VU1 almost 4 years ago (FP32 on vec4 registers and FX16 on scalar registers)
Except texture fetches I believe PS2 VS are quite capable of VS3.0 features..

PS2 doesn't have vertex shaders. It has a vector processor. Big difference.

Dio · Dec 31, 2003

Chalnoth said:
Dio said:

So the point is, it's not just 'a few simple rules'.

It is. You just have to be aware of "creeping errors," and know how to spot them when they occur. I don't think this will be a problem for the majority of shaders.

OK, so I'm confused. First you said it was simple, so I said if it's simple it can be done automatically. Then you said it couldn't be done automatically, because it's hard to do, so I said that therefore it's not just simple. But now it's simple again. So why can't it be done automatically?

KimB · Dec 31, 2003

I said it was dangerous to do it automatically, and I meant it was dangerous to do it automatically all the time.

Basically, different precisions should be able to be selected properly and with no noticeable visual quality loss 99.99% of the time. But there are always corner cases that the developer would need to look out for more and more as shaders get longer.
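
A rough way to see the kind of "creeping error" described here is to emulate a long chain of additions at reduced precision. The sketch below rounds every intermediate result to IEEE half or single precision using Python's `struct` module; it is an emulation for illustration only, not a model of any particular GPU, and the helper names are made up for this example.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(step, count, round_fn):
    """Add `step` to a running total `count` times, rounding after each add."""
    total = round_fn(0.0)
    step = round_fn(step)
    for _ in range(count):
        total = round_fn(total + step)
    return total


def drift(step, count, round_fn):
    """Absolute difference between the rounded running sum and step * count."""
    return abs(accumulate(step, count, round_fn) - step * count)


if __name__ == "__main__":
    print("fp16 drift:", drift(0.1, 1000, to_fp16))
    print("fp32 drift:", drift(0.1, 1000, to_fp32))
```

A single low-precision add is barely distinguishable from the full-precision one, but the drift grows with the length of the chain, which is why longer shaders need more care.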

DemoCoder · Dec 31, 2003

jimbob0i0 said:
Summarily (correct me if I am wrong Dave) the original DX9 spec that was worked to was single precision - just undecided on FP24/32 for a while. Designs were worked out until talks with IHVs/MS/devs decided on FP24 as the minimum precision. Thus R300 became FP24.

Still can't let go of that fairy tale that IHVs finalize their HW only after the spec is worked out, eh? I guess ATI won't even write one line of code for beyond 3.0+ features until Microsoft hands down the DirectX Next spec and blesses it?

AFTER this fact NV revealed their multiple precision setup and *lobbied* for this to be supported. DX9.0b (with some bug fixes as well) along with the PS2_0_x targets introduced the _pp hint to support this.

It is standard practice for members of working groups or standards consortia to lobby to have their tech included in the standard. If the standards bodies or working groups didn't accept submissions once in a while, there would be no overriding reason for members to waste their time at meetings.

jimbob0i0 · Dec 31, 2003

radar1200gs said:
Are you so thick that you need a picture drawn?

R300's pixel shader registers are 24 bits wide. 24/2 = 12

As I said above, they could attempt to get 16 bits out, but the complexity would not be worth it.

The minimum for partial precision is FP16, therefore FP12 plainly is inadequate.

Given my background I would sincerely hope I was not as thick as you suggest....

Perhaps I have misunderstood your position and/or you have misunderstood mine....

I refer to your claim that:

If the rumors that R420/R423 will support a higher precision are true and _PP support comes along with that higher precision (which I believe it will, since for R300 to easily support _PP it would have had to be FP12 which is below the minimum microsoft specify) then I will LMAO. It will be hugely entertaining to see the fanboys explain how it all came about given FP24's superiority.

As I said before, you should see what nVidia's FP32/FP16 is truly capable of with NV40. Shame about how NV3x ended up, but in the end nVidia were correct to just get on with NV40 rather than waste resources and effort trying to fix the unfixable. In the end NV3x is extremely competitive in everything bar DX9 and DX9 isn't exactly setting the gaming world on fire at present.

on the previous page. Why did you bring up R4xx and multiple precision? Why did you mention R300 in the same part? Why did you mention FP12 in relation to these bits when you have admitted that it has nowt to do with anything since?

If you can make this clearer I will attempt to see things from your position. If you cannot then DaveB's comment subsequent to the post quoted above can only be thought of as *appropriate*

jimbob0i0 · Dec 31, 2003

DemoCoder said:
jimbob0i0 said:

Summarily (correct me if I am wrong Dave) the original DX9 spec that was worked to was single precision - just undecided on FP24/32 for a while. Designs were worked out until talks with IHVs/MS/devs decided on FP24 as the minimum precision. Thus R300 became FP24.

Still can't let go of that fairy tale that IHVs finalize their HW only after the spec is worked out, eh? I guess ATI won't even write one line of code for beyond 3.0+ features until Microsoft hands down the DirectX Next spec and blesses it?

AFTER this fact NV revealed their multiple precision setup and *lobbied* for this to be supported. DX9.0b (with some bug fixes as well) along with the PS2_0_x targets introduced the _pp hint to support this.

It is standard practice for members of working groups or standards consortia to lobby to have their tech included in the standard. If the standards bodies or working groups didn't accept submissions once in a while, there would be no overriding reason for members to waste their time at meetings.

Hey I never said that - you are putting words in my uh post 8)

The hardware AFAIK wasn't fully finalised and sent to the foundry for full production until signed off - thus it was possible till relatively late in the game to switch to FP32 instead of FP24 if that was really required - OpenGL guy, Dio - was that what your previous posts indicated?

As for 3.0+ I would expect that ATi R&D was talking to and working with MS and IHVs at the moment with determining how to move forward into DirectX NEXT and for *at least* the same level of input that went into DX9 goes into DX10.

As for the lobbying - it is important that all parties can contribute their thoughts to a standard, as all IHVs do to the DirectX standard, but once that standard is finalised is it too much to think that the parties will then follow the standard? As it was the standard was finalised and then _pp support added *after* the fact rather than during the usual standard lobbying/discussion process whilst creating the standard.

nAo · Dec 31, 2003

Chalnoth said:
PS2 doesn't have vertex shaders. It has a vector processor. Big difference.

Well..it does anything a VS2.0 can do and much more. (and it's blazing fast too)
If I wanted to port a standard DirectX 9 vertex shader to the PS2 I could do it by writing some macros. It wouldn't be optimal but it'll work.
The only concern would be the program length (1024 vector + 1024 scalar instructions).
So if one would ask me if PS2 supports VS I'd answer no: it's better than that

ciao,
Marco

Demirug · Dec 31, 2003

jimbob0i0 said:
Summarily (correct me if I am wrong Dave) the original DX9 spec that was worked to was single precision - just undecided on FP24/32 for a while. Designs were worked out until talks with IHVs/MS/devs decided on FP24 as the minimum precision. Thus R300 became FP24. AFTER this fact NV revealed their multiple precision setup and *lobbied* for this to be supported. DX9.0b (with some bug fixes as well) along with the PS2_0_x targets introduced the _pp hint to support this.

2.x and the _pp hint were in the original DX9 Spec. DX9.0b only contains bugfixes and a set of new MDX-Assemblies. If MS adds something to the spec they change the number of the DX-Version. If only the letter is changed it is only a new bugfixed runtime.

sonix666 · Dec 31, 2003

martrox said:
Jeez..... isn't anyone else getting tired of this crap? The ONLY reason that there is a FP16 is because one major player hasn't got the ability to run in FP24, and while it can run FP32, it hasn't the power to do so, PERIOD! M$ had no choice but to allow FP16 as a hint in order to keep this one manufacturer "in the game", so to speak. The reasons for the manufacturer's choice, on the other hand, were to try to keep others "out of the game", so to speak.

If all DX9 video cards were capable of running FP24 at decent speed, we wouldn't be having this discussion. If nVidia had followed M$'s DX9 specifications, rather than trying to do an end run around them, there would be no FP16. FP16 will only last as long as nVidia supports what is a badly conceived and executed family of video cards.

And no amount of apologist <bleep> FUD will change this.

Martrox, I will try to explain this for you:

The simple fact is that FP16 is part of DirectX 9. You can use it for textures, framebuffers, etc. to save on bandwidth for example when you don't need the precision of FP24/FP32. And you can use it in pixel shaders for calculations by using a precision hint, to speed up the pixel shader.

However, FP24 is the minimum precision that is required when a pixel shader operation doesn't include a precision hint.

You see? FP16 is simply part of DirectX 9, not only because nVidia is the only one supporting it. Microsoft has come up with it for other reasons than to satisfy nVidia.
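
The bandwidth argument for FP16 textures and framebuffers comes down to simple arithmetic, sketched below. The 1024x768 resolution and four-channel (RGBA) layout are assumptions picked for the example, and `buffer_bytes` is a hypothetical helper, not part of any API.

```python
def buffer_bytes(width, height, channels, bits_per_channel):
    """Size in bytes of an uncompressed buffer with the given layout."""
    return width * height * channels * bits_per_channel // 8


if __name__ == "__main__":
    # Assumed example: a 1024x768 RGBA surface.
    fp16 = buffer_bytes(1024, 768, 4, 16)
    fp32 = buffer_bytes(1024, 768, 4, 32)
    print("FP16 surface:", fp16, "bytes")
    print("FP32 surface:", fp32, "bytes")
    print("ratio:", fp32 / fp16)
```

Every read or write of the FP16 surface moves half as many bytes as the FP32 one, which is the saving being described when the extra precision is not needed.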

KimB · Dec 31, 2003

nAo said:
So if one would ask me if PS2 supports VS I'd answer no: it's better than that

It's more flexible. That doesn't mean it's better.

Edit:
Let's take a little bit closer look at this.

Sony claims that the VU's can transform at a rate of 66 million polys/sec. That's roughly the peak transform rate of a GeForce3. That's not too bad, but there are a number of problems:

1. The VU's are notoriously hard to get to work in parallel. This made it take a very long time for the performance of the PS2 to get up to par. While this problem has been essentially solved over the life of the PS2, it would not be acceptable for a PC graphics card.

2. By contrast, since a GPU is assumed to always be working on vertex data within the vertex shader unit, parallelism is trivial. As long as you know how to avoid other bottlenecks in the system by, for example, using vertex buffers and batching large numbers of triangles, it is relatively easy to obtain (comparatively) close to peak performance from the start.

3. Transistor space: as an example, the PS2 vertex units support a divide function. GPU's don't. This saves lots of transistors. Once again, dedicated hardware saves transistors, which in turn improves performance.

In essence, the PS2's CPU was optimized for handling vertex data, but didn't go as far as to make it into dedicated hardware. This means that the PS2 has less vertex performance than it possibly could, with the benefit of programmers not being limited by a subset of available functions. Since that vertex unit is used for graphics work the vast majority of the time, Sony would have been better off just making a dedicated vertex unit.

Hellbinder · Dec 31, 2003

Chalnoth said:
nAo said:

Like Sony did with the PS2 VU0/VU1 almost 4 years ago (FP32 on vec4 registers and FX16 on scalar registers)
Except texture fetches I believe PS2 VS are quite capable of VS3.0 features..

PS2 doesn't have vertex shaders. It has a vector processor. Big difference.

Actually it has 3 Vector Processors... hehehehe

Reading this thread is an exercise in Perseverance. whew... I think I need to lie down now...

Rolf N · Dec 31, 2003

Umm, PS2's VUs can also do PPP work, no?
I'd like to think that developers use them for much more than just plain old "vertex shading".

Hellbinder · Dec 31, 2003

Let me tell you guys a Story...

Nvidia "Dudes" sit down with John Carmack over coffee one day and have a nice little conversation about graphics, where they are going,,, what JC wants to do with his next engine... what would be really really Nifty to have in his next engine... which BTW is going to sell 149,999,999,99,9,9,9,9, Copies world wide and have spin off games and engine sales out the wazoo etc etc etc ...

After several Cups O' Joe a basic picture of what JC is really excited about starts to emerge...

1. I really want Floating point based shading just like in Toy Story..

2. I really, really want some way cool shadows dude...

(the above was roughly translated from nearly incomprehensible Techno banter)

Nvidia posse go off to the Can together and excitedly talk about how they could include all this and more in their next product and make a ****load of cash...

1. Toy Story uses FP16.. We can do that!!! Heck it will be easy!!

2. We will include a way to make shadows go like "Ultra fast"

3. Hell.. We'll even go crazy and toss on a little more whoupass with a little FP32 on the side...

4. #$%#@ we'll make a special Language called CG that will have everyone doing things the Nvidia way Woohooo We'll be Rich!!

(the Above Roughly translated from Greedy Green Sales Engineer Speak)

The Nvidia Guys Jump into their Volvos and Lexus and Scream off to NVHQ to start on their Rad plan for world domination...

*Meanwhile back at the ATi Ranch*...

Dudes.. Like What are we going to make our next hardware do and Stuff.. Hmm.. Lets Go Talk to our Good Buddies at Microsoft and see what they think...

After an all Day Gab session over several Cups of Seattle's Best Coffee.. A picture starts to emerge..

1. Like we all agree that we want pure FP all the way dude

2. Like wow that's a lot of Transistors.. Lets do FP24 for pixel shaders and FP32 for Vertex shaders because like that's plenty for the next couple years.. This will be the Worlds First Pure FP bionic API

3. This is Great that we Seem to be on exactly the same page.. We'll make a card that fits exactly what you want within a reasonable Degree for today..

(The Above Translated from very very hard to understand Microsoftease)

ATi dudes Jump into their SUV's and such and head off to create their Wonder FP Machine... Thinking about all the Cash they will make on the near to countless DirectX games that will be released...

Suddenly as everyone at Microsoft is going home for the day... Little Billy from Nvidia screams into the parking lot on his Huffy Special edition Dirt Bike. (the posse was too busy creating their CinFX to make the Trip themselves so they sent one of the extra Dell interns hanging out at Nvidia replacing a harddrive that day) He Excitedly Runs into the building to Convey to Microsoft the wonders of Their Exciting new ideas and how they got them from Uncle Carmack. Unfortunately only the Night Custodian is there so Billy shares Nvidia's Grand Vision with him.. (the custodian promises to convey the information in an Email first thing in the morning). Billy happily hops onto his Bike and races off in assured Victory....

A few weeks go by....

A few months Go by.....

The Nvidia CEO calls up the posse wondering how their Grand Scheme to control the world with DX and OpenGL via CG combined with their new Wonder Architecture is going.

The posse calls up Billy who assures them that everything is Fine the man with the mop and the Bucket told him so...

A Few months more go by...

Dx9 appears in alpha and beta form... and to the Nvidia posse's horror it looks like something designed by an alien race from planet X. They are so Horrified that they Jump Right into their Volvos and Lexus and Bum Rush off to Seattle.

After multiple Freak out sessions sadly, Nvidia's Posse are Turned away.. because it's too late.. DX is what it is.. Perhaps we can work a few things out later on a case-by-case basis. So sorry please come again..

Their Eyes actually briefly turning red the Posse eyeballs Poor Billy in the corner Snivelling and proceeds to Tear him limb from limb and leaves him bleeding in a ditch.. They get back to NVHQ and try to sneak in the back door.. But Unfortunately the CEO sees them...

Rumor is they are all still trying to get his foot out of their Collective asses...

But heck.. At least they Still have Their Good Buddy JC.

The Rest as they say.... Is History...

Dio · Dec 31, 2003

Hellbinder said:
ATi dudes Jump into their SUV's

SUV's? There tends to be more interest in sports cars.

Except for hiring them to go up to Tahoe.
