FP16 and market support

Simon F · Dec 21, 2003

YeuEmMaiMai said:
FP16 was definitely a bad decision, as it took away resources that could have been used to give NV30 better FP32 instruction execution speed.

I have a sneaking suspicion that the FP32 maths is not the limiting factor. Consider the way the NV chip behaves as you increase the number of temporary registers used in the PS program, i.e. how it slows down. This seems to indicate that there is some bottleneck associated with internal storage. Using FP16 obviously doubles the effective number of registers and would therefore lead to a big improvement in performance. It wouldn't surprise me if the FP16 operations also go through the FP32 hardware and merely get converted to FP16 at the very last stage.
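
To put rough numbers on the storage argument, here is a minimal sketch. It is not a description of NV30's actual register file: the byte budget is a made-up figure and the helper name is hypothetical. It only shows that, for a fixed amount of on-chip storage, halving the component size doubles how many pixels (or registers per pixel) fit.

```python
# Toy model of a fixed-size register file shared by the pixels in flight.
# BUDGET_BYTES is a hypothetical number, not a real NV30 figure.
COMPONENTS = 4        # each temporary register holds x, y, z, w
FP32_BYTES = 4
FP16_BYTES = 2
BUDGET_BYTES = 4096   # assumed storage budget, for illustration only


def pixels_in_flight(budget_bytes, temp_registers, bytes_per_component):
    """How many pixels fit if each needs `temp_registers` temporaries."""
    bytes_per_pixel = temp_registers * COMPONENTS * bytes_per_component
    return budget_bytes // bytes_per_pixel


for temps in (2, 4, 8):
    fp32 = pixels_in_flight(BUDGET_BYTES, temps, FP32_BYTES)
    fp16 = pixels_in_flight(BUDGET_BYTES, temps, FP16_BYTES)
    print(f"{temps} temps: FP32 -> {fp32} pixels, FP16 -> {fp16} pixels")
```

Under this model, every extra temporary register reduces the number of pixels that can be kept in flight, which would be consistent with the slowdown described above if storage really is the bottleneck.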

Simon F · Dec 21, 2003

Chalnoth said:
akira888 said:

What I don't understand is, if your entire pipeline is running at FP32 precision, why would you ever want to run at anything lower?

Modern x86 processors do all non-SSE FP operations in 80-bit precision.

And what a damned nuisance it can be too. In the past I spent days looking for what I thought was a compiler bug because some numerical code ran perfectly on N different processors (with or without optimisations) yet failed on x86 when optimisation was enabled. Turned out to be the 80-bit precision failing an "equality" test.

Dio · Dec 21, 2003

Interesting point. I personally consider that in float math, A == B actually has a slightly subtler meaning, which is that A generally only equals B if they were calculated using identical codepaths.

This has come in useful on a few occasions
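
A small sketch of that point, in Python rather than x87 code: it does not reproduce the 80-bit behaviour discussed above, it only shows two mathematically equal expressions that take different code paths and so differ in the last bit. The variable names are just for illustration.

```python
import math

# The same sum evaluated along two different code paths.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Identical code path: exact equality holds.
same_path = (0.1 + 0.2) + 0.3

print(left_to_right == right_to_left)              # different paths
print(left_to_right == same_path)                  # identical paths
print(math.isclose(left_to_right, right_to_left))  # tolerance-based test
```

If equality across different code paths is really what is wanted, a tolerance-based comparison such as `math.isclose` is the usual way around it.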

X2 · Dec 21, 2003

Chalnoth said:
For example, the R3xx architecture is capable of executing one texture and one FP op per clock (well, sometimes more, depending on the operations....like it can, apparently, execute a separate multiply and add in the same clock, not just a MAD). So, an optimal shader would be written with alternating math and texture operations.

The operations don't have to be alternating. But you should make sure that you have a roughly equal number of math and tex ops (depending on the filtering quality) per phase/dependent read level.
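
A toy model of why the balance matters more than the ordering. This is a sketch under a strong assumption (one texture op and one math op can issue together each clock, in any order within a phase, with latency and filtering cost ignored), not a description of how R3xx actually schedules work. The function names are hypothetical.

```python
def phase_cycles(tex_ops, math_ops):
    """Cycles for one phase if one tex op and one math op co-issue per clock."""
    # Ordering inside the phase is assumed not to matter, so the busier
    # unit alone determines how long the phase takes.
    return max(tex_ops, math_ops)


def shader_cycles(phases):
    """Total cycles for a shader given (tex_ops, math_ops) per phase."""
    return sum(phase_cycles(tex, math_) for tex, math_ in phases)


# Same total work (8 tex ops, 8 math ops), split differently across phases.
balanced = [(4, 4), (4, 4)]
lopsided = [(8, 0), (0, 8)]

print(shader_cycles(balanced), shader_cycles(lopsided))
```

In this model the lopsided split takes twice as long, because each phase leaves one of the two units idle.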

Dio · Dec 21, 2003

Nick said:
Thanks for correcting that, I really never heard of it...

We've never really felt the need to shout about it. It just seems a natural part of the compilation process to optimise the output code to me.

OpenGL guy said:
Nick said:

Anyway, trying to be objective, could it be possible that ATI's internal instruction set is very close to ps 2.0, so the optimizing compiler just corrects 'stupidities' of the programmer? For example it could reorder instructions to break dependencies, or detect when fewer registers can be used, or when a copy is not required? In that case, I can see the need for it, but it would still be unfair compared to the advanced optimization compiler Nvidia had to write and which will never be truly optimal.

It does far more than that. As I said, we're working to improve it all the time. *poke Dio*

The R300 architecture doesn't need lots of advanced optimisations, but it does still benefit from them. There are definitely a few leaps coming (and I now might go as far as to say 'soon').

The biggest advantage we have is that we don't have to juggle umpteen different variables when deciding how to optimise a program. We just have one or two things and we know that if we minimise or maximise those we will get best performance.

indio · Dec 21, 2003

radar1200gs said:
Of course fully supporting all the features of something you are trying to test is never a bad idea either (partial precision IS a part of DX9).

I agree completely! EVERY DX9 feature should be used!
Maybe they should add some high-dynamic range lighting, considering it's part of DX9. FM can just subtract points for lacking a feature. I don't think you wanna go down that road with an FX card.

YeuEmMaiMai · Dec 21, 2003

chalnoth,

You just fail to see the point now do you.....

1. The GFFX is an excellent card for DX8.X, no doubt about it. No one here will deny it.

2. ATi built the R300 with DX9 as its PRIMARY focus.

With R300, any other version of DX was an afterthought. On the other hand, Nvidia's intentions with the NV30 are QUITE clear: it was built for superior DX8 performance and just to get their foot in the door with DX9 features. The NV30 had INT units, whereas R300 does not and relies exclusively on FP24 units to do all of the work.

3. Nvidia has yet to correct the performance issues with the NV3x series; instead they have to resort to a lower precision in order to try and maintain competitiveness with ATI.

When I buy a card I want the best possible gaming experience for the money spent. Having most games work right out of the box with my 9500Pro is what makes me happy. Hearing developers sell out to Nvidia and having features that only work on their cards because they cannot follow the DX9 spec is bad (why? Because Nvidia is spending their cash on buying out people instead of fixing their hardware).

Are you really happy that Nvidia is trying their best to split the industry that is supposed to be UNIFIED under OGL or DX?

DX and OGL are a GODSEND because they allow for better games and FORCE the hardware vendors to play on a level field.

Basic · Dec 21, 2003

OT:
Simon F / Dio:
I once wrote an arithmetic coding module (arithmetic coding == huffman++). The first version was just for experimenting with the algorithm. It looked clean when using floats, it worked perfectly, and rounding errors were no problem. But it wouldn't be cross-platform safe.

I had to replace the float calculations with "homemade floats" using integers to guarantee that it would work cross-platform. As I said, rounding errors were no problem, because compression and decompression used identical code paths, and the algorithm was self-correcting for rounding errors, as long as they were identical on the computer doing compression and the one doing decompression.
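
For anyone curious what integer-only interval coding looks like, here is a minimal sketch. It is not the module described above: it uses Python's unbounded integers instead of fixed-width "homemade floats" with renormalisation, and the frequency table `FREQ` is a hypothetical model that both sides are assumed to share. The point is only that every step is exact integer arithmetic, so encoder and decoder agree on any platform.

```python
# Hypothetical fixed symbol model, shared by encoder and decoder.
FREQ = {"a": 2, "b": 1, "c": 1}
TOTAL = sum(FREQ.values())

# Cumulative frequency at the start of each symbol's sub-interval.
CUM = {}
running = 0
for sym, count in FREQ.items():
    CUM[sym] = running
    running += count


def encode(message):
    """Narrow [low, low + width) once per symbol, in units of 1/TOTAL**k."""
    low, width = 0, 1
    for sym in message:
        low = low * TOTAL + width * CUM[sym]
        width *= FREQ[sym]
    return low, len(message)


def decode(code, length):
    """Replay the same interval narrowing to recover the symbols."""
    low, width = 0, 1
    out = []
    for k in range(length):
        # Bring the candidate bounds up to the final denominator.
        scale = TOTAL ** (length - k - 1)
        for sym in FREQ:
            lo = low * TOTAL + width * CUM[sym]
            hi = lo + width * FREQ[sym]
            if lo * scale <= code < hi * scale:
                out.append(sym)
                low, width = lo, width * FREQ[sym]
                break
    return "".join(out)


code, length = encode("abac")
print(code, length, decode(code, length))
```

A practical coder would keep `low` and `width` in fixed-width integers and shift out settled bits as it goes; the unbounded version here just keeps the arithmetic easy to follow.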

3dcgi · Dec 21, 2003

Bouncing Zabaglione Bros. said:
Do you really think that Nvidia and ATI invested tens of millions in designing new, cutting edge DX9 parts without knowing far in advance what the major parts of the DX9 spec would be? Don't you think they had an 18-month heads-up on the rest of us? Do you think either of these companies would spend huge sums of money on this kind of R&D with Microsoft telling them "we have a spec, but we won't tell you what it is for a couple of years"?

I think you're giving Microsoft too much credit. Microsoft doesn't say "we won't tell you the spec"; they likely just don't make up their minds that quickly. ATI and Nvidia start their designs with an idea of what the API will look like and they make adjustments later in the design process, as necessary. The key issue with DX9 was likely whether to go with a mandatory dual precision model or to only require a single minimum precision.

see colon · Dec 21, 2003

I agree completely! EVERY DX9 feature should be used!
Maybe they should add some high-dynamic range lighting, considering it's part of DX9. FM can just subtract points for lacking a feature. I don't think you wanna go down that road with an FX card.

a more likely scenario would be for FM to add a feature test (i.e. one that does not affect the 3dmark score) like they did with 2001's advanced shader test. they could either add a totally new shader test that tests partial and full precision speeds, or just make a pp version of the existing ps2 test.
c:

ET · Dec 21, 2003

StealthHawk said:
radar1200gs said:

The XGI/Volari solution is a complete and utter joke.

What, pray tell, makes DeltaChrome less of a joke?

Well, it's not out yet, and we can't very well joke about a solution that isn't available yet, can we?

Once it arrives, we'll be able to joke about it like we laughed at the Volari and the NVIDIA cards (FX 5200, hahahaha!).

radar1200gs · Dec 21, 2003

indio said:
radar1200gs said:

Of course fully supporting all the features of something you are trying to test is never a bad idea either (partial precision IS a part of DX9).

I agree completely! EVERY DX9 feature should be used!
Maybe they should add some high-dynamic range lighting, considering it's part of DX9. FM can just subtract points for lacking a feature. I don't think you wanna go down that road with an FX card.

What's wrong with that road? I happen to agree with you. Sure, nVidia will lose some tests at first.

At the moment 3DMark03 ought to be renamed ATiMark03...

FUDie · Dec 21, 2003

radar1200gs said:
At the moment 3DMark03 ought to be renamed ATiMark03...

Then you should rename 3DMark01 "nvidiaMark01" since it used only features that nvidia supported.

-FUDie

radar1200gs · Dec 21, 2003

Pixel shader V1.1 was the base for all pixel shaders in DX8 and universally supported.

The fact that ATi chose to concentrate on PS V1.4 at the expense of all other pre-existing versions is hardly nVidia's fault.

Althornin · Dec 21, 2003

radar1200gs said:
At the moment 3DMark03 ought to be renamed ATiMark03...

how do you figure?
because nVidia's cards suck at ALL PS 2.0?
Even if pp hint is used?

BRiT · Dec 21, 2003

radar1200gs said:
The fact that ATi chose to concentrate on PS V1.4 at the expense of all other pre-existing versions is hardly nVidia's fault.

Likewise, Nvidia concentrating on DX8.x or substandard-DX9 at the expense of standard-DX9 is hardly the fault of Microsoft or ATI or FutureMark.

Neeyik · Dec 21, 2003

FUDie said:
radar1200gs said:

At the moment 3DMark03 ought to be renamed ATiMark03...

Then you should rename 3DMark01 "nvidiaMark01" since it used only features that nvidia supported.

-FUDie

Actually, those particular parts of 3DMark2001 were written with a company other than NVIDIA in mind - hint: they didn't last long enough to display this...

Dave Baumann · Dec 21, 2003

radar1200gs said:
Pixel shader V1.1 was the base for all pixel shaders in DX8 and universally supported.

The fact that ATi chose to concentrate on PS V1.4 at the expense of all other pre-existing versions is hardly nVidia's fault.

Errr, I think his point is you can't have it both ways. Virtually exactly the same thing that has happened with 3DMark03 previously happened with 3DMark2001, except with the shoe on the other IHV's foot.

Your saying "The fact that ATi chose to concentrate on PS V1.4 at the expense of all other pre-existing versions is hardly nVidia's fault" is exactly the same as saying "the fact that NVIDIA chose to concentrate on PS/VS2.0 extended at the expense of all pre-existing PS/VS2.0 versions is hardly ATI's fault".

radar1200gs · Dec 21, 2003

Except that PP (that "substandard" part of DX9) is part of the DX9 spec.

Dave Baumann · Dec 22, 2003

It's an optional part of the specification. The only thing that isn't optional wrt precisions is having support for at least FP24 precision.
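
To give a feel for what the precision levels mean numerically, here is a small sketch. The mantissa widths are the commonly cited layouts and are an assumption here, not something stated in this thread: s10e5 for FP16, s16e7 for FP24, and s23e8 for FP32. The helper names are hypothetical. Python has no FP24 type, so only FP16 rounding is demonstrated directly, via the `struct` module's half-precision format.

```python
import struct

# Assumed mantissa widths (explicit bits): s10e5, s16e7, s23e8.
MANTISSA_BITS = {"FP16": 10, "FP24": 16, "FP32": 23}


def relative_step(fmt):
    """Spacing between 1.0 and the next representable value."""
    return 2.0 ** -MANTISSA_BITS[fmt]


def to_fp16(x):
    """Round a Python float to IEEE half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


for name in MANTISSA_BITS:
    print(name, relative_step(name))

# An increment smaller than half an FP16 step is lost when stored as FP16.
print(to_fp16(1.0 + 2.0 ** -12) == 1.0)
```

With these layouts, FP24 resolves steps 64 times finer than FP16, and FP32 is a further 128 times finer than FP24.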
