QuadroFx1000 ( NV30GL ) news and pic

DaveBaumann said:
What you mean to say is that potentially an alternative architecture may gain performance by specifying a lower bit rate, since R300's pixel shader processor rate is constant - it will always operate at 96 bits of precision per clock. However, you also have to be sure of what rate the alternative architecture actually executes 64/128-bit instructions at.

Of course. That's a good point, and you're probably referring in part to that 3 instructions/pipe/clock thing on the R300. Yes, it'll be interesting to see how that turns out.

In a similar architecture, 64-bit would probably not be two times faster than 96-bit. But IIRC (and I could be wrong on that), the R300 VS always works at 128-bit (unlike its PS, which works at 96-bit), so the GFFX's 64-bit mode could be very useful against the R300 in geometry-limited situations. Which are so rare it's not really so useful, ah well...


Uttar
 
Uttar said:
In a similar architecture, 64-bit would probably not be two times faster than 96-bit. But IIRC (and I could be wrong on that), the R300 VS always works at 128-bit (unlike its PS, which works at 96-bit), so the GFFX's 64-bit mode could be very useful against the R300 in geometry-limited situations. Which are so rare it's not really so useful, ah well...
There's no 'half float' mode in the VS; it's always 128-bit.
 
But the NV30 does have a "clustered" FP unit architecture, so using less than a full 128-bit 4-tuple results in a performance boost.
 
Xmas said:
There's no 'half float' mode in the VS; it's always 128-bit.

The CineFX documents all seem to indicate the NV30 has FP16 (64-bit) support throughout *all* of the pipeline.
However, after looking at the DX9 SDK again, it seems DX9 only supports "half" (or rather, Partial Precision, as the SDK calls it) in the Pixel Shader, and not in the Vertex Shader.

Could it be that only OpenGL is able to use FP16 in the VS? Or am I just interpreting the CineFX documents wrong?


Uttar
 
Chalnoth said:
I don't think half-floats in the VS are a possibility. The z-errors would be horrendous.

I agree half-floats might give some fairly bad rounding errors in the VS. However, my point is simply that I think the CineFX architecture allows it.
And remember, half-floats can be used for only some parts of a VS program, so you could use floats for Z and half-floats for some other things.


Uttar
 
Chalnoth said:
I don't think half-floats in the VS are a possibility. The z-errors would be horrendous.
That's nothing. Would you bound your world in a 1024-unit-wide cube? I wouldn't :)

ciao,
Marco
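Marco's objection can be made concrete: half floats carry a 10-bit mantissa, so the gap between representable values grows with magnitude and reaches a whole unit by 1024. A quick sketch in Python (using the standard library's half-precision pack format to model the rounding, not any actual GPU path):

```python
import struct

def to_half(x):
    """Round-trip a float through IEEE 754 half precision (10-bit
    mantissa), returning the nearest representable value."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# Near 1024 (= 2^10) the spacing between adjacent half floats is a
# full unit, so sub-unit position detail is simply lost:
print(to_half(1024.4))   # -> 1024.0
print(to_half(1024.6))   # -> 1025.0

# Near 1.0 the spacing is about 0.001, fine for model-local coords:
print(abs(to_half(1.2345) - 1.2345) < 0.0005)   # -> True
```

So at world coordinates around 1024, vertex positions snap to whole units, which is exactly the "small world" problem.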
 
16fp could be excellent for inputs. It can be high enough precision for a model-local coordinate system. But the calculations are better done in 32fp.

But 16i would be an even better input format in those cases. Or maybe even 4x8i.

Calculations in 16fp could perhaps be useful for vertex lighting, and maybe for normals (as long as the normal isn't used for reflective surfaces or specular lighting on shiny surfaces).
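To illustrate why 16i can beat 16fp as an input format: integer quantization spends its bits uniformly across the model's bounding box instead of clustering precision near zero. A hypothetical sketch (the helper name and the half_extent parameter are my own inventions for illustration, not any real API):

```python
def quantize16(x, half_extent):
    """Map a model-local coordinate in [-half_extent, half_extent]
    to a signed 16-bit integer (the 16i vertex format idea) and back.
    half_extent is the model's bounding-box half-size."""
    q = max(-32767, min(32767, round(x / half_extent * 32767)))
    return q, q * half_extent / 32767

q, restored = quantize16(0.7371, 1.0)
# Worst-case error is half_extent / 32767 regardless of where in the
# range x falls -- uniform precision, unlike a float format.
print(abs(restored - 0.7371) <= 1.0 / 32767)   # -> True
```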
 
Well, 4*8u is fairly commonly used as packed colour data, so supporting it would pretty much be required at least.
 
Basic said:
Calculations in 16fp could perhaps be useful for vertex lighting, and maybe for normals (as long as the normal isn't used for reflective surfaces or specular lighting on shiny surfaces).

Well, I would definitely hesitate to use it for normals, but I could definitely see applications for vertex lighting (thanks, didn't think of that!). In fact, I personally see no reason why the vertex lighting calculations shouldn't be done at 16-bit precision all the time. I wonder if we'll see significantly faster multiple-light polycounts on the FX due to this? I hope so!

And I kind of doubt that half-floats would be useful for input, except in limited demo situations (very small world).
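The vertex-lighting suggestion is easy to sanity-check numerically. Below is a rough model of my own construction (not the NV30 datapath) that rounds every operand and intermediate of an N·L diffuse term to half precision and compares it against full precision:

```python
import struct

def half(x):
    # Round x to the nearest IEEE 754 half-precision value.
    return struct.unpack('<e', struct.pack('<e', x))[0]

def lambert_half(n, l):
    """N.L diffuse term with every operand and intermediate rounded
    to half precision -- a crude stand-in for 16-bit vertex lighting."""
    s = 0.0
    for a, b in zip(n, l):
        s = half(s + half(half(a) * half(b)))
    return max(0.0, s)

n = (0.267, 0.535, 0.802)   # roughly unit-length normal
l = (0.0, 0.0, 1.0)         # light direction
exact = max(0.0, sum(a * b for a, b in zip(n, l)))
print(abs(lambert_half(n, l) - exact) < 0.005)   # -> True
```

An error well under 1/255 of the output range is invisible in an 8-bit framebuffer, which is why diffuse lighting tolerates half precision so well.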
 
Xmas said:
Basic said:
16fp could be excellent for inputs.
Besides the problem that the system that generates those inputs doesn't have native support for 16fp :)
Wouldn't need to. The conversion could be done in the driver to lower bandwidth over the AGP bus. But I still think this wouldn't be all that great for today's games. Maybe for small demos.
 
Chalnoth:
Notice: model-local coordinate system.
If the full range of the coordinate system is just large enough to fit a small model (say, some pick-up item), then even 8 bits per component can be plenty for vertices. The local coordinate system doesn't have to span the whole world.

Xmas:
If the halfs aren't generated in real time, it doesn't matter whether there's any native support. The important part is that the format should be standardized across gfx cards, and I believe it is. A modelling program could easily implement 16fp modes by "truncating" its floats while running, and then converting them when finished.

And you may actually generate the halfs in a system that has native support. (Render to vertex buffer.)
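The pick-up-item example can be put in numbers: with a model-local bounding half-extent of 0.5 units, even signed 8-bit components land within about 0.002 units of the original position. A minimal sketch (helper name and extents are assumptions for illustration):

```python
def quantize8(x, half_extent):
    """Map a model-local coordinate in [-half_extent, half_extent]
    onto a signed 8-bit component and back (a 4x8i-style vertex
    format; half_extent is the item's bounding half-size)."""
    q = max(-127, min(127, round(x / half_extent * 127)))
    return q / 127 * half_extent

# A 0.5-unit pick-up item: the worst-case snap is ~0.002 units,
# i.e. about 2 mm if one unit is one metre.
err = abs(quantize8(0.3123, 0.5) - 0.3123)
print(err <= 0.5 / 127)   # -> True
```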
 
Basic said:
Chalnoth:
Notice: model-local coordinate system.
If the full range of the coordinate system is just large enough to fit a small model (say, some pick-up item), then even 8 bits per component can be plenty for vertices. The local coordinate system doesn't have to span the whole world.

Well, that sort of thing would work very well, but I have doubts about how widely it could be used. You would need to do at least some of the processing at 32-bit precision, if not all of it, so it seems the main gain would be in AGP bus bandwidth, but can that realistically be obtained?

It would also only be useful for dynamic objects in a 3D scene...it's better to have all static objects pre-transformed into world space (according to Vogel, in the UT2k3 engine anyway).
 
Basic said:
But the calculations are better done in 32fp.
So we agree about that. The smaller vertex formats were just for input, to save memory space and/or bandwidth.

And yes, it's best for dynamic models, where you have to change the transform matrix anyway. And even more so for morphing or key-frame interpolated models, since many keyframes can take up a lot of memory. Or how about mesh displacements.
 
[Image: GPUTemps.jpg]

:?:
 
Great...does this mean that, to be fair, we're going to have to run benchmarks under several different environmental conditions and for several lengths of time to get the whole picture?
 