Has FP24 been a limitation yet?

OpenGL guy said:
What does any of this have to do with FP24 vs. FP32 in the pixel shader? All R3xx/R4xx products use FP32 in the vertex shader and it's vertices that define a large world, right? Please explain how FP24 pixel shaders somehow limit your "bigger world".
I could think of two possible situations where FP24 in pixel shaders would affect vertex shaders:
1. Unified pipelines.
2. Render to vertex buffer/texture.
 
Given the lack of "evidence" for the need of FP32 pixel shaders for gaming, I'm thinking it's unfortunate that FP32 is part of the SM3.0 spec.

If what I've read is generally true, that FP32 "costs" about 30% more (die space) to support vs. FP24, then I'd have to say I'd rather see those gates go toward other features or even simply more performance. (Or hell, just a cheaper, less power consuming part).

I get the feeling that ATI is being "forced" to support FP32 for no other real reason than SM 3.0 requires it, and it would be more marketing suicide than anything else to not support SM 3.0 next year.

I would rather have seen SM 3.0 keep "full precision" at FP24....and if someone makes an FP32 or, hell, FP48 part, demonstrate the quality advantage over FP24 in a game. SM 3.0 shouldn't artificially limit shader precision, but at least for games in the foreseeable future, I don't see the costs of FP32 outweighing the benefits.
 
A unified shader architecture will definitely require FP32. It is already the minimum for vertex shaders, so if an IHV wants to use the same ALUs for either processing, they'll have to have full FP32 support. I can't see any way around it. Partial precision might still be supported, but that will be the only possibility for sub-FP32 precisions.
 
Joe DeFuria said:
If what I've read is generally true, that FP32 "costs" about 30% more (die space) to support vs. FP24, then I'd have to say I'd rather see those gates go toward other features or even simply more performance. (Or hell, just a cheaper, less power consuming part).
I don't buy it, for the simple reason that if you compare the NV40 to the R420, the difference is about that, but the NV40 offers so much more than just FP32 support over the R420.
 
Chalnoth said:
I don't buy it, for the simple reason that if you compare the NV40 to the R420, the difference is about that, but the NV40 offers so much more than just FP32 support over the R420.

I would never begin to compare two different chips from two different companies and try to guess at such a thing.
 
And I wouldn't take ATI's word about 30% either, because they have an agenda and didn't want the change.
 
Xenus said:
And I wouldn't take ATI's word about 30% either, because they have an agenda and didn't want the change.

Of course.

I know for CERTAIN that it costs "more" to support FP32 than it does FP24. 10%, 30%, 50%. Pick one.

I also have not been shown ANY convincing case where FP32 pixel shaders makes a quality difference for gaming. (At least, for the near to mid-term.)

The only reason I can see it being "beneficial" to a gamer is as a natural consequence of the API specs "in effect" requiring a "unified shader". That doesn't look likely to happen until probably WGF 2.0.

Just because one company may not have wanted to make the change...doesn't mean their interests/agenda don't overlap with my own agenda! :) (My agenda being better and better gaming cards...more value for my dollar for gaming.)
 
I'd be sure to do it for the marketing reasons alone. New, unique features help sell more HW. Remember the first GeForce and T&L (well, almost T&L)? It wasn't really good either, but it did set the path.
 
Joe DeFuria said:
I also have not been shown ANY convincing case where FP32 pixel shaders makes a quality difference for gaming. (At least, for the near to mid-term.)
And I feel that video cards should be forward-looking. As texture sizes continue to increase, for instance, it would be rather disappointing to suddenly realize that artifacts become visible in dependent texture reads that weren't there for normal texture reads.
 
Chalnoth said:
And I feel that video cards should be forward-looking.

So do I. There is a difference between "forward looking" and "ahead of its time." Agree?

As texture sizes continue to increase, for instance, it would be rather disappointing to suddenly realize that artifacts become visible in dependent texture reads that weren't there for normal texture reads.

Sure, and it would be equally disappointing to find out that artifact-free dependent texture reads are so slow as to be prohibitive in any meaningful implementation.
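
To put some rough numbers of my own (not from the thread) on the texture-size point: with FP24's roughly 17 significant bits, a normalized [0,1] texture coordinate across a large texture leaves only a few bits of sub-texel resolution, which is where dependent reads and filtering could start to band; FP32's ~24 significant bits leave much more headroom.

```python
# Rough sketch, my own arithmetic: sub-texel precision left in a normalized
# [0,1] texture coordinate near 1.0, assuming ~17 significant bits for FP24
# and ~24 for FP32 (implicit leading bit included).
import math

def subtexel_bits(significant_bits, texture_size):
    """Bits available for positioning within a texel at the far edge of the texture."""
    return significant_bits - math.log2(texture_size)

for size in (256, 1024, 4096):
    print(f"{size:5d} texels: FP24 ~{subtexel_bits(17, size):.0f} bits, "
          f"FP32 ~{subtexel_bits(24, size):.0f} bits of sub-texel precision")
# A 4096-texel axis leaves ~5 sub-texel bits in FP24 vs ~12 in FP32.
```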
 
If you need to place vertices in your world accurate down to an error of +/-1cm, FP24 allows a world that is about 5 kilometers wide (assuming a coordinate system origin at the center of the world).
You're assuming all transformations on those vertices are 'perfect' (i.e. there is no precision loss). This is not usually the case. If you assume that you lose 0.5 LSB per basic operation (MUL or ADD), that you do your modelview-projection transform in a single matrix multiply, and that your original 5 km estimate was correct, then your world can only be about 70 m wide for a hypothetical GPU with FP24 vertex shaders.
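
A back-of-the-envelope sketch of where numbers like these come from (my own arithmetic and assumptions, not the posters' above): treat FP24 as a 1/7/16 sign/exponent/mantissa format, so about 17 significant bits with the implicit leading 1, and note that every bit of precision eaten by rounding halves the largest world you can address at a given tolerance.

```python
# Rough sketch, my own assumptions: FP24 = s1/e7/m16, ~17 significant bits,
# worst-case representation error ~0.5 ulp, and accumulated rounding error
# modelled simply as "bits of precision lost".

def max_world_width_m(tolerance_m, significant_bits, lost_bits=0.0):
    """Widest world (origin at the centre) whose coordinates stay within
    tolerance_m, given the effective number of significant bits."""
    effective_bits = significant_bits - lost_bits
    # Worst-case relative error ~= 2**-effective_bits, so the largest usable
    # coordinate magnitude is tolerance / relative_error; width is twice that.
    return 2.0 * tolerance_m * 2.0 ** effective_bits

print(max_world_width_m(0.01, 17))               # ~2.6 km: same ballpark as "about 5 km"
print(max_world_width_m(0.01, 17, lost_bits=3))  # ~330 m once a few bits go to rounding
print(max_world_width_m(0.01, 17, lost_bits=5))  # ~80 m: the order of the 70 m figure
```

The exact figures depend on how the +/-1cm tolerance and the per-operation error are counted, but the mechanism is the point: a handful of bits lost in the transform chain shrinks the usable world by one to two orders of magnitude.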
 
Chalnoth said:
And I feel that video cards should be forward-looking.

I feel the same. But when considering the costs mentioned in this thread and the lack of evidence for FP32's benefits, why spend resources on it when you can make a reasonably good compromise with FP24? Being forward-looking for its own sake isn't good enough, in my opinion. It's similar to the whole native PCI-Express vs. AGP bridge situation. Sure, ATI's native PCI-Express may be better in the long run, but nVidia's AGP-to-PCI-Express bridge is just as good for now, not to mention cheaper. Had they gone on to make a forward-looking native interface for the sake of making it forward-looking, more costs would have been incurred for no practical performance benefit.
 
Chalnoth said:
OpenGL guy said:
What does any of this have to do with FP24 vs. FP32 in the pixel shader? All R3xx/R4xx products use FP32 in the vertex shader and it's vertices that define a large world, right? Please explain how FP24 pixel shaders somehow limit your "bigger world".
I could think of two possible situations where FP24 in pixel shaders would affect vertex shaders:
1. Unified pipelines.
What? FP32 is required for vertex shader operations. And this doesn't apply to any current product. I thought we were talking about current products? Hypothetical (and stupid) products are of no interest to me.
2. Render to vertex buffer/texture.
Render to VB is not supported by the API. I don't see how render to texture is affected and differently than normal rendering.
 
OpenGL guy said:
2. Render to vertex buffer/texture.
Render to VB is not supported by the API. I don't see how render to texture is affected and differently than normal rendering.
Here I'm talking about a FP32 texture that would be used in the vertex shader. This, then, would be an argument for those stating that the FP32 requirement for PS3 is ridiculous.
 
Chalnoth said:
OpenGL guy said:
2. Render to vertex buffer/texture.
Render to VB is not supported by the API. I don't see how render to texture is affected and differently than normal rendering.
Here I'm talking about a FP32 texture that would be used in the vertex shader.
You really think that you need the full range of FP32 for vertex textures?
This, then, would be an argument for those stating that the FP32 requirement for PS3 is ridiculous.
Not at all. What it argues against is the need for FP32 vertex textures.
 
Reverend said:
nelg said:
Reverend, are you suggesting that, with all else being equal and if ATI supported FP32, games may appear different than they do today (referring to DX9 games)?
That's a loaded question and one I do not want to dare risk answering (because I'm not in the business of selling games). Who knows what the ATI folks posting here would've responded if the R300 had reasonable FP32 performance...
That's actually pretty trivial to do -- moving an R300 to FP32 would have simply required upping the 24b DP units to 32b (it's really just a substitution -- we have FP32 units all over as well) and increasing GPR space to 32b. It's really a simple 30%+ increase in area (GPRs need 30% more space, adders are pretty much 30% larger, while multipliers grow by more than 30%).

The problem is not getting good FP32 performance in R300 (reasonably easy), it's making it cost-effective. That's hard. 30% more for the pixel shaders is quite a bit of extra area which, for all practical and useful purposes, is simply a waste. You can reduce something else, or increase the cost. I think it's fair to say that if we had reduced "something" else, performance would have been affected, and that's not acceptable. Increasing the cost, well, that's something one has to decide upon. We were given a target cost, and we already went somewhat past it; another 30% growth would have meant fewer parts out (a 9700 at $499 back then would not have been cool) or smaller margins -- these are not acceptable.

I think we have to look at a wider picture (one that some probably hadn't thought of) -- that of the business of making games. Games such as space-constrained first-person shooters (the perfect latest example is Doom3) currently gain very little from FP32. This is not due to a lack of 3D knowledge on the part of such game programmers, but probably to a collective lack of creativity at a single development house. The trend appears to be that gamers want bigger "worlds" to play in and explore, and this applies to first-person shooters as well. Bigger worlds make much bigger demands; higher precision is one example of such demands. Creating a game engine that works well in both small and large worlds, with the licensing business in mind, would be the smart move -- the costs and complexity would be too high for a programmer to make a game engine that only applies to small worlds. If it isn't obvious, what I'm saying is that one of the things where higher precision is a big deal is a game set in a large world, which IMO is the way gamers like their games to be.

Bigger worlds really have no effect on PS precision requirements. You can construct a world divided into regions of interest so that, within those regions, the limits of PS precision (I can't even think of many cases where PS precision matters; possibly for some large repeating textures) and Z aren't important. In all those cases, geometry is computed in 32b SPFP and doesn't involve 24b.
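
As a minimal illustration of the region-of-interest idea described above (my own sketch, not ATI's code): keep absolute world positions in full precision on the CPU and hand the GPU coordinates relative to the active region's origin, so the values the shaders actually see stay small and well within limited precision.

```python
# Minimal sketch (my own illustration) of region-relative coordinates:
# world positions live in high precision on the CPU; the GPU only sees
# offsets from the current region's origin, which remain small.

def to_region_space(world_positions, region_origin):
    """Rebase absolute world positions onto the active region's origin."""
    ox, oy, oz = region_origin
    return [(x - ox, y - oy, z - oz) for (x, y, z) in world_positions]

# A vertex over 12 km from the world origin, in a region centred nearby:
verts = [(12300.25, 4.5, -12289.75)]
region_origin = (12000.0, 0.0, -12250.0)
print(to_region_space(verts, region_origin))  # [(300.25, 4.5, -39.75)] -- small, precise values
```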

What does the majority of the folks here know about the importance of "floating point" itself?

PS. I've just had some beer, so excuse me if I appear to be going OT.

I hope it's good beer. All in all, I still believe that for current applications FP24 is more than ample, and very few things, if any, would have required more. In fact, feedback from ISVs on R300 and now R420 has been great -- easy to program, very predictable performance, a clean orthogonal feature set (with respect to performance), etc. Compared to the competition (at least last gen), it's been night and day. I don't remember one complaint about FP24. BTW, FP24 was not an ATI-only decision. MS specified FP24 as the high-precision format for DX9 (SM2). If you're so against it, you at least need to blame everyone involved (including, potentially, all the IHVs that agreed to it).
 
nelg said:
Here is the only example that I know of that shows the difference in IQ between FP24 and FP32. LINK . At the bottom of the page.

BTW Reverend what ever happend to your PS research ?
There should be more -- I've already come across some in my PS research -- but I'd prefer to present them in an official B3D article in the future, if I can find the time to complete the research and write it all up.

I'm sorry, but that's not an example of FP24 limitations. Comparisons between completely different rasterizers (refrast and ATI's) will lead to tons of differences. That's why DCT/WHQL specifies tolerances. There's no way the two can be identical unless the rasterization algorithm (including setup, scan conversion, iteration, etc.) is the same in both. I hope that never comes to be, though who knows in the future.
 
Chalnoth said:
Joe DeFuria said:
If what I've read is generally true, that FP32 "costs" about 30% more (die space) to support vs. FP24, then I'd have to say I'd rather see those gates go toward other features or even simply more performance. (Or hell, just a cheaper, less power consuming part).
I don't buy it, for the simple reason that if you compare the NV40 to the R420, the difference is about that, but the NV40 offers so much more than just FP32 support over the R420.

As well, I did not mean a 30% die increase, but a 30% shader increase. That's simple enough math that it's hard not to believe (24 -> 32 is 30% more storage, roughly 30% larger adders, and more than 30% larger multipliers). The die size increase is not something I'll specify.

As for NV40 vs. R420, that's ridiculous to compare die size. You're better off just trying to find out the company's per chip revenue and margins -- That's the best metric that I could think of to compare the two products, but you will not be able to get that information from either company.
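
As a rough sanity check on the 24b-to-32b scaling claims (my own assumptions, not ATI's figures: register storage and adders scale roughly linearly with operand width, while array multipliers scale roughly with the square of the mantissa width):

```python
# Rough check of the 24b -> 32b growth claims, under my own scaling assumptions:
# storage/adders ~ linear in total width, multipliers ~ mantissa width squared.
fp24_bits, fp32_bits = 24, 32
fp24_mant, fp32_mant = 16 + 1, 23 + 1   # significant bits incl. the implicit 1

storage_growth = fp32_bits / fp24_bits - 1          # ~0.33 -> "about 30% more"
adder_growth   = fp32_bits / fp24_bits - 1          # ~0.33
mult_growth    = (fp32_mant / fp24_mant) ** 2 - 1   # ~1.0  -> well over 30%

print(f"storage +{storage_growth:.0%}, adders +{adder_growth:.0%}, "
      f"multipliers +{mult_growth:.0%}")
```

Under those assumptions the "~30% more storage and adders, more than 30% more multiplier" framing comes out roughly as stated.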
 
Joe DeFuria said:
Given the lack of "evidence" for the need of FP32 pixel shaders for gaming, I'm thinking it's unfortunate that FP32 is part of the SM3.0 spec.

I agree with that. That's my personal opinion, and it has nothing to do with ATI.

I would rather have seen SM 3.0 keep "full precision" at FP24....and if someone makes an FP32 or, hell, FP48 part, demonstrate the quality advantage over FP24 in a game. SM 3.0 shouldn't artificially limit shader precision, but at least for games in the foreseeable future, I don't see the costs of FP32 outweighing the benefits.
 
Xenus said:
And I wouldn't take ATI's word about 30% either, because they have an agenda and didn't want the change.

Well, I never said 30% more die -- but I'm dumbfounded to think that you would not believe an FP32 shader would be at least 30% larger than an equivalently featured FP24 shader.
 