Has FP24 been a limitation yet?

nelg

Veteran
After re-reading this thread I was wondering: now that DX9 games are reasonably plentiful, has FP24 (in the PS) been proven to be a limitation yet? In the thread linked, it was sireric's contention that FP24 was enough for the DX9 time frame; Reverend was concerned that it might not be. So are there any real-world examples where it has been a limitation? No need to rehash the technical arguments; they were all pretty much covered in the linked thread.


P.S. The thread did produce some great quotes...
sireric said:
All I meant is that more complex operations do not have results that are guaranteed by IEEE-754. The implementation details influence the results. If a PowerPC implements an FMAD with higher precision than a MUL/ADD combo, that will lead to slight differences between that HW and others. Doesn't seem to offend most programmers.

Though there are some that require the exact same results. But they aren't programming pixel shaders
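A minimal C sketch of that FMAD point (assuming IEEE-754 doubles and C99's fma(); compile with contraction disabled, e.g. -ffp-contract=off, so the compiler doesn't fuse the plain expression itself):

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    double a = 1.0 + pow(2.0, -52);   /* one ulp above 1.0           */
    double b = 1.0 - pow(2.0, -52);   /* just below 1.0              */
    double c = -1.0;

    double mul_add = a * b + c;       /* a*b rounds to 1.0 -> 0.0    */
    double fused   = fma(a, b, c);    /* exact product kept: -2^-104 */

    printf("mul+add = %g, fma = %g\n", mul_add, fused);
    return 0;
}
```

Same source expression, two legal answers: 0 and about -4.9e-32.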
 
I have no qualifications to answer this, but I feel like answering anyway. And no, I didn't stay at a Holiday Inn last night. :D

I doubt there are any "real world examples" showing problems with FP24 besides those that were specifically designed to show shortcomings -- and I think at that point, they would no longer be considered "real world examples".

Have people run into limitations? I have no idea, but if they had, they would have most likely found another way to get their desired result.
 
Didn't Sweeney say he was seeing FP24-based artifacting in his next engine? Not sure if that counts as "real world" yet, tho.
 
I think the biggest artifacts will come with using textures in the vertex shader, but unfortunately ATI doesn't have that capability, so we can't really find out.

Has anyone played around with vertex texturing on NV40? FP16 seems like it would be woefully inadequate, but discretization is probably only noticeable at slow object velocities, close up. If people have found FP16 to be usable there in most practical situations, going from FP24 to FP32 shouldn't matter. In cloth simulation you might see the underlying object poke through now and then.
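To put a number on that discretization, here's a rough sketch that emulates only FP16's 11-bit significand (it ignores the exponent range and denormals, so it is not real NV40 behavior):

```c
#include <stdio.h>
#include <math.h>

/* round x to an 11-bit significand (10 stored bits + implicit bit) */
static float to_fp16(float x) {
    int e;
    float m = frexpf(x, &e);          /* x = m * 2^e, 0.5 <= m < 1 */
    return ldexpf(roundf(m * 2048.0f) / 2048.0f, e);
}

int main(void) {
    /* near a displacement of 1.0 the representable grid is ~0.001 apart */
    for (float d = 0.9995f; d < 1.0016f; d += 0.0005f)
        printf("%.5f -> %.6f\n", d, to_fp16(d));
    return 0;
}
```

A vertex displaced by FP16 values near 1.0 can only move in steps of roughly a thousandth, which is exactly the kind of snapping you'd notice at slow velocities close up.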

Those Half-Life 2 screenshots with FP16 errors seem to be the most precision-intensive pixel shaders I've seen that come from practical usage, and FP24 seems fine. Of course, things like Mandelbrot sets will show problems, but we're talking real-world here, right?
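The Mandelbrot case is easy to reproduce off-GPU; a hedged sketch (the choice of c is illustrative, a real-axis point with a bounded but chaotic orbit):

```c
#include <stdio.h>

int main(void) {
    /* c = -1.99 sits on the real slice of the set: the orbit stays
       bounded but is chaotic, so rounding differences grow fast. */
    const double c = -1.99;
    float  zf = 0.0f;   /* FP32 orbit           */
    double zd = 0.0;    /* FP64 reference orbit */

    for (int i = 1; i <= 60; i++) {
        zf = zf * zf + (float)c;    /* z <- z^2 + c at each precision */
        zd = zd * zd + c;
        if (i % 10 == 0)
            printf("iter %2d: float % .6f  double % .6f\n", i, zf, zd);
    }
    return 0;
}
```

The two orbits agree early on and then drift apart completely; the same divergence happens between FP24 and FP32, just a little later.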
 
Mintmaster said:
Those Half-Life 2 screenshots with FP16 errors seem to be the most precision-intensive pixel shaders I've seen that come from practical usage, and FP24 seems fine. Of course, things like Mandelbrot sets will show problems, but we're talking real-world here, right?
Yes, real world. :)
Entropy made a good point in the linked thread:
Entropy said:
A small comment from the scientific computing field.

Code that critically depends on the minutiae that Reverend brings up is effectively broken. You should never, ever write anything that makes those kinds of assumptions.

Assuming rounded rather than truncated results is pretty much as far as you can hope for. If you _need_ control, you should explicitly code for it, never leave it to the system to take care of for you.

Now, in scientific computing, codes tend to have very long lives and get ported all over the place, and are thus probably a worst case, but generally the experience should carry over.

Sireric explained nicely why FP24 is a good compromise for the tasks we ask of this hardware. If you do something else though and need fp32, by all means buy whatever supports it. But making the product significantly slower/costlier for some hypothetical benefit just doesn't make sense. The very same tradeoffs have been made on the CPUs you are currently running on.

BTW, the above should in no way be construed as endorsing general sloppiness when defining computational tools. From personal experience, I do however endorse extreme suspiciousness on the part of programmers as far as these issues are concerned. "Just don't count on it."
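"Explicitly coding for it" can be as simple as compensated summation. A classic sketch (Kahan's algorithm; note that an optimizer in -ffast-math mode may legally delete the compensation term):

```c
#include <stdio.h>

/* Kahan summation: carry the rounding error of each add in 'comp'
   instead of trusting the FPU's precision to be sufficient. */
static float kahan_sum(const float *x, int n) {
    float sum = 0.0f, comp = 0.0f;
    for (int i = 0; i < n; i++) {
        float y = x[i] - comp;
        float t = sum + y;
        comp = (t - sum) - y;     /* what the add just rounded away */
        sum = t;
    }
    return sum;
}

int main(void) {
    static float x[10000];
    for (int i = 0; i < 10000; i++) x[i] = 0.1f;

    float naive = 0.0f;
    for (int i = 0; i < 10000; i++) naive += x[i];

    /* the naive sum drifts visibly below 1000; the compensated one doesn't */
    printf("naive = %f, kahan = %f\n", naive, kahan_sum(x, 10000));
    return 0;
}
```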
 
Mintmaster said:
I think the biggest artifacts will come with using textures in the vertex shader, but unfortunately ATI doesn't have that capability, so we can't really find out.
Even if ATI did have vertex texturing, doesn't ATI support FP32 precision in the vertex shaders?
 
First of all, I'm surprised not a single comment/opinion was offered in a related thread I started recently.

Thread Title said:
Has FP24 been a limitation yet?
Since the introduction of DX9 hardware, ISVs have worked to the lowest common denominator. There is no real question of whether a feature or spec is a limitation when ISVs have to work according to the available hardware. The first DX9 hardware was the R300 -- ISVs cannot ignore that.

DX9 games may be reasonably "plentiful" right now, and not a single one of them may have demonstrated why more than 24-bit FP is required... but that's due to the above paragraph, not due to any lack of knowledge among programmers of what's needed for PC 3D graphics to progress.

In any case, this is a simple-looking question with the possibility of complicated answers. ISVs aren't stupid enough to make a game that requires nothing but full SM3.0 precision (32-bit), regardless of how extremely persuasive NVIDIA may have tried to be.

In the thread linked, it was sireric's contention that FP24 was enough for the DX9 time frame; Reverend was concerned that it might not be.
1) sireric has a right to say that, and that is probably because the R300 is the first DX9 hardware out. The jump from fixed function to floating point is a big one and I agree with this basis of sireric's and Dave's "arguments".
2) I think you (and perhaps many others) misread my comments in that thread -- that's not exactly the kind of discussion (i.e. FP24 good enough for games) I was trying to promote. I was trying to figure out why ATI did not or could not go with 32 bits in the R300, given that the NV30 has it (although we know what this meant in terms of performance for the NV30), given that the R300 really was swell with FP24, and given what we know of the progress of the DX9 specification's development and of the existing IEEE-754 standard. I am not a hardware engineer, so perhaps folks didn't realize this... at the time of that thread, I had no idea how expensive an extra 8 bits is silicon-wise! :)

One thing that intrigues me is whether -- during the (perhaps table-banging) discussions that go on with MS and various IHVs in next-version-of-DirectX-shaping meetings -- each IHV's knowledge of, and capability to take advantage of (in a very, very proficient and efficient way), current and soon-to-be-available process technology is brought up to MS. Which IHV has the bigger brains, so to speak :) . I still have absolutely no idea how the specification process of each DX version starts, progresses and ends.

[edit] PS. I should also add I have absolutely no idea about the interactivity, relationship and influence of ISVs who are involved in the DX spec brain-storm committee. All I know is that ISVs usually demand more than IHVs can provide :)
 
nelg, not sure what tone you thought I was conveying, but I agree with you. The times FP24 fails will be few and far between in the real world. Even when NVidia had Far Cry precision problems, the outcry was pretty minimal, so I doubt ATI is sweating that they'll pay for only using FP24. It was a smart decision.

pat777 said:
Mintmaster said:
I think the biggest artifacts will come with using textures in the vertex shader, but unfortunately ATI doesn't have that capability, so we can't really find out.
Even if ATI did have vertex texturing, doesn't ATI support FP32 precision in the vertex shaders?
Yeah, but just taking a static texture and reading it in the vertex shader doesn't do much for you. You might as well just store the info in the original vertex stream. The real power of vertex texturing is when you use pixel shader output to manipulate vertices. Your texture will be undergoing water or cloth simulation in the pixel shader, and then used to displace vertices.

Still, I think it'll take a pretty exotic application for FP24 to be inadequate even here. It's just that I'm willing to acknowledge a possibility here of FP24 not being enough, as some types of physics calculations can be quite sensitive to discrete jumps.
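A CPU-side sketch of the feedback loop described above (illustrative names and constants; this is the per-frame update that would live in the pixel shader, with the vertex shader then sampling h_curr to displace the mesh):

```c
#include <stdio.h>
#include <string.h>

#define W 256
static float h_prev[W][W], h_curr[W][W], h_next[W][W];

/* one simulation step: the classic discrete wave equation on a heightfield */
static void water_step(void) {
    for (int y = 1; y < W - 1; y++)
        for (int x = 1; x < W - 1; x++) {
            float n = (h_curr[y][x-1] + h_curr[y][x+1] +
                       h_curr[y-1][x] + h_curr[y+1][x]) * 0.5f;
            h_next[y][x] = (n - h_prev[y][x]) * 0.99f;  /* 0.99 = damping */
        }
    memcpy(h_prev, h_curr, sizeof h_curr);
    memcpy(h_curr, h_next, sizeof h_next);
}

int main(void) {
    h_curr[W/2][W/2] = 1.0f;                /* drop a "raindrop"       */
    for (int frame = 0; frame < 100; frame++)
        water_step();                        /* once per rendered frame */
    printf("ripple sample: %g\n", h_curr[W/2][W/2 + 10]);
    return 0;
}
```

Because every frame's heights feed the next frame's, any quantization from the texture format gets recycled forever, which is where the precision question bites.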
 
Mintmaster said:
nelg, not sure what tone you thought I was conveying, but I agree with you. The times FP24 fails will be few and far between in the real world. Even when NVidia had Far Cry precision problems, the outcry was pretty minimal, so I doubt ATI is sweating that they'll pay for only using FP24. It was a smart decision.

I think Rev's point is that the above is a circular argument. It was a "smart decision" because they were --by a largish margin-- first to market, and --if rumors are to be believed-- the reference card for DX9. If I perceive his argument correctly, this led developers to stay within FP24 limits for practical reasons, rather than exploring what they could do if FP24 limits weren't in place. If ATI had decent-performance FP32 in R300, it would just have shifted the "practical bottleneck" that Rev is pointing at to NV --but it still would have existed.

I don't know if he's right, but it is an interesting argument. I don't think it takes fully into consideration the performance limitations of NV's FP32, which were a major disincentive for developers pushing that envelope in that generation anyway.
 
Mintmaster said:
nelg, not sure what tone you thought I was conveying, but I agree with you. The times FP24 fails will be few and far between in the real world. Even when NVidia had Far Cry precision problems, the outcry was pretty minimal, so I doubt ATI is sweating that they'll pay for only using FP24. It was a smart decision.
Mintmaster, I did not assume you responded in any particular tone and I do appreciate your input :D .

Reverend said:
First of all, I'm surprised not a single comment/opinion was offered in a related thread I started recently.
That's because you need the services of a headline writer. ;) Seriously, now with the benefit of hindsight it seems that the decision to utilize FP24 vs. FP32 was a wise one. I also was surprised at how much those extra 8 bits added to the transistor budget. The funny thing about this search for examples is that you (in an old thread) were the only one who could provide an example (IIRC the water in TR) of a limitation. Anyway, thanks for the input as well.
 
geo said:
I think Rev's point is that the above is a circular argument. It was a "smart decision" because they were --by a largish margin-- first to market, and --if rumors are to be believed-- the reference card for DX9. If I perceive his argument correctly, this led developers to stay within FP24 limits for practical reasons, rather than exploring what they could do if FP24 limits weren't in place. If ATI had decent-performance FP32 in R300, it would just have shifted the "practical bottleneck" that Rev is pointing at to NV --but it still would have existed.

I don't know if he's right, but it is an interesting argument. I don't think it takes fully into consideration the performance limitations of NV's FP32, which were a major disincentive for developers pushing that envelope in that generation anyway.
This is where the debate gets interesting. IMHO, extra precision faces the law of diminishing returns. The analogy that comes to my mind is that of a race car: if your car has 300hp and can reach 200kph, 600hp is not going to get you to 400kph. Speed here is like precision. It is not just how fast you can go but also what is needed to support it, like better suspension and tires.

sireric said:
Don't get me wrong. Certainly at some point it will be required. But, from the analysis I showed, it will require larger textures, filtering on FP textures, probably a native FP frame buffer, as well as the development of procedural-type shading.
 
nelg said:
This is where the debate gets interesting. IMHO, extra precision faces the law of diminishing returns.

Well, if we get to the point where calculations are done on the video card with the same data being iterated on, then higher precision will be needed. A practical example of that would be simulating blood slowly oozing down a surface (BloodShader) or simulating water physics with textures. But in general, in those cases, as long as game developers ensure that effects get smaller with time instead of growing, single floating-point precision is plenty.
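A tiny illustration of why iterated feedback is precision-hungry (numbers are illustrative): once the running value's spacing (ulp) grows past the per-frame increment, FP addition silently drops every update.

```c
#include <stdio.h>

int main(void) {
    const float step = 1e-8f;   /* per-frame increment */
    float v = 0.0f;
    for (int i = 0; i < 1000000; i++) v += step;
    printf("from 0.0 : %g\n", v);   /* ~0.01, roughly as expected      */

    v = 16.0f;                  /* ulp(16.0f) ~ 1.9e-6, far above step */
    for (int i = 0; i < 1000000; i++) v += step;
    printf("from 16.0: %g\n", v);   /* still exactly 16: all adds lost */
    return 0;
}
```

Which is the point above: as long as the per-frame contributions shrink rather than grow, single precision keeps its headroom.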
 
Reverend said:
First of all, I'm surprised not a single comment/opinion was offered in a related thread I started recently.

Thread Title said:
Has FP24 been a limitation yet?
Since the introduction of DX9 hardware, ISVs have worked to the lowest common denominator. There is no real question of whether a feature or spec is a limitation when ISVs have to work according to the available hardware. The first DX9 hardware was the R300 -- ISVs cannot ignore that.

DX9 games may be reasonably "plentiful" right now, and not a single one of them may have demonstrated why more than 24-bit FP is required... but that's due to the above paragraph, not due to any lack of knowledge among programmers of what's needed for PC 3D graphics to progress.

Are you so sure? Could it be that FP24 is ample enough for most computations, and that the true limitations have more to do with the source data sets? Yes, they could use float textures, but they also want to ship games in less than a few CDs.

I still think that for all games nowadays, the real limitations are still the source data, and final displayable render targets (well, maybe also 6b LCD displays). While there are many things that could be done with 32b, few of them end up being for games.

The "extra 8b" is really a 30% growth on the shader cores. The multipliers go from 17b to 24b, the storage grows from 24b to 32b, etc... -- It's a 30% growth. For something that I still feel it not required for today's games. When we move to deeper displayable formats, have new compression formats for float, and do a lot of multiple pass rendering, I think then that 32b will be required. Or for other types of things, related to "VPU as CPU" type tasks (fluid dynamics, etc...). Possibly for indirect textures of very large textures to, especially if the data varies a lot (of course, there will be many other aliasing issues before that).

While the future is going to be 32b one day, and perhaps more beyond that, I think for DX9 commercial-level software, in general, it's fine.

Edit: The above is my opinion, and cannot be taken to be ATI's position (it might or might not be).

1) sireric has a right to say that, and that is probably because the R300 is the first DX9 hardware out. The jump from fixed function to floating point is a big one and I agree with this basis of sireric's and Dave's "arguments".
2) I think you (and perhaps many others) misread my comments in that thread -- that's not exactly the kind of discussion (i.e. FP24 good enough for games) I was trying to promote. I was trying to figure out why ATI did not or could not go with 32 bits in the R300, given that the NV30 has it (although we know what this meant in terms of performance for the NV30), given that the R300 really was swell with FP24, and given what we know of the progress of the DX9 specification's development and of the existing IEEE-754 standard. I am not a hardware engineer, so perhaps folks didn't realize this... at the time of that thread, I had no idea how expensive an extra 8 bits is silicon-wise! :)

It's simply cost in this case. Of course we have the technology. We felt (and still feel) that we design to a sweet spot. Don't overdesign just to make a few people happy (or one). Design a balanced system that maximizes performance & quality while minimizing costs. As for NV30 having 32b, well, in a theoretical sense it did support 32b, but it was not usable in most real-world circumstances. To me, that's not a balanced design.

One thing that intrigues me is whether -- during the (perhaps table-banging) discussions that go on with MS and various IHVs in next-version-of-DirectX-shaping meetings -- each IHV's knowledge of, and capability to take advantage of (in a very, very proficient and efficient way), current and soon-to-be-available process technology is brought up to MS. Which IHV has the bigger brains, so to speak :) . I still have absolutely no idea how the specification process of each DX version starts, progresses and ends.

MS listens to all IHVs and ISVs for DX development. But you'll have to ask them how they make their decisions. I'm sure all IHVs lose a little and all win a little. In the case of FP24, I think MS and ATI both realized that this was likely to be a good balance of performance, quality and cost.

[edit] PS. I should also add I have absolutely no idea about the interactivity, relationship and influence of ISVs who are involved in the DX spec brain-storm committee. All I know is that ISVs usually demand more than IHVs can provide :)

That's actually not true. ISVs demand the items that they want, not just random stuff or the moon. They usually know how their next game will work, and have some good ideas about the specifics they want for it. It also tends to match a lot of the papers presented at Siggraph, for example. They are realistic and must ship real games that work on real HW.
 
sireric said:
Are you so sure?
I don't talk to ALL the developers that your company (or any other IHV) does, but the ones that I do talk to are pretty good when it comes to their knowledge of 3D graphics.

I still think that for all games nowadays, the real limitations are still the source data, <snipped> ... but they also want to ship games in less than a few CDs.
Debatable on both counts but your points are good (obviously).

While there are many things that could be done with 32b, few of them end up being for games... <snipped> ...While the future is going to be 32b one day, and more perhaps in the future, I think for DX9 commercial level software, in general, it's fine.
You appear to be as knowledgeable about where software (games) can be going as you are about hardware.

Can you expand on why you think many things could (not should?) be 32b but not for games? So 32b really should be all that is/will ever be required for games?

How much should an IHV try to improve games' 3D quality?

I'm not trying to be "smart" here but your comments appear to be "conclusive" in nature in several aspects.

FP32 will not be enough for games. Your comments appear otherwise, in a declaration-type statement. Unless, of course, you're talking about the necessity for the advancement in hardware (not video cards) on the PC platform before we start talking about advancement of video cards.

Of course, this could just be about priorities. I may decide to make a game where I have a personal conviction that anything less than FP32 just won't do for me (the old "I'm my worst critic" thing). And then the DevRels come in and convince me otherwise. Because I need to sell games, not push the graphics of games!
 
Reverend said:
sireric said:
Are you so sure?
I don't talk to ALL the developers that your company (or any other IHV) does, but the ones that I do talk to are pretty good when it comes to their knowledge of 3D graphics.

I never said that they were not knowledgeable about 3D graphics. I'm saying that the 24b limitations of the pixel shader have not been among the issues developers have been complaining about. If you look at some of the work recently presented at Siggraph, or some of the work ISVs are talking about for the future, there are many other problems that need addressing before this one. Until either the source and destination formats expand or unstable shader code becomes a requirement, I do not think FP24 will generally be a limit. I can't speak for 2 to 5 years down the road, but for now that appears to be true.

You appear to be as knowledgeable about where software (games) can be going as you are about hardware.
Sarcasm noted. Amusing. End of thread for me.

Can you expand on why you think many things could (not should?) be 32b but not for games? So 32b really should be all that is/will ever be required for games?

For applications that do a lot of procedural operations, such as some of the CPU/VPU items (fluid dynamics, linear algebra solutions, etc...) that I've seen, 32b seems required to be useful (in fact, in some cases it's not enough). However, there aren't enough commercial apps in this category to justify this for the mainstream commercial market (might be a chicken-egg thing though).

As for games, most of them have more issues with source data quality than with the stability and precision of intermediate computations. The popular algorithms of today are rather stable mathematically, and the source data precision ends up being more of the issue (i.e. normalizing 8b source vectors doesn't require 24b precision).
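That normalization example is easy to check with a sketch (illustrative numbers; real normal maps store biased [-1,1] values, but the magnitudes work out the same): the 8-bit quantization error dwarfs the rounding error of the normalize itself.

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    const float nx = 0.36f, ny = 0.48f, nz = 0.8f;  /* an exact unit vector */

    /* what survives an 8-bit-per-channel normal map */
    float qx = roundf(nx * 255.0f) / 255.0f;
    float qy = roundf(ny * 255.0f) / 255.0f;
    float qz = roundf(nz * 255.0f) / 255.0f;

    float len = sqrtf(qx*qx + qy*qy + qz*qz);       /* FP32 normalize */
    /* ~1e-3 from the 8b source vs. ~1e-7 from the FP32 arithmetic */
    printf("error after normalize: %g\n", fabsf(qx / len - nx));
    return 0;
}
```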

How much should an IHV try to improve games' 3D quality?

I'm not trying to be "smart" here but your comments appear to be "conclusive" in nature in several aspects.

How much should an IHV try to improve quality? As much as is possible, while making a balanced product for a mainstream target audience. We can make huge chips that cost thousands of dollars (i.e. yield 2 to 4 per wafer) that have every feature you want, in ample precision. That's fine, and I'm sure there's a market. But if you have to design a system that scales from $49 to $500 retail, you do have to decide what is important and what is not, and target a balance that achieves the maximum performance and quality while allowing for the lowest cost. This means studying the current and (near) future algorithms, finding their real bottlenecks, and addressing those. If your source data is 8b and the destination is 8b (FP16 next year), then I believe you'll find that 24b is generally fine.

My comments only appear to be conclusive because that's the way I view things. I might be (and often am) wrong, but that doesn't mean I don't believe my own opinions. I speak only for myself.

FP32 will not be enough for games. Your comments appear otherwise, in a declaration-type statement. Unless, of course, you're talking about the necessity for the advancement in hardware (not video cards) on the PC platform before we start talking about advancement of video cards.

Of course, this could just be about priorities. I may decide to make a game where I have a personal conviction that anything less than FP32 just won't do for me. And then the DevRels come in and convince me otherwise. Because I need to sell games!

Right now, we have simple apps with reasonably simple pixel shaders. We've just broken into being able to do some very cool things. We need to advance the products in a balanced way, where no one aspect gets way ahead of the others (unless mandated). That's my belief. The current set of shaders are rather stable numerically, and the source data is rather low precision. The artifacts and problems are mainly due to data sets at this time. As that improves, then the limitations of FP24 will show up.

You can certainly make an app that requires 32b, but I'm pretty certain that it will have other limitations (i.e. not be real time) or that it could be written in a way to be fine and dandy for 24b. For the first case, that's not really today's target market, and I find it hard to justify the cost of designing to accommodate that market.

My thinking is getting muddled from cold meds. Later.
 
Okay, everybody meet back here mid-2006 and we'll review if we're seeing any FP24 arties on our (DX9) X800s that aren't on our (DX9) R520s in then-current games.
 
I think what we have seen since the launch of DX9 is that full precision (as in FP32 being the highest DX9 precision available) is rarely required in everyday usage and that partial precision will generally suffice (unless there is a need for intensive, iterative calculations).

I would have a lot less trouble with ATi's stance if they had included both FP24 and FP32 in their GPUs, with FP24 being _PP and FP32 being full precision.

ATi's stubborn insistence on only one precision has clouded the real issue, both for consumers and certain developers.

I know somebody will attack this post over nVidia playing up FP32, so I'd encourage people to remember back to the pre-NV30 launch and a .pdf (can't remember the exact one, but I think there is a link to it in the nvnews archives) that features a boot rendered in FP16 and FP32. Even back then nVidia recommended FP16 over FP32 for everyday rendering.
 
ATi's stubborn insistence on only one precision has clouded the real issue, both for consumers and certain developers.

Sorry? What proof of that is there? Developers have to do more work to support multiple precisions...

And what on earth would the point of supporting FP24 and FP32 in an architecture be?
 
We've already had the discussion on how much work is involved in supporting partial precision (a trained monkey could do it blindfolded).

If you are going to bring up the issue of art assets as a defense, then you first need to explain why ATi's Doom3 shader replacement doesn't degrade image quality and why the HL2 DX8 water shaders look so much better.

I thought we agreed FP24 was a strange (though valid) number where _PP is concerned, but, hey, that's the precision ATi has chosen for their chips - what do you want me to do about it?
 