14-Jul-2010, 09:12   #1
stereosoba
Registered
 
Join Date: Jul 2010
Posts: 2
CG compiler optimizer and commutative addition

Hello,

I'd like to state up front that I'm quite a newbie in GPU technology and may be overlooking lots of stuff, using the wrong vocabulary, or mistaking the purpose of this forum. Sorry in advance, and don't hesitate to correct me.

I'm now writing vertex programs in Cg for an RSX (PS3) profile, although I believe my question is not strictly PS3-related.

My VP has:
out float4 out_pos : POSITION0

And in my context I have 3 float2 values named sr2, sg2, sb2. I was surprised to find that, when passing the shader through NvShaderPerf, the following two blocks produced different output:

// <some code>
#if 1
// Total shader is 36 cycles
out_pos.xy += sr2;
out_pos.xy += sg2;
out_pos.xy += sb2;
#else
// Total shader is 33 cycles
out_pos.xy += sr2 + sg2 + sb2;
#endif
// <some code>

Now I know the shader is optimized globally, but regardless of the other contents of the code, I cannot tell why these two blocks would be any different in the first place. I thought that shader compilers optimized quite aggressively and would naturally produce the same output in this case.

Can you explain it, or direct me to information that would help me understand this phenomenon and perhaps how shader units work in more detail?

Thank you.
14-Jul-2010, 14:08   #2
Colourless
Monochrome wench
 
Join Date: Feb 2002
Location: Somewhere in outback South Australia
Posts: 1,132

The difference comes down to floating-point maths: the two statements are not actually equivalent. Floating-point addition is not associative, so if out_pos.xy is a big number and the other values are small, the final result may differ depending on which form is used.
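
As a rough illustration of that point, here is a minimal sketch in Python that emulates single-precision addition by rounding every intermediate result to float32. The helper names (f32, add32) and the input values are assumptions chosen to make the effect obvious; they are not taken from the original shader, and real GPU hardware may round differently.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add32(a, b):
    """Emulated single-precision add: add, then round the result to float32."""
    return f32(a + b)

# Hypothetical inputs: one large coordinate and three small offsets.
pos = f32(16777216.0)      # 2**24; neighbouring float32 values are 2 apart here
sr = sg = sb = f32(1.0)

# Mirrors: out_pos += sr2; out_pos += sg2; out_pos += sb2;
sequential = add32(add32(add32(pos, sr), sg), sb)

# Mirrors: out_pos += sr2 + sg2 + sb2;
grouped = add32(pos, add32(add32(sr, sg), sb))

print(sequential, grouped)
```

With these inputs each +1 in the sequential form is rounded away, so it stays at 16777216.0, while the grouped form adds 3 in one step and rounds to 16777220.0. A compiler that merged the two forms would therefore change the result.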
__________________
-Colourless

D3D FSAA Viewer 5.4
Words by Cat - Truely Intelligent Viewing
14-Jul-2010, 16:13   #3
stereosoba
Registered
 
Join Date: Jul 2010
Posts: 2

I think I had the misguided impression that Cg compilers were more lax about floating-point accuracy artefacts than, say, C++ compilers, but now that I think about it there is no reason for that to be the case.

Thanks for your answer!

(Note: I tried CG compiler parameters such as --fastmath or --fastprecision but it didn't change anything in that specific case.)
14-Jul-2010, 17:18   #4
andypski
Member
 
Join Date: May 2002
Location: Santa Clara
Posts: 578

Quote:
Originally Posted by stereosoba
I think I had the misguided impression that Cg compilers were more lax about floating-point accuracy artefacts than, say, C++ compilers, but now that I think about it there is no reason for that to be the case.

Thanks for your answer!

(Note: I tried CG compiler parameters such as --fastmath or --fastprecision but it didn't change anything in that specific case.)

I think you can generally expect any shader compiler to be very conservative about floating-point optimizations on calculations that directly contribute to vertex position.

Vertex position is very sensitive: small differences in the calculations can produce large differences in the output image. Surfaces that are co-planar, or close to it, quickly show visible problems if the math operations generating their positions are not identical. I think that in these circumstances compilers generally try to guarantee that any such errors are, as far as possible, the fault of the developer rather than the compiler, regardless of which optimization mode you select...
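
To make the co-planar case concrete, here is a small sketch, again in Python with float32 rounding emulated. It assumes two hypothetical render passes that compute the same depth value with the offsets grouped differently; the function names and inputs are invented for illustration and do not model any particular GPU or depth-buffer format.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def depth_sequential(base, d1, d2):
    """Pass A (hypothetical): apply the two offsets one after the other."""
    return f32(f32(base + d1) + d2)

def depth_grouped(base, d1, d2):
    """Pass B (hypothetical): sum the offsets first, then apply them."""
    return f32(base + f32(d1 + d2))

base = 1.0
eps = 2.0 ** -24    # half the spacing between float32 values just above 1.0

pass_a = depth_sequential(base, eps, eps)
pass_b = depth_grouped(base, eps, eps)
pass_a_again = depth_sequential(base, eps, eps)

# An "equal depth" comparison between the two passes fails...
depths_match_across_passes = (pass_a == pass_b)
# ...while repeating exactly the same operations gives an identical result.
depths_match_same_code = (pass_a == pass_a_again)

print(depths_match_across_passes, depths_match_same_code)
```

The two passes are algebraically identical, yet they land on different float32 values, so geometry meant to coincide would no longer match exactly. Only repeating exactly the same sequence of operations yields identical results, which is presumably why a compiler avoids regrouping position math on its own.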