#1
Registered
Join Date: Jul 2010
Posts: 2
Hello,
I'd like to state up front that I'm quite a newbie in GPU technology and may be overlooking lots of stuff, using the wrong vocabulary, or mistaking the purpose of this forum. Sorry in advance, and don't hesitate to correct me.

I'm now writing vertex programs in Cg for an RSX (PS3) profile, although I believe my question is not strictly PS3 related. My VP has:

    out float4 out_pos : POSITION0

In my context I have three float2 values named sr2, sg2, and sb2. I was surprised to find that when passing the shader through NvShaderPerf, these two blocks resulted in different output:

    // <some code>
    #if 1
    // Total shader is 36 cycles
    out_pos.xy += sr2;
    out_pos.xy += sg2;
    out_pos.xy += sb2;
    #else
    // Total shader is 33 cycles
    out_pos.xy += sr2 + sg2 + sb2;
    #endif
    // <some code>

Now I know the shader is globally optimized, but regardless of the other contents of the code, I cannot tell why those two blocks are any different in the first place. I thought that shader compilers optimized quite aggressively and would naturally produce the same output for this case.

Can you explain it, or direct me to information that would help me understand this phenomenon and perhaps how shader units work in more detail?

Thank you.
#2
Monochrome wench
The difference comes down to floating-point maths: the two statements are not actually equivalent, because floating-point addition is not associative. If out_pos.xy is a big number and the other values are small, the final result may differ depending on which version of the code is used.
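To make the point above concrete, here is a minimal sketch in Python rather than Cg, so that it can be run anywhere. It emulates single-precision arithmetic by rounding every intermediate result to binary32 with the standard `struct` module. The helper names (`f32`, `add_stepwise`, `add_grouped`) and the input values are made up for illustration, and it says nothing about how the RSX or the Cg compiler actually schedules these operations; it only shows that the two source forms can legitimately give different answers.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def add_stepwise(pos: float, a: float, b: float, c: float) -> float:
    """Emulates: pos += a; pos += b; pos += c; (rounded after every add)."""
    pos = f32(pos + a)
    pos = f32(pos + b)
    pos = f32(pos + c)
    return pos


def add_grouped(pos: float, a: float, b: float, c: float) -> float:
    """Emulates: pos += a + b + c; (offsets summed first, then added once)."""
    offsets = f32(f32(a + b) + c)
    return f32(pos + offsets)


# Hypothetical inputs: a large coordinate and three small offsets.
# At 2**24 the spacing between adjacent binary32 values is 2.0.
big = 16777216.0  # 2**24
small = 0.5

stepwise = add_stepwise(big, small, small, small)
grouped = add_grouped(big, small, small, small)

print(stepwise)  # 16777216.0: each 0.5 is too small to survive rounding
print(grouped)   # 16777218.0: the combined 1.5 is large enough to round up
```

In the stepwise form each small offset is lost on its own, while in the grouped form the offsets accumulate first and then move the result. A compiler that rewrote one form into the other would change the computed position, which is presumably why it keeps them distinct.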
#3
Registered
Join Date: Jul 2010
Posts: 2
I think I had the misguided impression that Cg compilers were more lax about floating-point accuracy artefacts than, say, C++ compilers, but now that I think about it, there is no reason for that to be the case.
Thanks for your answer! (Note: I tried Cg compiler parameters such as --fastmath and --fastprecision, but they didn't change anything in this specific case.)
#4
Member
Join Date: May 2002
Location: Santa Clara
Posts: 578
Quote:
Vertex position is bound to be very twitchy: small differences in calculations can generate large differences in the output image. Surfaces that are coplanar, or close to coplanar, will very quickly show visible problems if the math operations generating their locations are not identical. I think that under these circumstances compilers generally try to guarantee that such errors are, as far as possible, solely the fault of the developer rather than the compiler, regardless of which optimization mode you select...