#1
Member
Join Date: Jan 2010
Posts: 375
I disassembled a bunch of vs/ps and am going through them, and found quite a few nonsensical code fragments like this:
rsq r1.w, r5.w // 1/sqrt(), r5.w is saturated [0,1]
rcp r1.w, r1.w // 1/(1/sqrt()), i.e. sqrt()
add r2.w, -r1.w, c0.y
cmp r1.w, -r1.w, c0.y, r2.w // -r1.w >= 0 ? ... : ..., c0.y is 1
This is:
(-sqrt(r5.w) >= 0.0 ? (1 - sqrt(r5.w)) : 1)
(sqrt(r5.w) <= 0.0 ? (1 - sqrt(r5.w)) : 1)
Makes no sense. sqrt can't yield negative results, and the input is non-negative anyway. It breaks down to:
(sqrt(r5.w) == 0.0 ? (1 - 0.0) : 1)
Whut? Some convoluted way to set a register to 1? Did I overlook something?
---------
Got some more (clearly checks for == 0):
((r0.w * r0.w) <= 0.0 ? 1.0 : 0.0)
(-r3.x >= r3.x ? 1.0 : 0.0)
Weirdos:
max(-r6.w, r6.w) // which is abs(r6.w), the compiler didn't know _abs?
(IN.texcoord_0.x <= 0.0 ? (1 - IN.texcoord_0.x) : (IN.texcoord_0.x + 1)) // which is 1 + abs(IN.texcoord_0.x), the compiler didn't know _abs?
I can't believe that breaking _abs into three schedulable ops could possibly be more efficient.
Last edited by Ethatron; 13-Apr-2011 at 17:47.
#2
Member
Join Date: Jan 2010
Posts: 375
Uhm, no comment? Maybe a "Hey that's how it is, no need to search for an error, where there is none." ...
#3
Member
Join Date: Jul 2010
Location: Land of Mu
Posts: 350
If r5.w was 0, you'd have division by 0 in the first line.
#4
Member
Join Date: Apr 2007
Location: Australia
Posts: 645
I'm guessing these are Oblivion shaders.
#5
Member
Join Date: May 2002
Location: Slovenia
Posts: 420
I did some reverse reverse engineering just for fun.
The DX June 2010 SDK always figures out that max(-a, a) should translate to abs(a).
rsq r1.w, r5.w
rcp r1.w, r1.w
add r2.w, -r1.w, c0.y
cmp r1.w, -r1.w, c0.y, r2.w
What this does (with r5.w in [0, 1] and c0.y = 1) is flip your [0, 1] interval around to [1, 0], so you can't really lose that add. The cmp then makes sure that 1 is returned when r5.w == 0. Basically, for x in (0, 1] you end up with f(x) = (1 - sqrt(x)), and for x == 0 you use f(x) = 1.
(IN.texcoord_0.x <= 0.0 ? (1 - IN.texcoord_0.x) : (IN.texcoord_0.x + 1))
This one is a strange one though. You are right, this is just abs(IN.texcoord_0.x) + 1. Obviously the developer told the compiler to do that. I find it really hard to believe that a compiler would figure out such an "optimisation" on its own. But I don't know why it wouldn't take something like this out.
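One way to sanity-check a reading of the four-instruction fragment is to emulate it on the CPU. The sketch below is in Python, not shader code. It assumes `cmp dst, src0, src1, src2` selects `src1` when `src0 >= 0` and `src2` otherwise, and it assumes `rsq(0)` yields +infinity and `rcp(inf)` yields 0; real hardware may differ at that edge. The helper names `rsq`, `rcp`, `cmp` and `fragment` are made up for illustration.

```python
import math

def rsq(x):
    # Reciprocal square root. Assumption: rsq(0) yields +infinity.
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)

def rcp(x):
    # Reciprocal. Assumption: rcp(0) yields +infinity; 1/inf is 0.0.
    return math.inf if x == 0.0 else 1.0 / x

def cmp(src0, src1, src2):
    # Assumed cmp semantics: src0 >= 0 ? src1 : src2
    return src1 if src0 >= 0.0 else src2

def fragment(r5_w, c0_y=1.0):
    r1_w = rsq(r5_w)               # rsq r1.w, r5.w
    r1_w = rcp(r1_w)               # rcp r1.w, r1.w   -> sqrt(r5.w)
    r2_w = -r1_w + c0_y            # add r2.w, -r1.w, c0.y
    r1_w = cmp(-r1_w, c0_y, r2_w)  # cmp r1.w, -r1.w, c0.y, r2.w
    return r1_w

for x in (0.0, 0.25, 1.0):
    print(x, fragment(x))
```

Under these assumptions the input 0.0 takes the `c0.y` branch, and every other input in (0, 1] evaluates to 1 - sqrt(x).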
#6
Member
Join Date: Jan 2010
Posts: 375
Thanks guys for the interest! I finally figured out that I had messed up the ternaries:
cmp r1.w, -r1.w, c0.y, r2.w
This is:
-r1.w >= 0 ? c0.y : r2.w
Somehow I misread the MS Shader Assembly Reference (fuck you MS, your documentation is so bad, can't you just give readable equivalent expressions like AMD does for x86?).
---------------
So
(IN.texcoord_0.x <= 0.0 ? (1 - IN.texcoord_0.x) : (IN.texcoord_0.x + 1))
turns into
(IN.texcoord_0.x <= 0.0 ? (IN.texcoord_0.x + 1) : (1 - IN.texcoord_0.x))
which is 1 - abs(IN.texcoord_0.x).
There are huge numbers of collapsible abs cases in the assembly; in one shader this reduction freed 5 arithmetic instructions (of 70).
---------------
And
(sqrt(r5.w) <= 0.0 ? (1 - sqrt(r5.w)) : 1)
turns into
(sqrt(r5.w) <= 0.0 ? 1 : (1 - sqrt(r5.w)))
Makes total sense.
---------------
@itsmydamnation: Hehe, yep, I got them all HLSLified now, 650 of them, and they are really readable; I had to program a little asm->HLSL reconstructor which does op-reordering and contraction, as well as optimization. Hope you like the awesome water as well. Anyone interested in the shaders can look here: http://codaset.com/ethatron/oblivion...ree/PseudoHLSL
Last edited by Ethatron; 15-Apr-2011 at 11:44. Reason: fuuuuuu
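The corrected operand order and the abs collapse can be checked with a few lines of Python. This is only a sketch: `cmp` models the assumed semantics `src0 >= 0 ? src1 : src2`, and `one_minus_abs_via_cmp` is a hypothetical arrangement of operands matching the corrected ternary, not an instruction taken from the actual shaders.

```python
def cmp(src0, src1, src2):
    # Assumed cmp semantics: src0 >= 0 ? src1 : src2
    return src1 if src0 >= 0.0 else src2

def one_minus_abs_via_cmp(x):
    # (x <= 0 ? x + 1 : 1 - x), expressed as a cmp on -x
    return cmp(-x, x + 1.0, 1.0 - x)

def one_minus_abs(x):
    # The collapsed form the ternary reduces to
    return 1.0 - abs(x)

for x in (-2.0, -0.5, 0.0, 0.25, 3.0):
    print(x, one_minus_abs_via_cmp(x), one_minus_abs(x))
```

Both columns agree for every sample, which is what lets a reconstructor replace the select with a single abs plus a subtract.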
#7
Member
Join Date: Apr 2007
Location: Australia
Posts: 645
So does that mean that, for example, the Tomrek water shader could somehow either directly replace the existing shader or hook into it, now that you have been able to decode them all?
I'm guessing that if that could be done, it would fix the water height issues etc.?
#8
Member
Join Date: Jan 2010
Posts: 375
The plan is to substitute the built-in shaders; they are perfectly embedded into their renderpass groups.
Until now all those modifications (DoF, bokeh, all of them) were deferred post-process passes on the screen rectangle; you can only go so far with that. The reason was simply that there was no other way. Now that we've ripped the engine open with its bare organs visible, one can expect somewhat less brutal modifications. So with the shaders decrypted, water, for example, will be as before (with waves) and yet not quite (fresnel, foam, and whatnot).
Last edited by Ethatron; 17-Apr-2011 at 13:58. Reason: inglish
#9
Member
Join Date: Apr 2007
Location: Australia
Posts: 645
Looking forward to it. The question is whether I need to go Xfire and get a higher-clocked, higher-IPC CPU to get a playable Oblivion frame rate; right now, with REAVWD enabled, I'm already at that point.
#10
Member
Join Date: Jan 2010
Posts: 375
Of course there is profiling to be done to get some real numbers, but I don't expect GPU saturation any time soon, not like with the post-effects. Stacking them can easily lead to 10-15 additional "passes" on the screen rectangle. But that's more because they are kept separate for flexibility and low complexity, not because it really has to be that way. Quite a few of them also do wide-tap filtering, so maybe it's more of a bandwidth problem than an arithmetic problem. The post-effects are not what I'm sticking my nose into. I'm more interested in delivering the possibility of working on the shaders of the core pipeline.
#11
Member
Join Date: Apr 2007
Location: Australia
Posts: 645
So which current post-process effects do you think you will bring into the pipeline? Godrays, water? Does this help shademe with his shadowing system?