X800 using SM3.0 path in FarCry

Don't you mean the SM2.0b path, which can theoretically do the lighting model of the SM3.0 path?
 
Hmmm, I have no idea what it says (not sure why you are saying read the article, might as well throw in learn Russian while you are at it).

But the images for the X800 definitely show something that says it is using PS3.0 only (must be ATI's special Russian version ;))
 
Disclaimer: Most of this article was written before the release of
the final version of the patch. Toward the end of the article you may
find additional information about the changes made between the version
we originally obtained and the version that was officially released.

After the first part of the article appeared, many opinions followed,
both enthusiastic and skeptical. The enthusiasts expected to see
FarCry rendering in the new mode (the so-called sm30path) on all video
cards that support pixel shaders of the second version. The skeptics
spoke of the complete impossibility of this. Now, after experimenting
for some time with different modifications of FarCry 1.2 that run on
video cards supporting only second-version shaders, I can say that
both sides were partly right.

Without further delay, I confirm that I have in hand a version of
FarCry 1.2 that runs in the so-called "sm30path" mode (more precisely,
only the part of it that concerns pixel shaders) on video cards
supporting pixel shaders 2.x, and that produces an image identical to
the one obtained on NV40. From here on we will often call such shaders
"long".

It should be noted that geometry instancing (GeometryInstancing) has
practically no influence on the final result in our tests. This does
not mean the function is useless, but it shows its full benefit only
when the rendering bottleneck is the CPU, which cannot keep up with
setting the many extra parameters needed to draw the scene. That does
not happen in standard FarCry testing on NV40, partly because FarCry
was originally optimized for existing video cards. The situation may
change later. For example, on video cards that support geometry
instancing, developers will be able to use the freed resources to draw
real vegetation geometry at large distances from the player instead of
replacing it with sprites (impostors).

Even while the first part of the article was being prepared, it became
clear that in theory the sm30path mode could make do with
second-version shaders instead of third-version ones. So an attempt
was made to force FarCry to use the rendering mode designed for "long"
third-version shaders on video cards that support only 2.x shaders.

The overall step-by-step list of the work done looks roughly like
this:

1) FarCry was "deceived" into using a rendering mode analogous to the
one used on NV40.
2) During rendering, equivalent second-version shaders were used
instead of the third-version shaders.
3) The resulting images were checked for identity.

The first stage is solved relatively simply. With the help of the
3DAnalyze utility, already mentioned in the previous part of the
article, the VendorID/DeviceID combination that identifies the
manufacturer and the specific product was substituted, and flags were
set indicating that the current video card supports third-version
shaders. As a result, FarCry launched on a system with any video card
takes it for an NV40 and allows several light sources to be rendered
in one pass with the help of "long" shaders.
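
A minimal sketch of the idea behind that first stage, not of how 3DAnalyze or FarCry actually work. The vendor IDs are the well-known PCI values; the device IDs, the `AdapterInfo` type, and the `game_selects_sm30_path` check are hypothetical stand-ins invented for illustration.

```python
from dataclasses import dataclass, replace

# Well-known PCI vendor IDs.
NVIDIA_VENDOR_ID = 0x10DE
ATI_VENDOR_ID = 0x1002

# Illustrative placeholders only; not verified device IDs.
HYPOTHETICAL_NV40_DEVICE_ID = 0x0040
HYPOTHETICAL_R420_DEVICE_ID = 0x4A49


@dataclass(frozen=True)
class AdapterInfo:
    """What a game might read when it queries the display adapter."""
    vendor_id: int
    device_id: int
    pixel_shader_version: tuple  # (major, minor)


def spoof_as_nv40(real: AdapterInfo) -> AdapterInfo:
    """Return the identity an interposing layer would report instead."""
    return replace(
        real,
        vendor_id=NVIDIA_VENDOR_ID,
        device_id=HYPOTHETICAL_NV40_DEVICE_ID,
        pixel_shader_version=(3, 0),
    )


def game_selects_sm30_path(adapter: AdapterInfo) -> bool:
    """Hypothetical stand-in for the game's check; the real logic is unknown."""
    return (adapter.vendor_id == NVIDIA_VENDOR_ID
            and adapter.pixel_shader_version >= (3, 0))


real_card = AdapterInfo(ATI_VENDOR_ID, HYPOTHETICAL_R420_DEVICE_ID, (2, 0))
print(game_selects_sm30_path(real_card))                 # real identity
print(game_selects_sm30_path(spoof_as_nv40(real_card)))  # spoofed identity
```

The point is only that the game decides from reported values, so an interposing layer that changes those values changes the path the game picks.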

The second stage, replacing the third-version shaders with analogous
second-version shaders, we managed to carry out partly by using the
FarCry engine itself as a helper. The fact is that FarCry can
recompile its own shaders, written in HLSL/Cg, and it does this using
an external compiler from Microsoft: fxc.exe. So, to achieve our goal,
this compiler was replaced with another program that performs all the
actions required of the original compiler while at the same time
making the necessary corrections to the shaders and gathering
statistics on them.
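
A rough sketch of what the argument-rewriting part of such a replacement compiler could look like. It assumes the target profile is passed as a separate argument after fxc's `/T` option; how FarCry really invokes fxc.exe is not stated in the article, and `rewrite_fxc_args` is a hypothetical name. The profile mapping follows the article: 3.0 vertex shaders compile as 2.0, and 3.0 pixel shaders were compiled with ps_2_b.

```python
from collections import Counter

# Downgrade map taken from the article's description.
PROFILE_MAP = {"ps_3_0": "ps_2_b", "vs_3_0": "vs_2_0"}


def rewrite_fxc_args(args, stats):
    """Return a copy of args with the profile after /T downgraded.

    Assumption: the profile is a separate argument following "/T".
    Each (requested, substituted) profile pair is tallied in stats.
    """
    out = list(args)
    for i, arg in enumerate(out[:-1]):
        if arg.upper() == "/T":
            requested = out[i + 1]
            out[i + 1] = PROFILE_MAP.get(requested.lower(), requested)
            stats[(requested, out[i + 1])] += 1
    return out


stats = Counter()
new_args = rewrite_fxc_args(
    ["/T", "ps_3_0", "/E", "main", "/Fo", "out.pso", "shader.hlsl"], stats
)
print(new_args)
print(dict(stats))
```

A real wrapper would then pass `new_args` on to the original compiler (for example with `subprocess.run`) and return its exit code and output files unchanged, so the game cannot tell the difference.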

Because of the limited time for work and verification, the results given below are based on data from the 4 levels used in the demos supplied by NVIDIA together with the FarCry 1.2 patch, and on the results of running those same demos: Research, Regulator, Training and Volcano.

I will first give general statistics. In all, to render the four levels listed above FarCry prepares about 3500 pixel shaders and 3500 vertex shaders of all available versions, from 1.1 to 3.0. The lion's share of them are permutations of 3.0 shaders for a set of different input parameter values. All 3.0 vertex shaders compile into 2.0 shaders without problems. Compiling the 3.0 pixel shaders as second-version shaders gives 1700 2.0 shaders and 1900 2.x shaders (the ps_2_b profile for ATI R420 chips was used, but all of these shaders can certainly also be compiled with the ps_2_a profile).

Evidently more than half of the shaders impose requirements that exceed base version 2.0. But compiling a shader does not mean it is actually used in a real game situation. For example, when running the NVIDIA demos, one 2.x shader each is used on the Regulator and Training levels, and not a single 2.x shader on the Research and Volcano levels. Meanwhile, the number of base 2.0 shaders used in each of these demos fluctuates around 70.

Thus we smoothly arrive at the third part of our study: comparing the resulting images. This part proved to be the most complex, since after the first two steps above were completed it turned out that screenshots obtained with the "long" shaders often do not match those obtained with third-version shaders. Moreover, it was not possible to say they were incorrect; they were simply somewhat different.

The reason turned out to be the FarCry developers' use of one of the special features of the third-version shader model: 10 interpolation registers, which store floating-point data and are used to pass the results of the vertex shader to the pixel shader. The 1.x-2.x shader specifications provide only 8 such registers of standard precision. The two remaining registers are used to pass color. Their precision only has to be at least 8 bits per color channel (standard 32-bit color), and, even more crucially, when passed from the vertex shader to the pixel shader the values of these "color" registers are clamped to the range [0..1]. That is, any value written in the vertex shader that exceeds 1 or is less than 0 will inevitably be distorted on its way to the pixel shader. It is important to note here that this is a requirement of the DirectX specification, so even if a video card that supports version 2.x pixel shaders can interpolate more than 8 registers with high precision, it is obliged to emulate the behavior described above. Above I said that the developers used a special feature of the shaders, the presence of 10 interpolation registers, but in reality they use only 9 of the 10, and only when rendering the lighting from four light sources in one pass (four sources is the maximum number supported by the FarCry engine at the moment).
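
A small sketch of why a value sent through one of the two "color" registers gets distorted. It models only the [0..1] range restriction described above, not interpolation or reduced precision; the function names and the sample vector are made up for illustration.

```python
def pass_through_color_register(value):
    """Model a 1.x-2.x color interpolator: each component is clamped to [0, 1]."""
    return tuple(min(1.0, max(0.0, c)) for c in value)


def pass_through_texcoord_register(value):
    """Model one of the 8 regular interpolators: the range is not restricted."""
    return tuple(value)


# Hypothetical vertex-shader output, e.g. an unnormalized direction vector.
vertex_output = (1.5, -0.25, 0.5)

print(pass_through_texcoord_register(vertex_output))  # (1.5, -0.25, 0.5)
print(pass_through_color_register(vertex_output))     # (1.0, 0.0, 0.5)
```

Only the component already inside [0..1] survives the color register unchanged, which is why a ninth value cannot simply be moved into a color register when a third-version shader is recompiled for 2.x.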

This is a fairly narrow restriction, and had they wished to support second-version shaders, the FarCry developers could simply have limited single-pass rendering to 3 light sources for this class of video cards. But since there was neither the possibility nor the desire to modify the code of the game itself, alternate routes had to be sought.
 
If I understand that bad babelfishing, it basically sounds like they took the SM3.0 shaders and recompiled them to fit the SM2.0x profile, and apparently there were no issues with most of the SM3.0 shaders. Basically they showed that simply recompiling the new shaders gives a speed boost for ATI.

This shows that CryTek was either ignorant of the fact that the code they wrote would work fine recompiled for SM2.0 and better, or knew it and did this intentionally, or was possibly just being lazy.

It really looks like it's a well-written article, if only we knew Russian :p They tried to ensure that quality wasn't lost and everything, but of course it had to be subjective, since the different pathways don't result in exactly the same images.
 
Cryect said:
If I understand that bad babelfishing, it basically sounds like they took the SM3.0 shaders and recompiled them to fit the SM2.0x profile, and apparently there were no issues with most of the SM3.0 shaders. Basically they showed that simply recompiling the new shaders gives a speed boost for ATI.

This shows that CryTek was either ignorant of the fact that the code they wrote would work fine recompiled for SM2.0 and better, or knew it and did this intentionally, or was possibly just being lazy.
Thanks, THAT I can understand. :)

Next question would be: can we take advantage of this yet with the 1.2 patch, or do we need to wait for some new drivers that are set up for this? :|
 
I'm guessing A) they changed the hardware/vendor ID and then did what, umm, I can't remember who did, but made a wrapper like the one for the ATI demo. Umm, Colourless I think, wasn't it?
 
digitalwanderer said:
Cryect said:
If I understand that bad babelfishing, it basically sounds like they took the SM3.0 shaders and recompiled them to fit the SM2.0x profile, and apparently there were no issues with most of the SM3.0 shaders. Basically they showed that simply recompiling the new shaders gives a speed boost for ATI.

This shows that CryTek was either ignorant of the fact that the code they wrote would work fine recompiled for SM2.0 and better, or knew it and did this intentionally, or was possibly just being lazy.
Thanks, THAT I can understand. :)

Next question would be: can we take advantage of this yet with the 1.2 patch, or do we need to wait for some new drivers that are set up for this? :|

I think you can take advantage of some of the new 2.0b shaders included with the 1.2 patch. For some of the features (geometry instancing) you need new drivers.

That's what I get with the 1.2 patch, DX9c, X800 Pro (Cat 4.7):

HDR Rendering: MRT
MRT Rendering: Disabled
Occlusion queries: Supported
Geometry instancing: Not supported
Gamma control: Hardware
Vertex Shaders version 2.0
Pixel Shaders version 2.0
Use Hardware Shaders for ATI R420 GPU
Pixel shaders usage: PS.2.B, PS.2.0 and PS.1.1

Vertex shaders usage: VS.2.0 and VS.1.1
Shadow maps type: Mixed Depth/2D maps
Stencil shadows type: Two sided
Lighting quality: Highest
 
I've already posted an extract from the article here: http://www.beyond3d.com/forum/viewtopic.php?p=329388#329388.

Basically steps are:
1) fool FarCry into thinking it's running on NV40, to enable sm30path
2) fool FarCry when it recompiles its sm30 shaders, and return sm2x shaders to it instead.

On the official 1.2 patch, R420 owners can turn on "\r_sm2bpath 1" (or re-enable the ps2b shaders with "\r_nops2b 0") and get increases comparable to the NV40 sm30 path. This is shown in the last diagram in the article. The other key point is that the most often used "ps30" shaders can actually be compiled as pure ps_2_0 ones. And if we try to use "ps2b" shaders on ps20 hardware, the number of compatible shaders will increase (as in 2b there is no need to calculate fog; it's done in fixed function).
Of course, on current drivers, as DaveBaumann pointed out, you'll get no GeometryInstancing, but the current increase is already worth it.
 