X800 using SM3.0 path in FarCry

But dynamic branching can be performed on R3xx and makes a huge increase in performance. Check Humus demo...
 
Trying to run the new shaders on x800 probably causes some major graphical glitches, missing graphics, or some other anomaly ixbt didn't detect... If a Russian website could do it, obviously CryTek could have done it too.
 
Ruined said:
Trying to run the new shaders on x800 probably causes some major graphical glitches, missing graphics, or some other anomaly ixbt didn't detect... If a Russian website could do it, obviously CryTek could have done it too.

just like forcing PS 2.0 in FC demo on nv3x?
when ppl had to change Vendor ID?.....people _could_ do it, but that doesn't mean Crytek (and a certain company which invested in Crytek via a certain marketing scheme) _wanted_ that.
 
silence said:
just like forcing PS 2.0 in FC demo on nv3x?
when ppl had to change Vendor ID?.....people _could_ do it, but that doesn't mean Crytek (and a certain company which invested in Crytek via a certain marketing scheme) _wanted_ that.

Uh oh, conspiracy theory ;)

Seriously, if it was a "marketing thing," 3dc would not be in FarCry.
 
Ragemare said:
Ruined said:
Seriously, if it was a "marketing thing," 3dc would not be in FarCry.

That's what they want you to think.

The truth is out there !1

BTW, Dave, where did you hear 3dc was cancelled? In the last Crytek interview they said 3dc would be enabled in 1.3. Plus, it looks like some prelim 3dc stuff made it into 1.2.
 
digitalwanderer said:
CJ said:
digitalwanderer said:
Are you talking on the R3xx?!?!? :oops:

That would be on the R420. Not R3xx. They're talking about SM2.0b and R3xx doesn't support that.
Oh yeah, btw, you're entirely wrong. 8)

You can probably run sm20b shaders on r3xx hardware (if the drivers allow it) as long as they don't exceed the limitations of the hardware. But if, for instance, you have an sm20b shader which is longer than 96 instructions, the r3xx won't be able to run this shader in a single pass, which would also mean sm20b is not fully supported.
 
RejZoR said:
But dynamic branching can be performed on R3xx and makes a huge increase in performance. Check Humus demo...

So that means it's not an SM3.0-exclusive feature, after all?....;) I wish people would be a bit more discerning. To say that "dynamic branching makes a huge increase in performance" is the same as to say that "pixel shading makes a huge increase in performance," or "multi-texturing makes a huge increase in performance," or any number of generic terms you might similarly describe.

These things are only true if the hardware implementation is good, the drivers are good in supporting that hardware, and the software engine does a good job in supporting the feature. As well, it's always comparative, too, as in one man's "dynamic branching" program is another man's "dynamic branching" nightmare...;) There are just a lot of variables that go into determining when a feature is nice in one implementation and when it stinks in another, and it is anything but cut & dried.
 
tEd said:
you can probably run sm20b shaders on r3xx hardware(if the drivers allow it)
Not yet, but soon...go ahead and quote me on that. ;)

as long as they don't exceed the limitations of the hardware, but for instance if you have an sm20b shader which is longer than 96 instructions, the r3xx won't be able to run this shader in a single pass, which would also mean sm20b is not fully supported
Uhm, did anyone else just hear a loud "WHOOOOOOSH!"-ing noise go overhead? :|
 
WaltC said:
RejZoR said:
But dynamic branching can be performed on R3xx and makes a huge increase in performance. Check Humus demo...

So that means it's not an SM3.0-exclusive feature, after all?....;) I wish people would be a bit more discerning

I also wish people would be more discerning and stop calling Humus' demo an example of dynamic branching. It's an alternative to dynamic branching that reduces to accept-reject stages. You can use multiple passes to emulate multiple branch paths, but the overhead of the stencil-rejection trick starts to build quickly as you increase the number of passes. I'd say that stencil tests more closely resemble CMOV and other predicated instructions than they resemble dynamic branching. The destination 'address' is arbitrary when using dynamic branching. When using stencil-rejection, the 'address' is static -- you always branch to the shader exit point if the stencil reject condition is met.
 
Whether to call it dynamic branching or not is open for debate, but since it's semantically the same thing, I think it's valid to call it that. I don't see however how you can claim it to be comparable to CMOV. In fact, it moves away from such a model to a model where only one path is executed.
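The disagreement above can be made concrete with a toy model. The sketch below is plain Python, not GPU code, and every name in it is hypothetical: it only counts how many times a "shader" function runs per pixel under predication (CMOV-style, both paths computed), true dynamic branching, and a stencil-rejection scheme with one mask pass plus one pass per path. It ignores real hardware effects such as branch granularity and per-pass state and geometry overhead.

```python
def expensive(p):
    """Stand-in for the costly lighting path."""
    return p * p + 1.0

def cheap(p):
    """Stand-in for the trivial path (e.g. pixel outside the light's range)."""
    return 0.0

def predicated(pixels, cond):
    """CMOV-style: both paths are evaluated, then one result is selected."""
    out, evals = [], 0
    for p in pixels:
        a, b = expensive(p), cheap(p)
        evals += 2
        out.append(a if cond(p) else b)
    return out, evals

def dynamic_branch(pixels, cond):
    """Per-pixel branch inside a single pass: only one path is evaluated."""
    out, evals = [], 0
    for p in pixels:
        out.append(expensive(p) if cond(p) else cheap(p))
        evals += 1
    return out, evals

def stencil_rejection(pixels, cond):
    """One pass writes the stencil mask, then one pass per path.

    In each later pass a pixel either runs that pass's shader or is
    rejected before the shader starts; it never jumps elsewhere.
    """
    stencil = [cond(p) for p in pixels]   # mask pass
    passes = 1
    out, evals = [None] * len(pixels), 0
    for wanted, shader in ((True, expensive), (False, cheap)):
        passes += 1
        for i, p in enumerate(pixels):
            if stencil[i] != wanted:
                continue                  # rejected: shader is skipped
            out[i] = shader(p)
            evals += 1
    return out, evals, passes

pixels = [0.0, 1.0, 2.0, 3.0]
in_range = lambda p: p >= 2.0
print(predicated(pixels, in_range))
print(dynamic_branch(pixels, in_range))
print(stencil_rejection(pixels, in_range))
```

In this model the stencil scheme matches dynamic branching on shader evaluations (one path per pixel, as Humus argues) but pays for extra passes, and the pass count grows with the number of paths, which is the overhead the previous post describes.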
 
http://www.driverheaven.net/#article_15593

Farcry 1.2 - the withdrawn patch
Posted on Friday, July 23, 2004
at 11:49 PM by Zardon - 31 Comments


Let's face it, it's been a hell of a rough month for Crytek: not only was their much vaunted 1.2 patch severely delayed, they finally released the patch only to withdraw it today. We had planned a look at this patch on ATI hardware with new beta drivers, but unfortunately it seems rather pointless presenting an in-depth article detailing frames-per-second differences at various resolutions when the patch has been removed from public consumption. Nonetheless, we have seen this is a hot topic of conversation on our forums, so Stuart and I feel we should post something detailing a little of the apparent mysteries involved with this patch and forthcoming ATI drivers.

This is the official statement today on the withdrawal of Patch 1.2, "Far Cry patch 1.2 has shown unexpected behaviour on specific hardware configurations. These matters are mainly due to incompatibilities with several optimisations brought lately to the code, with the intent to please a large number of users.
We're currently asking CRYTEK to work on delivering a new patch as soon as possible. Until then we have decided to remove the patch 1.2 from the official UbiSoft websites."

We will be very interested to see what will have changed when it reappears. Remember the benchmark disappearing from Tomb Raider: Angel of Darkness?

A few weeks ago we received an advance copy of FarCry 1.2 from Nvidia. This build of 1.2 was sent out as it added Shader Model 3.0 support to enhance performance (but not image quality at this time) on the 6800 series graphics cards and we detailed the performance impact of the changes in a previous article.

Let's have a look at the patch that "never quite was". To get the full benefits of the patch you would have required an X800, X600 or X300 class graphics card, DirectX 9.0b, FarCry 1.2 and Catalyst build 8.041 or above. If you have these you’ll get the full feature set: SM 2.0b and Geometry Instancing.

The 2nd option is to have a R3xx or above graphics card (9500,9600,9700,9800) DirectX 9.0b, FarCry 1.2 and Catalyst 8.041. With this setup Geometry Instancing support is possible.

That's right people, the NV40/SM3.0 isn't the only card/Shader Model that can provide instancing. Any of ATI’s DX9 hardware supports this feature, in the words of some ATI employees (off the record), "even using DirectX9.0b".

So what does this mean?

Instancing:
Let's look at the feature common to all ATI cards first. Geometry Instancing (/Vertex Instancing) allows for more detailed graphics. A specific example in Farcry is that distant vegetation is no longer rendered as sprites; the renderer switches to more detailed, animated vegetation. (Effectively, objects like grass or trees in the distance can now be animated like the vegetation near your character rather than being static 2D sprites.) Removing the 2D sprites and increasing the geometry level lowers overall performance; Geometry Instancing reduces that cost, making the higher detail setting a better option for RADEON X800 users.

To see the real benefits of instancing, you can manipulate the amount of geometry on the screen with Far Cry’s e_vegetation_sprites_distance_ratio parameter. This engine setting controls the point at which truly rendered (using geometry) vegetation is replaced by 2D sprites (billboards). The default value is “1”. Substituting a higher value pushes the threshold at which this substitution occurs to a distance further from the viewpoint.

Type the following in the command console: \e_vegetation_sprites_distance_ratio 100

This will eliminate the usage of 2D sprites as vegetation, and draw full animated vegetation throughout the scene
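As a rough illustration of how such a ratio could act on the geometry/sprite switch, here is a minimal Python sketch. The multiplication of a base distance by the ratio, and the base distance of 50 units, are assumptions made for the example; the article does not say how the engine actually computes the threshold.

```python
def vegetation_mode(distance, ratio=1.0, base_sprite_distance=50.0):
    """Toy model: decide how a plant `distance` units away is drawn.

    `base_sprite_distance` is a made-up value; `ratio` plays the role of
    e_vegetation_sprites_distance_ratio.
    """
    threshold = base_sprite_distance * ratio
    return "geometry" if distance < threshold else "sprite"

for ratio in (1.0, 100.0):
    modes = [vegetation_mode(d, ratio) for d in (10.0, 200.0, 2000.0)]
    print(f"ratio {ratio}: {modes}")
```

With the default ratio only the nearest plant is drawn as geometry; with a ratio of 100 the threshold moves far enough out that every plant in the example is drawn as geometry.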



Shader Model 2.0b path:
The main benefit of the Shader Model 2.0b path in FarCry is that you have much improved one pass lighting support. As this improves the performance of indoor areas (due to heavy use of lighting) the SM 2.0b path gives benefit in many levels within FarCry. Additionally the 2.0b path gives added support for other complex shaders throughout the game, again boosting performance.

It should also be noted that the support for the above features is not unique to FarCry, any game developer using DX9 will be able to use the above features on the ATI Radeon.

We have been playing with these new features today and it's relatively simple to enable them.

Firstly open the Farcry Console using the ` key.
For SM2.0b path type (without quotes) “\r_sm2bpath 1” and hit return.
For Instancing support type (again without quotes) “\r_GeomInstancing 1”.
Of course, if you're enabling Instancing you’re going to want to bump up the vegetation detail. To do this, change your Farcry settings from (e_vegetation_sprites_distance_ratio,1.000000) to (e_vegetation_sprites_distance_ratio,100.0000). This change removes sprite vegetation and replaces it with animated vegetation throughout the scene.

So to recap:

The major performance increase is due to improvements in the lighting shader. The new patch enables the effect of three lights to be calculated in one pass (previously, each light required its own pass). It is possible, with a bit of extra work, for four lights to be calculated in a single pass on ATI hardware, but Crytek has not made the changes that would enable this. This is the major performance increase on both ATI and Nvidia hardware.
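The pass-count arithmetic behind that claim can be sketched in a few lines of Python. This counts lighting passes only and assumes lights are simply packed into passes in groups; the function name and the example light counts are illustrative, not taken from Crytek's renderer.

```python
import math

def lighting_passes(num_lights, lights_per_pass):
    """Number of lighting passes needed if each pass handles up to
    `lights_per_pass` lights (simple packing, an assumption)."""
    if num_lights <= 0:
        return 0
    return math.ceil(num_lights / lights_per_pass)

for lights in (3, 4, 6):
    print(lights, "lights:",
          lighting_passes(lights, 1), "passes at one light per pass,",
          lighting_passes(lights, 3), "at three per pass,",
          lighting_passes(lights, 4), "at four per pass")
```

Under this model a surface lit by three lights drops from three passes to one, while a fourth light would force a second pass unless four lights per pass were supported.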

The second performance increase is through the use of instancing. Instancing is a technique by which large numbers of identical objects can be grouped together and processed as a batch, greatly reducing the load on the CPU. Uses in FarCry include trees and grass. In the original version of the game, only trees and clumps of grass close to the player were rendered as geometry (that is, as a collection of triangles with textures painted across them). Further from the viewer they were replaced by "billboards" - a flat image of a tree that always faces the viewer. This increases frame rates, but at some cost to realism. Instancing allows the distance at which geometry must be replaced by billboards to be increased significantly (to the extent that essentially nothing the viewer sees is a billboard) with only a very small hit to performance.
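The CPU-side saving described above comes from issuing far fewer draw calls. The Python sketch below only counts calls in a toy scene; the scene contents, names, and the idea that one batch equals one draw call are simplifying assumptions, and no real graphics API is involved.

```python
from collections import defaultdict

def draw_calls_without_instancing(objects):
    """One draw call per object."""
    return len(objects)

def batch_by_mesh(objects):
    """Group per-object data (here just a position) by the mesh it uses."""
    batches = defaultdict(list)
    for mesh, position in objects:
        batches[mesh].append(position)
    return dict(batches)

def draw_calls_with_instancing(objects):
    """One draw call per distinct mesh; positions travel as per-instance data."""
    return len(batch_by_mesh(objects))

# Hypothetical scene: 500 identical trees and 2000 identical grass clumps.
scene = ([("tree", (float(x), 0.0)) for x in range(500)]
         + [("grass", (float(x), 1.0)) for x in range(2000)])

print(draw_calls_without_instancing(scene))  # one call per object
print(draw_calls_with_instancing(scene))     # one call per mesh type
```

The GPU still has to draw every triangle, so this saving is on the CPU and driver side, which is consistent with the article's remark that the load on the CPU is what instancing reduces.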

SM3.0 includes instancing, a feature Nvidia has promoted heavily, saying that it proves the superiority of their hardware. While this is certainly a major talking and selling point, all X800 cards (and even the 9500, 9600, 9700 and 9800) are capable of instancing. We have a beta driver in our possession which reflects this and the associated performance gains, and it will be in the publicly posted CATALYST 4.8. We don't feel we should post our detailed findings until a final patch is available from Crytek, as there is every possibility this could change between now and the next release.

We can give you rough indications of percentage increases - on the research map with Far Cry v1.2 \r_sm2bpath 1 and HLSL compiler profile 2.0b we saw increases of 13% at 1024x768, 21% at 1280x1024 and 25% at 1600x1200. Nothing to be sniffed at.
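To put those percentages in terms of time per frame, a frame rate gain of g% corresponds to a frame time of 1/(1 + g/100) of the original. The short Python sketch below just does that arithmetic on the three figures quoted above; no absolute frame rates are assumed, since the article gives none.

```python
def frame_time_saving(fps_gain_percent):
    """Fraction of per-frame time saved for a given percentage gain in fps."""
    return 1.0 - 1.0 / (1.0 + fps_gain_percent / 100.0)

reported = {"1024x768": 13, "1280x1024": 21, "1600x1200": 25}
for resolution, gain in reported.items():
    saving = frame_time_saving(gain) * 100
    print(f"{resolution}: +{gain}% fps is about {saving:.1f}% less time per frame")
```

The saving grows with resolution, which would fit a gain that comes mostly from per-pixel shader work, though the article does not break the numbers down that way.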

So what about the juicy gossip? Well, to say today has been a dramatic day is an understatement. During our testing period over the last 12 hours or so we have spent a lot of time on the phone and in email with many people in the industry. One such person even stated this:

"Nvidia encouraged Crytek to use partial precision in the shaders wherever possible. This means that many of the shaders will run in 16-bit precision, not the 32-bit precision (a requirement of SM3.0) that they are touting to the press. Ironically, while promoting the 32-bit precision of SM3.0 as a "must have", Nvidia is asking developers to use less precision than was available in SM1.0 - that is, all the way back in DirectX8! Even more ironically, ATI hardware will run most of the FarCry shaders more accurately (ATI hardware runs all shaders in 24-bit precision). Microsoft, the owner of DirectX, defines 24-bit and greater as "full precision" and 16-bit as "partial precision", Nvidia has claimed that ATI paid Crytek to delay the patch and include ATI features (the figure mentioned was $500k!)."

So there you have it, some of the facts, and some rumours.... nonetheless Crytek has committed to delivering 3Dc support in patch v1.3, which will provide higher detail images, or free up memory with better compression than traditional normal map compression options. FarCry is a great example of a next generation title that uses normal maps extensively. When the "final" patch arrives, we will be giving it a good going over and posting in-depth results.
 
Partial precision is not bad. In Far Cry there's no difference between FP16 and FP24. I will admit, if there's a difference in IQ in future games then FP24 over FP16 will make more of a difference than FP32 over FP24.
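For a feel of the step sizes involved, the Python sketch below rounds a value to the number of mantissa bits usually quoted for each format: 10 for FP16, 16 for ATI's FP24, 23 for FP32. It models mantissa rounding only and ignores exponent range, denormals, and how any particular GPU actually rounds, so treat it as an illustration rather than a hardware model.

```python
import math

# Stored mantissa bits (excluding the implicit leading bit), as commonly quoted.
MANTISSA_BITS = {"FP16": 10, "FP24": 16, "FP32": 23}

def quantize(x, mantissa_bits):
    """Round x to the nearest value representable with this many mantissa bits."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)               # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)   # +1 for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)

value = 0.1
for name, bits in MANTISSA_BITS.items():
    q = quantize(value, bits)
    print(f"{name}: {q!r}  abs error {abs(q - value):.3e}")
```

Going from FP16 to FP24 removes a much larger absolute error than going from FP24 to FP32, simply because the FP16 error is so much bigger to begin with; whether either difference is visible depends on the shader.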
 