Fetch4 - Important?

boltneck

How important is Fetch-4?

I just ordered an X1800XT.

It's going to arrive tomorrow. I just went from giddy with excitement to having doubts and feeling like I was suckered into buying an Edsel. Is Fetch-4 really going to be used more by developers than SM3 features like dynamic branching? How can this be "SM3 done right" if it's missing a major feature like this?

I had never even heard of Fetch-4 until today. How are you supposed to know what to buy? No one tells you that you need a feature like this, or should even care about it, until it's too late and you find out that your expensive new hardware with "SM3 done right" is totally screwed.

I thought we were at FP32, we have HDR+AA, we have awesome SM3 processing with dynamic branching. Things look great for now. How can Fetch-4 (whatever the heck that even is) come out of seemingly nowhere to be the "be all, end all" of what you must have on your hardware?

This just seems totally insane to me. Someone needs to publish a 3D hardware buyer's forecast every 6 months with all this kind of information in it. What else is the X1800 missing that I am going to find crucially important in a couple more days?
 
Fetch4's getting a high profile now because 3DMark06 uses this non-DX9 feature on the ATI GPUs that support it (RV515 and RV530).

So, the question is, is Fetch4 going to be relevant in the short- to medium-term (6-18 months)? Are there any games that are likely to use Fetch4, as an alternative to hardware-supported percentage closer filtering (which is the technique NVidia cards use - am I correct?)?

Will R520's lack of Fetch4 create a frustrating eye-candy gap for those who've bought one?

Or is Fetch4 headed the way of 3Dc?

Are there uses for Fetch4 beyond its role supporting an alternative to hardware-PCF?

And just for entertainment value, look what the dog brought in:

http://www.ati.com/developer/SIGGRAPH05/ShadingCourse_ATI.pdf

Page 17 onwards, though the whole thing is worth a quick glance.

Jawed
 
I'm looking at Fetch4 right now the same way I was looking at SM3.0 a while ago.

It'd be nice to have it, but I don't see it as needed by anything yet or on the horizon so it's not a deal-breaker for me either way.

Then again, I'm not really sure what it is either yet.... :oops:
 
So Fetch-4 is a non-DX9 feature?

Then why is Futuremark using it in the first place? What's the point of DX9 standards if you are going to apply stuff willy-nilly like this?

Or is this a feature that is just a little ahead of its time and will be in the next release?

Also, why does the rest of the X1000 line have this when R520 does not? Has NVIDIA supported this since before the 7800? If so, why is ATI developing a top-of-the-line chip without it in the first place?

How big of an impact can not having Fetch-4 have on you in future games?
 
It's very useful for PCF filtering in shadow mapping, as it reduces the number of texture fetches; for a 4x4 PCF filter kernel, for instance, it goes from 16 to 4.

Seeing that shadow mapping will show up in more and more games, I would consider it an important feature.

Things like Fetch4, DB, and DST can improve shadow mapping performance quite a bit, especially if used together.

The other question is, of course, whether developers will use it.
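
To put numbers on the 16-versus-4 claim, here is a rough CPU-side sketch in Python (not shader code). It assumes a Fetch4 returns one aligned 2x2 block of single-channel texels per fetch; the real hardware's texel ordering and addressing are glossed over, and every function name here is made up for illustration.

```python
def fetch4(depth_map, x, y):
    """One simulated Fetch4: return the 2x2 block of texels starting at (x, y)."""
    return (depth_map[y][x], depth_map[y][x + 1],
            depth_map[y + 1][x], depth_map[y + 1][x + 1])


def pcf_point_sampled(depth_map, x0, y0, ref_depth):
    """4x4 PCF using one point-sampled fetch per texel (16 fetches)."""
    fetches = 0
    lit = 0
    for dy in range(4):
        for dx in range(4):
            depth = depth_map[y0 + dy][x0 + dx]
            fetches += 1
            lit += ref_depth <= depth  # depth comparison happens per texel
    return lit / 16.0, fetches


def pcf_fetch4(depth_map, x0, y0, ref_depth):
    """Same 4x4 PCF kernel, but gathering 2x2 texels per fetch (4 fetches)."""
    fetches = 0
    lit = 0
    for dy in (0, 2):
        for dx in (0, 2):
            block = fetch4(depth_map, x0 + dx, y0 + dy)
            fetches += 1
            lit += sum(ref_depth <= depth for depth in block)
    return lit / 16.0, fetches


# Hypothetical 8x8 shadow map, purely for demonstration.
shadow_map = [[0.1 * ((x + y) % 10) for x in range(8)] for y in range(8)]
```

Both functions return the same lit fraction for the same kernel position; only the fetch count differs, which is the whole point of the feature as described above.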
 
Dumb question: could somebody expand the abbreviation "PCF"? I'm not sure of its exact meaning. Thanks. :oops:
 
A note from the Radeon X1x00 Programming Guide:

Fetch-4 works not only on DST (depth-stencil textures) formats (like DF24), but with all other single-channel formats as well; therefore, this feature can be used for implementing various filter kernels that operate on single-channel textures.

This document only seems to be available as part of the SDK download:

http://www.ati.com/developer/radeonSDK.html

which is rather hefty just for one document :cry:

Section 3.2.2 Depth Textures is interesting, too:

Recent drivers have added a special depth texture format that allows sampling 16-bit depth buffer information as a texture on all ATI DirectX® 9 video cards. This is especially useful for implementing shadow maps and other techniques that rely on the scene’s depth information. Previously, applications had to rely on depth values output from the pixel shader to a high-precision render target, sometimes using a separate depth rendering pass. This new format allows an application to bind a depth texture surface as a depth buffer and later re-use its contents as a texture without extra rendering overhead.

While 16-bit depth textures are very useful, some algorithm implementations might find the 16-bit precision insufficient. To solve this problem the Radeon X1600 and Radeon X1300 added a new, more precise 24-bit depth texture format. Both 16-bit and 24-bit formats are implemented as Four-CC codes and application should query their support before trying to use them. An application can create a depth texture with one of the available formats and set it both as a depth buffer and a texture. It is prohibited to simultaneously render to the depth texture and fetch from it as a texture. Because rendering with depth textures is generally somewhat slower than with a normal depth buffer, they should not be used as replacement for the primary depth buffer.
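
For anyone wondering what a "Four-CC code" actually is: it's four ASCII characters packed into a 32-bit value, least significant byte first (this is what the MAKEFOURCC macro does). A small Python sketch of the packing follows. DF24 is the name given in the guide note quoted above; DF16 for the 16-bit format is my assumption, and the helper name is made up. In a real D3D9 application the support query itself would normally go through IDirect3D9::CheckDeviceFormat, which isn't shown here.

```python
def make_fourcc(code: str) -> int:
    """Pack a 4-character ASCII code into a 32-bit little-endian integer."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FourCC must be exactly four ASCII characters")
    return raw[0] | (raw[1] << 8) | (raw[2] << 16) | (raw[3] << 24)


# DF24 comes from the quoted guide; DF16 is an assumed name for the 16-bit format.
FOURCC_DF24 = make_fourcc("DF24")
FOURCC_DF16 = make_fourcc("DF16")
```

The resulting integer is what gets passed wherever the API expects a format value.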

When rendering shadow maps, only the depth information is relevant and scene color information can be disregarded. To save fill rate you should disable color output using a color write mask. But even with a color write mask disabling the color output, a color buffer of matching multisample type to depth texture (D3DMULTISAMPLE_NONE) should still be created and bound to the D3D device. For large shadow maps this color buffer could waste a lot of space if it is created with the same dimensions as the depth buffer. On ATI DirectX® 9 video cards it is safe to set smaller size color buffer than the depth buffer. For ultimate space saving you can allocate buffer as small as 1x1 pixels using D3DFMT_A8B8G8R8 surface format.
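
A quick back-of-the-envelope on how much that dummy color buffer costs, as a Python sketch. The 2048x2048 shadow map size is just an example I picked, and this ignores any alignment or padding the driver might add; D3DFMT_A8B8G8R8 is 4 bytes per pixel.

```python
BYTES_PER_PIXEL_A8B8G8R8 = 4  # 8 bits each for A, B, G, R


def color_buffer_bytes(width, height):
    """Raw size of an A8B8G8R8 color buffer, ignoring driver padding."""
    return width * height * BYTES_PER_PIXEL_A8B8G8R8


# Hypothetical shadow map size, chosen only for illustration.
full_size = color_buffer_bytes(2048, 2048)  # color buffer matching the depth buffer
tiny = color_buffer_bytes(1, 1)             # the 1x1 trick from the guide
saved_mib = (full_size - tiny) / (1024 * 1024)
```

For this example size, the matching color buffer comes to 16 MiB of video memory that holds nothing useful, versus 4 bytes for the 1x1 version.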

Jawed
 
tEd said:
It's very useful for PCF filtering in shadow mapping, as it reduces the number of texture fetches; for a 4x4 PCF filter kernel, for instance, it goes from 16 to 4.

Seeing that shadow mapping will show up in more and more games, I would consider it an important feature.

Things like Fetch4, DB, and DST can improve shadow mapping performance quite a bit, especially if used together.

The other question is, of course, whether developers will use it.

Interesting.

This really makes me wonder how you can trust that the hardware you are spending hundreds of dollars on is properly equipped. There seems to be no way to tell until it is too late.

Shadows are obviously one of the things that developers are going to use more and more to add realism. I am at a total loss as to how this could be missing from a piece of hardware that costs hundreds of dollars and came out a couple of months ago.

I almost feel like ATI should recall all these and replace them with the X1900. How can you do this to people? We are not talking about something in some far-off spec; it's in everything else but this.

It's crazy.
 
I am sure there is a good reason for Fetch4 not being supported by R520.
My first guess is that it is irrelevant because, in R520's case, an alternative software solution is just as fast. Of course, I'm just guessing and being optimistic.

To boltneck: I asked Jawed the same thing in a PM, and he said there is no future-proof solution in this industry. I have to agree with him. But I hope the guys at ATI know what they are doing ... and the X1900 XT will be expensive for a few months in order to leave room for the X1800 XT. I've decided not to cancel my order, because I can't wait, and a 256 MB 7800 GTX is just not an option. The X1800 XT is a great card with 512 MB of RAM, and the GTX 512 is expensive, so ...

Anyway, I guess it's up to the devs now.
 
Of course no card is future-proof. But this sure seems like an obvious "today" feature that everyone else has, not a "tomorrow" feature.
 
Fetch4 and 24-bit depth textures were added simultaneously to RV515 and RV530.

24-bit is a precision improvement over the 16-bit support in R520 (and all older DX9 Radeon cards).

Fetch4 is a performance improvement over the conventional technique of point-sampling a 16-bit texture in order to perform PCF. Fetch4, allied with the correct 1-channel texture formats in Radeons, means that these texture formats not only take less space than they otherwise would, but that they also take less time and bandwidth to sample.

So R520 (and all previous Radeons) are not prevented from using DST with PCF for shadow mapping; it's simply that they will have lower precision and work more slowly than the newest ATI GPUs.

Hope that's the correct conclusion!

Jawed
 
Hubert said:
I am sure there is a good reason for Fetch4 not being supported by R520.
My first guess is that it is irrelevant because, in R520's case, an alternative software solution is just as fast. Of course, I'm just guessing and being optimistic.

Fetch4 is primarily there as an alternative to NVIDIA's PCF. NVIDIA has had PCF in hardware for many years and numerous titles make use of it. No ATI hardware prior to RV515/RV530 has had the capability, meaning it has always had to do something different: either drop soft shadowing entirely (as in the slightly quick-and-dirty Splinter Cell Xbox port), or pay the sampling penalties in the texture unit and generate the shadow percentage in the shader.
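
For those asking what "generating the shadow percentage in the shader" means in practice, here is a minimal Python sketch of 2x2 PCF, assuming the usual formulation: compare each of the four depths against the reference first, then bilinearly blend the pass/fail results. It also shows why ordinary bilinear filtering of the depth values themselves doesn't give the same answer. The function names and sample values are made up for illustration.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def pcf_2x2(depths, ref_depth, fx, fy):
    """Compare first, then filter: returns a lit fraction between 0 and 1.

    depths is (top_left, top_right, bottom_left, bottom_right);
    fx, fy are the bilinear weights within the 2x2 footprint.
    """
    tl, tr, bl, br = (1.0 if ref_depth <= d else 0.0 for d in depths)
    return lerp(lerp(tl, tr, fx), lerp(bl, br, fx), fy)


def filter_then_compare(depths, ref_depth, fx, fy):
    """Plain bilinear filtering of depth followed by one compare (not PCF)."""
    tl, tr, bl, br = depths
    depth = lerp(lerp(tl, tr, fx), lerp(bl, br, fx), fy)
    return 1.0 if ref_depth <= depth else 0.0


# Hypothetical shadow edge: left texels are occluders, right texels are not.
edge = (0.2, 0.8, 0.2, 0.8)
soft = pcf_2x2(edge, 0.5, 0.25, 0.5)
hard = filter_then_compare(edge, 0.5, 0.25, 0.5)
```

On this edge the PCF version gives a partial shadow value while the filter-then-compare version still gives a hard 0 or 1. That is why the four raw depths have to reach the shader, either as four point samples or as one Fetch4.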

As to why R520 doesn't support it, that is beyond me. It may be that the texture units were only changed after the design was set (RV515 and RV530 would have started after R520), or that the samplers were a little borked for this operation, so it has been disabled, to be sorted on subsequent parts. However, this operation is primarily there to speed up the sampling rate of single-channel format textures (i.e. increase it fourfold), and it's actually less likely that R520 would be bottlenecked by its texture sampling rate than by other areas, in comparison to a part such as RV530 (with its relative texture-to-shader ratio) ;)
 
no-X said:
Dumb question: could somebody expand the abbreviation "PCF"? I'm not sure of its exact meaning. Thanks. :oops:

Percentage Closer Filtering. Just read Jawed's link, namely the SIGGRAPH ShadingCourse PDF, page 17.

Thanks to Mr Baumann, and Jawed for their posts.
 
Dave Baumann said:
Fetch4 is primarily there as an alternative to NVIDIA's PCF. NVIDIA has had PCF in hardware for many years and numerous titles make use of it. No ATI hardware prior to RV515/RV530 has had the capability, meaning it has always had to do something different: either drop soft shadowing entirely (as in the slightly quick-and-dirty Splinter Cell Xbox port), or pay the sampling penalties in the texture unit and generate the shadow percentage in the shader.

As to why R520 doesn't support it, that is beyond me. It may be that the texture units were only changed after the design was set (RV515 and RV530 would have started after R520), or that the samplers were a little borked for this operation, so it has been disabled, to be sorted on subsequent parts. However, this operation is primarily there to speed up the sampling rate of single-channel format textures (i.e. increase it fourfold), and it's actually less likely that R520 would be bottlenecked by its texture sampling rate than by other areas, in comparison to a part such as RV530 (with its relative texture-to-shader ratio) ;)

I can think of a lot of games that use soft shadows and a lot of games that will likely use them in the future.

How big of an impact on final performance are we talking about? 3 FPS or 10 FPS? Does FEAR use this technique and what differences does it show between Nvidia and ATi?

It seems that if your whole game is using soft shadows then 4X performance in this specific area is going to make an astounding difference.
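
One way to sanity-check the "4X in this area" intuition is Amdahl's law: the overall gain depends on what fraction of frame time the shadow sampling actually takes. A small Python sketch follows; the 20% fraction and the 40 FPS baseline are invented numbers, not measurements from any game.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: speedup of the whole frame when only `fraction`
    of the frame time gets `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def fps_after(fps_before, fraction, local_speedup):
    """Frame rate after speeding up part of the frame."""
    return fps_before * overall_speedup(fraction, local_speedup)


# Hypothetical case: shadow sampling is 20% of frame time, baseline is 40 FPS.
example_fps = fps_after(40.0, 0.20, 4.0)
```

With those made-up inputs the frame rate goes from 40 to roughly 47 FPS. You only see the full 4x if shadow map sampling is the entire frame, which fits the point above that R520 is less likely to be bottlenecked by texture sampling in the first place.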
 
Dave Baumann said:
Fetch4 is primarily there as an alternative to NVIDIA's PCF. NVIDIA has had PCF in hardware for many years and numerous titles make use of it. No ATI hardware prior to RV515/RV530 has had the capability, meaning it has always had to do something different: either drop soft shadowing entirely (as in the slightly quick-and-dirty Splinter Cell Xbox port), or pay the sampling penalties in the texture unit and generate the shadow percentage in the shader.

So why all the flak about support for hardware-accelerated DST/PCF in 3dmark if it is so widely used in gaming applications? Also, does anyone have a link to a good Fetch4 explanation?
 
B3D Article said:
Both RV530 and RV515 feature one capability that is not present on R520, though, which is a special texture operation known as "Fetch4". We shall cover this feature in a little more detail in a later article.

Where's our Fetch4 article!!
 