Old 18-Aug-2002, 19:18   #1
OpenGL guy
Senior Member
 
Join Date: Feb 2002
Posts: 2,291
Default Kristof's comments on render to texture

Quote:
Originally Posted by Kristof
There is quite a good reason why rendered surfaces are slower to render from than regular uploaded textures. When you upload a texture it's not just uploaded; the data is re-ordered (aka twiddled/swizzled/whatever). This re-ordering improves the cache hit ratio and makes bilinear/trilinear operations pretty much guaranteed to be free. When you render to a texture you do so in a scanline way; you do not render twiddled. So the data order for a rendered surface is not optimal for accessing as a texture.
You are assuming that the memory tiling for textures is different from the tiling for backbuffers. This doesn't have to be the case, and your own argument is a good reason why.

Chips with tiled memory, in my experience, don't render in a "scanline" way: That would defeat the purpose of having tiled memory. The rendering is done tile by tile in order to get the most memory bandwidth.

Just wanted to clear that up.
Old 18-Aug-2002, 19:30   #2
Neeyik
Homo ergaster
 
Join Date: Feb 2002
Location: Cumbria, UK
Posts: 1,231
Default

I was just going to add something that I had noticed in the comments made in the previous thread before it was closed...

The question of the shadows was for the objects at the side of the road. AFAIK, render-to-texture shadows are used for the moving objects only. I'm not 100% certain of this (the text reads "dynamic shadows for non-static objects only"), but if the shadows of the scenery were produced in some other way then perhaps R-t-T is not the cause for the attention.
Old 18-Aug-2002, 20:10   #3
Kristof
Senior Member
 
Join Date: Jan 2002
Location: Abbots Langley
Posts: 732
Default Re: Kristof's comments on render to texture

Quote:
Originally Posted by OpenGL guy
Chips with tiled memory, in my experience, don't render in a "scanline" way: That would defeat the purpose of having tiled memory. The rendering is done tile by tile in order to get the most memory bandwidth.

Just wanted to clear that up.
There is still a potential difference between scanline->tiled scanlines and twiddled order. Not that I am excluding the possibility for current or future hardware to support hardware twiddling and rendering to a twiddled format.

But generally current hardware, AFAIK, does not render in twiddled form and thus sees a slowdown when texturing from render targets.

K-
Old 18-Aug-2002, 20:21   #4
OpenGL guy
Senior Member
 
Join Date: Feb 2002
Posts: 2,291
Default Re: Kristof's comments on render to texture

Quote:
Originally Posted by Kristof
But generally current hardware, AFAIK, does not render in twiddled form and thus sees a slowdown when texturing from render targets.
I wouldn't have said what I did if I didn't have strong evidence to the contrary.
Old 18-Aug-2002, 20:23   #5
Kristof
Senior Member
 
Join Date: Jan 2002
Location: Abbots Langley
Posts: 732
Default

I know, but you only speak for one specific party.
So how far back does this functionality go?

K-
Old 18-Aug-2002, 20:33   #6
OpenGL guy
Senior Member
 
Join Date: Feb 2002
Posts: 2,291
Default

Quote:
Originally Posted by Kristof
I know, but you only speak for one specific party.
So how far back does this functionality go?
At both graphics companies I've worked at, there has been no difference between texture tiling and surface tiling. It's bit depth, not content, that determines the tile layout.
Old 18-Aug-2002, 20:49   #7
Grall
Invisible Member
 
Join Date: Apr 2002
Location: La-la land
Posts: 5,015
Default

Kristof:

Thanks for the explanation, but why would they make the feature 'intentionally' slow? If framebuffer writes are "swizzled" or whatever when written to optimize for bandwidth, why aren't texture render targets treated the same way? They are just like a small framebuffer after all!

It seems illogical to introduce special case scenarios, especially if all it'll accomplish is to slow things down...


-FaaR-
Old 18-Aug-2002, 22:01   #8
Kristof
Senior Member
 
Join Date: Jan 2002
Location: Abbots Langley
Posts: 732
Default

There is another reason why render targets are usually slow when used as a texture. Generally they are not mipmapped which means that even though they might be swizzled on some hardware, the texture cache is still going to thrash due to lacking lower mipmap levels.

I am trying to find a twiddled data diagram for textures but can't seem to find one... sigh... I know they are out there... somewhere
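
In the absence of a proper diagram, here is a rough sketch of one common twiddled layout, Morton (Z-order) addressing, where the bits of x and y are interleaved. This is only an assumption for illustration: real chips use their own tile and swizzle patterns, and the function names below are made up for the example.

```python
def morton_index(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y (x lands in the even bit positions)."""
    index = 0
    for bit in range(bits):
        index |= ((x >> bit) & 1) << (2 * bit)
        index |= ((y >> bit) & 1) << (2 * bit + 1)
    return index


def linear_index(x: int, y: int, width: int) -> int:
    """Plain scanline addressing: one row after another."""
    return y * width + x


def footprint_span(index_of, x: int, y: int) -> int:
    """Address distance covered by the 2x2 bilinear footprint at (x, y)."""
    addresses = [index_of(x + dx, y + dy) for dy in (0, 1) for dx in (0, 1)]
    return max(addresses) - min(addresses)


if __name__ == "__main__":
    # Print the twiddled address of every texel in an 8x8 texture.
    for y in range(8):
        print(" ".join(f"{morton_index(x, y):2d}" for x in range(8)))
```

With scanline addressing the two rows of a bilinear footprint are a full texture width apart in memory, while in the twiddled layout an aligned 2x2 block is four consecutive addresses. Footprints that straddle a block boundary spread further, so this shows the general idea rather than a guarantee.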

K-
Old 18-Aug-2002, 22:27   #9
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by Kristof
There is another reason why render targets are usually slow when used as a texture. Generally they are not mipmapped which means that even though they might be swizzled on some hardware, the texture cache is still going to thrash due to lacking lower mipmap levels.
K-
That's why many developers also create mip maps from their render targets. One can even use a mip-mapped RT without filling the lower mipmap levels to do some neat effects. (E.g. render shadow maps and fill the lower mip map levels with white: if your shadows are always cast far from the view point, you see the shadow intensity decrease with distance from the observer, also when the light is near the camera. Of course trilinear filtering should be enabled.)

ciao,
Marco
Old 18-Aug-2002, 22:59   #10
Neeyik
Homo ergaster
 
Join Date: Feb 2002
Location: Cumbria, UK
Posts: 1,231
Default

Quote:
There is another reason why render targets are usually slow when used as a texture. Generally they are not mipmapped which means that even though they might be swizzled on some hardware, the texture cache is still going to thrash due to lacking lower mipmap levels.
Presumably this must depend on what the render target is being created for in the first place. For something like cube maps, surely it would make sense to have all mipmap levels possible, unless specifically told not to.
Old 19-Aug-2002, 01:42   #11
Chalnoth
 
Join Date: May 2002
Location: New York, NY
Posts: 12,678
Default

I'd just like to say that things like swizzling really would make it seem like a darned good idea to have a separate texture-management unit on the GPU (Well, they already need to have something like this for TC...but it would be nice to use for render-to-texture situations).

Personally, it just seems obvious to me that swizzling of render targets will just be a natural optimization that will come when such things are common. Hopefully some high-end hardware will have such optimizations earlier.

Additionally, does anybody know if AGP textures are stored in swizzled format? That also just seems like a natural optimization to make. It would also be nice if virtual AGP texturing were implemented in upcoming video cards.

One final thing. Might not rendering in swizzled format actually improve memory bandwidth efficiency while rendering? (While at the same time possibly reducing it during buffer switch or display)
Old 19-Aug-2002, 03:05   #12
OpenGL guy
Senior Member
 
Join Date: Feb 2002
Posts: 2,291
Default

Quote:
Originally Posted by Chalnoth
Might not rendering in swizzled format actually improve memory bandwidth efficiency while rendering?
Is this not what I said above?
Old 19-Aug-2002, 07:11   #13
Chalnoth
 
Join Date: May 2002
Location: New York, NY
Posts: 12,678
Default

Heh...guess it is. That's what I get for not reading carefully :P
Old 19-Aug-2002, 07:27   #14
Kristof
Senior Member
 
Join Date: Jan 2002
Location: Abbots Langley
Posts: 732
Default

Yes, you can create a mipmap tree when rendering, but then the question is how do you do that? Or, more specifically, if the driver has support for it, how does the driver/hardware do that?

In the end :

uploaded texture = twiddled + mipmap pre-generated and uploaded once.

render target texture = twiddled (not guaranteed) + mipmap missing or generated at a cost and uploaded possibly for each frame.

MipMap generation can be done either by rendering the same scene multiple times at the various MipMap resolutions (possibly only a couple of levels), or by downsampling the rendered topmap, similar to how AA is done. However it's done, it's going to be an expensive operation.
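
To make the downsampling route concrete, here is a minimal CPU-side sketch that assumes a simple 2x2 box filter and a square, power-of-two, single-channel image. What a driver or the hardware actually does is not specified anywhere in this thread, and the function names are invented for the example.

```python
def downsample_2x2(level):
    """Average each 2x2 block of a square image (a list of rows of floats)."""
    size = len(level) // 2
    return [
        [
            (level[2 * y][2 * x] + level[2 * y][2 * x + 1]
             + level[2 * y + 1][2 * x] + level[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(size)
        ]
        for y in range(size)
    ]


def build_mip_chain(top):
    """Return [top, half size, quarter size, ..., 1x1], each from the previous level."""
    chain = [top]
    while len(chain[-1]) > 1:
        chain.append(downsample_2x2(chain[-1]))
    return chain


if __name__ == "__main__":
    top = [
        [0.0, 0.0, 4.0, 4.0],
        [0.0, 0.0, 4.0, 4.0],
        [8.0, 8.0, 12.0, 12.0],
        [8.0, 8.0, 12.0, 12.0],
    ]
    for level in build_mip_chain(top):
        print(level)
```

Each level is built from the one above it, so the scene geometry is only needed for the top level.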

K-
Old 19-Aug-2002, 17:05   #15
Chalnoth
 
Join Date: May 2002
Location: New York, NY
Posts: 12,678
Default

Quote:
Originally Posted by Kristof
MipMap generation can be done either by rendering the same scene multiple times at the various MipMap resolutions (possibly only a couple of levels), or by downsampling the rendered topmap, similar to how AA is done. However it's done, it's going to be an expensive operation.

K-
Rendering the same scene multiple times would be far too expensive to carry out (and the quality would probably be a fair bit lower), especially for highly complex scenes.

Downsampling the rendered top map would both carry with it automatic AA, and would be far less performance-intensive. For optimal performance, it would require dedicated hardware (which may already exist in current hardware...), but that shouldn't be a huge deal. The main question here is, can the video card continue rendering while this dedicated hardware is working on generating the MIP maps? That is, is the rendered texture needed immediately after it is rendered, stalling further rendering until the downsampling is done?

It seems to me that if either the drivers or the software inserts a delay between when the texture is rendered and when it is needed, there needn't be much performance hit at all from generating the MIP maps.
Old 19-Aug-2002, 17:13   #16
Humus
Crazy coder
 
Join Date: Feb 2002
Location: Stockholm, Sweden
Posts: 3,216
Default

Using GL_SGIS_generate_mipmap (which is supported by R100 and up and GF3 and up, plus maybe some others) you can automatically generate mipmaps in hardware. It's actually quite fast; I haven't noticed any significant performance reduction from using it. Rendering the level 0 mipmap tends to be much more expensive than generating the mipmaps.
This is a feature I really miss in Direct3D. Having no autogeneration of mipmaps basically renders RTT more or less useless; nice that it's going to be there in DX9 though.
__________________
I speak for myself and only myself.
Old 19-Aug-2002, 17:24   #17
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by Kristof
Or by downsampling the rendered topmap, similar to how AA is done. However it's done, it's going to be an expensive operation.
K-
Actually, in our engine it is not expensive at all, even with several shadow maps (all contained in a single big render target). We tested it on a GF3/GF4/8500 with mip mapping on and off and couldn't detect any significant difference in performance, and we are not CPU limited.

ciao,
Marco
Old 19-Aug-2002, 18:25   #18
hax
Junior Member
 
Join Date: Jul 2002
Posts: 59
Default

Quote:
Originally Posted by Kristof
Or, more specifically, if the driver has support for it, how does the driver/hardware do that?
With a little bit of thought? :P

Quote:
Originally Posted by Kristof
render target texture = twiddled (not guaranteed) + mipmap missing or generated at a cost and uploaded possibly for each frame.
Some older hardware can't always render to texture if the texture is not a power of 2. If it does, therein lies the twiddling/non-twiddling issue. I suppose this type of limitation will be outdated in a few years once the older hardware dies off.

Quote:
Originally Posted by Kristof
However it's done, it's going to be an expensive operation.
I wouldn't necessarily say expensive, unless it falls to some software mip generation method, or if you are mip generating and switching render targets for every other polygon.
Old 19-Aug-2002, 19:24   #19
fresh
Member
 
Join Date: Mar 2002
Posts: 141
Default

Quote:
Originally Posted by Kristof

MipMap generation can be done either by rendering the same scene multiple times at the various MipMap resolutions (possibly only a couple of levels), or by downsampling the rendered topmap, similar to how AA is done. However it's done, it's going to be an expensive operation.

K-
Have you ever tried this? It's very, very fast to generate mipmaps in hardware. A complete mipmap chain is 1.333 times as big as the top mip level, so a 512x512 texture only takes an additional ~87k pixels for the mipmaps. With 2 Gpix/s of fillrate, it's not a big deal.
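
The arithmetic can be checked with a few lines. This is a back-of-the-envelope sketch that assumes a square power-of-two texture and counts only the pixels written to the lower levels; the helper names are made up for illustration.

```python
def mip_level_sizes(size: int):
    """Edge lengths of every mip level, from the top level down to 1x1."""
    sizes = []
    while size >= 1:
        sizes.append(size)
        size //= 2
    return sizes


def extra_mip_texels(size: int) -> int:
    """Texels in all levels below the top one."""
    return sum(s * s for s in mip_level_sizes(size)[1:])


def chain_ratio(size: int) -> float:
    """Size of the complete chain relative to the top level alone."""
    return (size * size + extra_mip_texels(size)) / (size * size)


if __name__ == "__main__":
    extra = extra_mip_texels(512)
    print(extra)                    # 87381
    print(round(chain_ratio(512), 3))  # 1.333
    # Time to write those pixels at an assumed 2 Gpix/s, in microseconds.
    print(extra / 2e9 * 1e6)
```

At the assumed 2 Gpix/s, writing the 87,381 extra pixels takes on the order of 44 microseconds, ignoring texture reads and render target switches.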
Old 19-Aug-2002, 19:41   #20
OpenGL guy
Senior Member
 
Join Date: Feb 2002
Posts: 2,291
Default

Quote:
Originally Posted by fresh
Have you ever tried this? It's very, very fast to generate mipmaps in hardware. A complete mipmap chain is 1.333 times as big as the top mip level, so a 512x512 texture only takes an additional ~87k pixels for the mipmaps. With 2 Gpix/s of fillrate, it's not a big deal.
I don't think Kristof was concerned about the fillrate. You'd have to send down the same geometry multiple times in order to generate the miplevels, and that could affect performance.
Old 19-Aug-2002, 19:43   #21
RussSchultz
Professional Malcontent
 
Join Date: Feb 2002
Location: HTTP 404
Posts: 2,855
Default

I thought some hardware could generate mip-maps via successive decimation.

But what would I know?
Old 19-Aug-2002, 19:49   #22
fresh
Member
 
Join Date: Mar 2002
Posts: 141
Default

Quote:
Originally Posted by OpenGL guy
Quote:
Originally Posted by fresh
Have you ever tried this? It's very, very fast to generate mipmaps in hardware. A complete mipmap chain is 1.333 times as big as the top mip level, so a 512x512 texture only takes an additional ~87k pixels for the mipmaps. With 2 Gpix/s of fillrate, it's not a big deal.
I don't think Kristof was concerned about the fillrate. You'd have to send down the same geometry multiple times in order to generate the miplevels, and that could affect performance.
Why would you need to send down the geometry multiple times? You can just down-sample the top mip level a few times. Hell, if we can do this on the PS2 I'm sure the R300 can do it. Humus already pointed out the relevant OGL extension, and DX9 supports mipmap generation via downsampling as well. I'm sure you know all this; I'm just trying to figure out why Kristof thinks it's such a performance hit.
Old 19-Aug-2002, 19:59   #23
Colourless
Monochrome wench
 
Join Date: Feb 2002
Location: Somewhere in outback South Australia
Posts: 1,255
Default

Well, remember this RTT stuff originally came up because of 3DMark, a DirectX 8 program. DX8 doesn't have any method to automatically generate the lower mip levels. You pretty much must render each level individually.

There are 'other' methods that could be used i.e.
1) using 2 textures and manually downsampling one into the other.
or
2) if you want to do something that might seriously break on different hardware, set render target to be a lower mip level of the current texture source. This would be a really 'bad' idea

-Colourless
Old 19-Aug-2002, 20:02   #24
Kristof
Senior Member
 
Join Date: Jan 2002
Location: Abbots Langley
Posts: 732
Default

Lots of little ones make one big... you need to read in the top level, write out a lower level, re-read that level, write out an even lower level, re-read that level, etc... for each render texture in the scene and for each frame... it all adds up to something. Not to mention that render target changes come at a cost as well.
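
As a rough way to put numbers on that read/write chain, the sketch below counts texel traffic and render target switches for one square power-of-two texture. It assumes each source texel is read exactly once per pass (a 2x2 box filter with ideal caching), which real hardware may not achieve; the function name is hypothetical.

```python
def downsample_traffic(size: int):
    """Return (texels read, texels written, passes) for a full mip chain."""
    reads = writes = passes = 0
    while size > 1:
        reads += size * size    # read the current level
        size //= 2
        writes += size * size   # write the next, smaller level
        passes += 1             # one render target change per level
    return reads, writes, passes


if __name__ == "__main__":
    reads, writes, passes = downsample_traffic(512)
    print(reads, writes, passes)  # 349524 87381 9
```

Multiply by the number of rendered textures per frame and by the frame rate to see how it adds up for a given scene.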

The main point is that it's not "free" and will have some performance impact unless you are not bottlenecked yet.

But at least we all learned something more about render to texture and MipMap generation, which is a good thing (tm)

K-
Old 19-Aug-2002, 20:28   #25
fresh
Member
 
Join Date: Mar 2002
Posts: 141
Default

Quote:
Originally Posted by Kristof
Lots of little ones make one big... you need to read in the top level, write out a lower level, re-read that level, write out an even lower level, re-read that level, etc... for each render texture in the scene and for each frame... it all adds up to something. Not to mention that render target changes come at a cost as well.

The main point is that it's not "free" and will have some performance impact unless you are not bottlenecked yet.

But at least we all learned something more about render to texture and MipMap generation, which is a good thing (tm)

K-
Not every texture needs this. Only textures which are rendered to (like dynamic cube maps, shadow maps, etc). Generating mipmaps is just a texturing operation, which modern cards are freakin' fast at, especially when you access the textures in linear fashion. Nope, it's not free, but it's very very very fast and I'd be shocked if runtime mipmap generation showed up on any kind of profiler. The actual rendering of the top level is what would take the most time. One Of These Days When I Have Time (tm) I'll whip up a small app to test this.

Also want to point out that some cards can draw to a swizzled render target.

All times are GMT +1.