Flexible texture formats

Xmas

I was wondering, with the large numbers of different texture formats, the growing use of material property textures, and the upcoming requirements of WGF2.0 including unnormalized integer texture reads, wouldn't it be useful to have a system that flexibly maps texture contents to output channels?

What I was thinking of is this:

In the TMU, first, there is a bit-mapping layer that maps the bits of a single texel to up to four channels. A texel could be 8, 16, 32, 64 or 128 bits wide. These are split into up to four channels, each being 1-12, 16, 24(?) or 32 bits wide. A few bits might be left unused. Some alignment might be required. As an alternative to bit-mapping of single texels, a compressed texture format could be selected. This should encompass all existing texture formats as well as a whole load of new ones.

After that, there is a type-mapping layer. Each channel could be one of the following types (a rough sketch of the conversions follows the list):
unsigned normalized (0 to 1)
signed normalized (-1 to 1)
unsigned unnormalized (e.g. 0 to 255)
signed unnormalized (e.g. -128 to 127)
color (0 to 1, sRGB converted if desired)
FP (only for 16bit, 24bit and 32bit channels)
zero or one (those would be used to fill the unused output channels)
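Here is that sketch, in C; the conversion rules are the usual unorm/snorm/sRGB ones, and the function names are made up for illustration:

Code:
#include <math.h>
#include <stdint.h>

/* Sketch of the type-mapping conversions above, assuming the
 * bit-mapping layer hands us an n-bit raw value (bits < 32 here). */
float unorm_to_float(uint32_t raw, int bits)        /* 0 to 1 */
{
    return (float)raw / (float)((1u << bits) - 1u);
}

float snorm_to_float(uint32_t raw, int bits)        /* -1 to 1 */
{
    /* sign-extend the n-bit value, then scale so that the most
     * negative code clamps to -1 (e.g. -128 and -127 both map to -1) */
    int32_t s = (int32_t)(raw << (32 - bits)) >> (32 - bits);
    float   f = (float)s / (float)((1 << (bits - 1)) - 1);
    return f < -1.0f ? -1.0f : f;
}

float srgb_to_linear(uint32_t raw, int bits)        /* color type */
{
    float c = unorm_to_float(raw, bits);
    return c <= 0.04045f ? c / 12.92f
                         : powf((c + 0.055f) / 1.055f, 2.4f);
}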

As a last step after filtering, input channels are mapped to output channels. This is to account for e.g. luminance being mapped to red, green and blue in current texture formats. Actually, this step is basically obsolete with hardware that can do arbitrary swizzle.
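To make the whole pipeline concrete, a descriptor for such a scheme might look roughly like this (just a sketch, all names hypothetical):

Code:
#include <stdint.h>

/* Hypothetical descriptor covering the three stages described above:
 * bit-mapping (widths), type-mapping (types), output swizzle. */
typedef enum {
    CH_UNORM,   /* unsigned normalized, 0 to 1          */
    CH_SNORM,   /* signed normalized, -1 to 1           */
    CH_UINT,    /* unsigned unnormalized, e.g. 0 to 255 */
    CH_SINT,    /* signed unnormalized, e.g. -128..127  */
    CH_COLOR,   /* 0 to 1, optional sRGB decode         */
    CH_FLOAT,   /* FP, 16/24/32 bit channels only       */
    CH_ZERO,    /* constant 0, fills unused outputs     */
    CH_ONE      /* constant 1                           */
} ChannelType;

typedef struct {
    uint8_t     width[4];   /* bits per channel, 0 = channel absent */
    ChannelType type[4];    /* type-mapping layer                   */
    uint8_t     swizzle[4]; /* output channel i reads input channel
                               swizzle[i]; 0=x 1=y 2=z 3=w          */
} FlexTexFormat;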


So for example, some existing formats would be:
R8G8B8A8: bit-map: 8:8:8:8, type-map c:c:c:un, swizzle xyzw
X8R8G8B8: bit-map: 8:8:8:8, type-map 0:c:c:c, swizzle yzwx
A8L8: bit-map 8:8, type-map un:un:0:0, swizzle yyyx

But you could also do some very helpful packing, like one FP16 and two 8bit values in a single texture.
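Expressed with the hypothetical FlexTexFormat struct from above, those examples plus the FP16 packing would read:

Code:
/* The example formats above, as FlexTexFormat initializers. */
static const FlexTexFormat R8G8B8A8 = {
    .width   = { 8, 8, 8, 8 },
    .type    = { CH_COLOR, CH_COLOR, CH_COLOR, CH_UNORM },
    .swizzle = { 0, 1, 2, 3 }                      /* xyzw */
};

static const FlexTexFormat X8R8G8B8 = {
    .width   = { 8, 8, 8, 8 },
    .type    = { CH_ZERO, CH_COLOR, CH_COLOR, CH_COLOR },
    .swizzle = { 1, 2, 3, 0 }                      /* yzwx */
};

static const FlexTexFormat A8L8 = {
    .width   = { 8, 8, 0, 0 },
    .type    = { CH_UNORM, CH_UNORM, CH_ZERO, CH_ZERO },
    .swizzle = { 1, 1, 1, 0 }                      /* yyyx */
};

/* One possible encoding of the packing mentioned above:
 * one FP16 plus two 8bit values in a 32-bit texel. */
static const FlexTexFormat FP16_U8_U8 = {
    .width   = { 16, 8, 8, 0 },
    .type    = { CH_FLOAT, CH_UNORM, CH_UNORM, CH_ONE },
    .swizzle = { 0, 1, 2, 3 }
};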

I'd like to see something like this in WGF2.0, but I think the chances are very slim.
What do you think about it?
 
Well, I would really be surprised if current hardware wasn't already doing something like that at the driver level. Would be nice to expose it to developers, if so.

But, to play devil's advocate, it seems like there may be some QA issues with supporting such combinations. That is to say, if you're developing a game, you'd have to check many more options to see if this one texture format you'd like to use will work. But with hardwired texture formats in the API, you only need to check one.
 
I'm sure I've seen a texture topology like that before, in a paper somewhere, but I can't remember. Either way, it stands out for me because it sounds pretty sensible and somewhat worthwhile.

There's lots of scope for making it easy to work with at the tool level, too, I think.
 
Chalnoth said:
But, to play devil's advocate, it seems like there may be some QA issues with supporting such combinations. That is to say, if you're developing a game, you'd have to check many more options to see if this one texture format you'd like to use will work. But with hardwired texture formats in the API, you only need to check one.
Why? There'd have to be a function that takes the desired texture format (maybe as a string representation) and returns whether it's valid and what its limitations are.
There would be some validity rules, but other than that you could expect any format to just work, bar FP filtering maybe. That's only considering texture formats, render target formats would still be limited.
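Such a query could be as simple as this (a sketch building on the FlexTexFormat struct above, checking only the rules mentioned in this thread):

Code:
#include <stdbool.h>

/* Hypothetical capability query for a flexible format. */
typedef struct {
    bool valid;        /* passes the validity rules               */
    bool filterable;   /* FP channels might not filter everywhere */
} FlexFormatCaps;

static bool width_ok(int w)
{
    return (w >= 1 && w <= 12) || w == 16 || w == 24 || w == 32;
}

FlexFormatCaps CheckFlexFormat(const FlexTexFormat *fmt)
{
    FlexFormatCaps caps = { true, true };
    for (int i = 0; i < 4; i++) {
        int w = fmt->width[i];
        if (w == 0)
            continue;                        /* absent channel      */
        if (!width_ok(w))
            caps.valid = false;
        if (fmt->type[i] == CH_FLOAT) {
            if (w != 16 && w != 24 && w != 32)
                caps.valid = false;          /* FP only 16/24/32bit */
            caps.filterable = false;         /* bar FP filtering    */
        }
    }
    return caps;
}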
 
For example, making textures more like vertex buffers, say by extending the IDirect3DVertexDeclaration interface to textures. This would be nice. :)
 
A couple questions:
- would you want to do this outside of the shader so that you could pass an arbitrary format through texture filtering hardware?
- how does this compare to in shader pack/unpack instructions?
 
psurge said:
A couple questions:
- would you want to do this outside of the shader so that you could pass an arbitrary format through texture filtering hardware?
Of course that should be part of the TMU. It would be far too expensive to do using shader instructions, even assuming you can use integer ops. And you'd have to do the filtering yourself.

- how does this compare to in shader pack/unpack instructions?
Pack/unpack instructions offer far less flexibility, having only four pack modes, all of which the TMU can already read.
But flexibility is exactly the point of this, being able to use arbitrary texture formats (including the existing, "fixed" formats) and still getting filtering on them.
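To put some numbers on "far too expensive": emulating a single bilinear fetch of a custom format in the shader amounts to something like this (modeled in C; decode_texel() is a hypothetical stand-in for the bit-extract and type-convert work sketched earlier):

Code:
#include <stdint.h>

/* Hypothetical decoder for one raw texel into four float channels. */
void decode_texel(uint32_t raw, float out[4]);

static float lerpf(float a, float b, float t) { return a + t * (b - a); }

/* One emulated bilinear sample: four raw reads, four full decodes,
 * and twelve lerps -- all per fetch, which the TMU does for free. */
void bilinear_emulated(const uint32_t *texels, int pitch,
                       float u, float v, float out[4])
{
    int   x0 = (int)(u - 0.5f), y0 = (int)(v - 0.5f);
    float fx = (u - 0.5f) - (float)x0, fy = (v - 0.5f) - (float)y0;

    float t00[4], t10[4], t01[4], t11[4];
    decode_texel(texels[ y0      * pitch + x0    ], t00);
    decode_texel(texels[ y0      * pitch + x0 + 1], t10);
    decode_texel(texels[(y0 + 1) * pitch + x0    ], t01);
    decode_texel(texels[(y0 + 1) * pitch + x0 + 1], t11);

    for (int i = 0; i < 4; i++)
        out[i] = lerpf(lerpf(t00[i], t10[i], fx),
                       lerpf(t01[i], t11[i], fx), fy);
}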
 
Xmas - thanks, I think I understand the rationale now.

IMO this would make a lot of sense for render targets as well, although I suppose that would be a bigger change in that blending would have to be generalized as well. Coming from a software viewpoint, it also seems like it shouldn't be very hard to provide lossless compression for all render targets (by encoding a tile of target pixels as a set of reference points and differences) so long as the channels were integral.
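A minimal sketch of that tile encoding, assuming a 4x4 tile of one 8-bit integral channel (names made up):

Code:
#include <stdbool.h>
#include <stdint.h>

/* Keep the first pixel as a reference and check whether all 15
 * differences fit in a 4-bit signed delta. If so, the tile shrinks
 * from 16 bytes to 1 reference byte + 15 nibbles, losslessly;
 * otherwise it is stored raw. */
bool try_compress_tile(const uint8_t tile[16],
                       uint8_t *ref, int8_t deltas[15])
{
    *ref = tile[0];
    for (int i = 1; i < 16; i++) {
        int d = (int)tile[i] - (int)tile[0];
        if (d < -8 || d > 7)
            return false;        /* deltas too large: store raw */
        deltas[i - 1] = (int8_t)d;
    }
    return true;
}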
 
Xmas said:
I was wondering, with the large numbers of different texture formats, the growing use of material property textures, and the upcoming requirements of WGF2.0 including unnormalized integer texture reads, wouldn't it be useful to have a system that flexibly maps texture contents to output channels?..... <snip>

It would certainly be flexible, I've considered it myself, but it would mean even more state to change each time you change texture <shrug>
 
Simon F said:
It would certainly be flexible, I've considered it myself, but it would mean even more state to change each time you change texture <shrug>
Well, I'm not an ASIC expert, but I expected transistor cost to be more important than state change cost. What exactly is the limiting factor there?

How does current hardware handle different texture formats? With the exception of 16bit 565/1555 textures and sRGB conversion, all channels are treated the same. So I suppose there's a single piece of type information (e.g. signed FX, unsigned FX, FP) shared by all channels, plus a 16bit-to-32bit converter frontend that only supports two or three modes (565, 1555, 4444). sRGB conversion can be either on or off, and only affects the three color channels.
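For reference, such a 16bit-to-32bit frontend for 565 boils down to something like this (a sketch; the bit-replication rule for expansion is one common choice, not necessarily what any particular chip does):

Code:
#include <stdint.h>

/* Expand a 565 texel to four 8-bit channels, replicating the high
 * bits of each channel to fill the low ones. */
void expand_565(uint16_t t, uint8_t out[4])
{
    uint32_t r = (t >> 11) & 0x1f;
    uint32_t g = (t >>  5) & 0x3f;
    uint32_t b =  t        & 0x1f;
    out[0] = (uint8_t)((r << 3) | (r >> 2));  /* 5 -> 8 bits   */
    out[1] = (uint8_t)((g << 2) | (g >> 4));  /* 6 -> 8 bits   */
    out[2] = (uint8_t)((b << 3) | (b >> 2));  /* 5 -> 8 bits   */
    out[3] = 0xff;                            /* no alpha: 1.0 */
}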

The amount of information needed for other stuff like filtering mode, wrap mode, texture base address, texture dimensions, # of mipmaps, LOD min/max/bias, etc. stays the same. So would flexible texture formats really make such a big difference?
 
Simon F, do you mean expensive as in "large amount of memory required to hold the state" or as in "complicated to map a format to the texturing hardware (expensive in terms of control logic)"?
 
Xmas said:
Well, I'm not an ASIC expert, but I expected transistor cost to be more important than state change cost. What exactly is the limiting factor there?
I wasn't saying it's impossible just that there will be a cost associated with it.

I'd imagine that in most chips there are, perhaps, between 16 and 64 different texture formats, and so they can be identified with 4~6 bits. Off the top of my head, with a flexible format, this would need, say,
  • Data width (8/16/32/64/128 bits): say 3 bits
  • "Channel" pos and width: ~5+5 bits (assuming position anywhere in a 64 bit window) + Format, say another 2 bits.
If you have 4 channels that's probably around 50 bits and that doesn't include texture compression algorithms, or fancier formats (e.g. cube map) etc.
The amount of information needed for other stuff like filtering mode, wrap mode, texture base address, texture dimensions, # of mipmaps, LOD min/max/bias, etc. stays the same. So would flexible texture formats really make such a big difference?
Maybe, maybe not, but it does represent more bits that need to be registered at each pipeline stage (unless you force the pipeline to empty before changing textures) so it certainly doesn't come for free <shrug>

psurge said:
Simon F, do you mean expensive as in "large amount of memory required to hold the state" or as in "complicated to map a format to the texturing hardware (expensive in terms of control logic)"?
Neither really - more the cost of keeping all the conversion parameters for all the texturing operations that are 'in flight' inside the hardware. IHVs tell ISVs that state changes are costly and I can't see this making it any cheaper.

I guess it's a 'RISC vs CISC'-like argument. Reduced texture formats and leaner HW + more 'compilation'(i.e. development) effort VS Generic texture format and 'lardier' HW.
 
Simon F said:
I'd imagine that in most chips there are, perhaps, between 16 and 64 different texture formats, and so they can be identified with 4~6 bits.
I had thought about that, but imagined that it would be more likely to encode the necessary state changes directly rather than using a translation table. That would result in a few more bits used, depending on the capabilities of the TMU.

Off the top of my head, with a flexible format, this would need, say,
  • Data width (8/16/32/64/128 bits): say 3 bits
  • "Channel" pos and width: ~5+5 bits (assuming position anywhere in a 64 bit window) + Format, say another 2 bits.
For the system described, you'd only need the channel width; position is redundant since padding bits would only be allowed at the end. That would be 5 bits per channel, or even 4 if you only allow 1-12, 16, 24, 32. Add 3 bits for type information per channel.

Data width could either be derived from combined channel widths (next higher POT), or stored separately (3 bits). If you need the final swizzle stage, that's another 8 bits, 2 per channel.

So overall, that's 36 to 43 bits, maybe 32 bits more than a fixed format approach.
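Laid out as a packed register, the 36-bit variant would be (hypothetical layout):

Code:
/* 4 x (4-bit width code + 3-bit type) = 28 bits, plus an 8-bit
 * swizzle = 36 bits, with the data width derived from the channel
 * widths. The 43-bit variant uses 5-bit raw widths and an explicit
 * 3-bit data width instead. */
typedef struct {
    unsigned width0  : 4, type0 : 3;  /* width code: 1-12, 16, 24, 32 */
    unsigned width1  : 4, type1 : 3;
    unsigned width2  : 4, type2 : 3;
    unsigned width3  : 4, type3 : 3;
    unsigned swizzle : 8;             /* 2 bits per output channel    */
} PackedFlexFormat;                   /* 36 bits of state             */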


If you have 4 channels that's probably around 50 bits and that doesn't include texture compression algorithms, or fancier formats (e.g. cube map) etc.
Cube maps are about texture topology, not texel format.
AFAIK, ATI and NVidia decompress into texture cache, so TC is an earlier stage separated from the texel fetch and filtering. I guess at least some PowerVR chips do this differently, considering Kyro's ability to do faster trilinear with DXT1 compressed textures.

Maybe, maybe not, but it does represent more bits that need to be registered at each pipeline stage (unless you force the pipeline to empty before changing textures) so it certainly doesn't come for free <shrug>
Nothing comes for free, it's all about sound compromises. So the question is, is that additional flexibility worth the cost? Probably not, as long as there's no API support :)
 
Simon F said:
Neither really - more the cost of keeping all the conversion parameters for all the texturing operations that are 'in flight' inside the hardware. IHVs tell ISVs that state changes are costly and I can't see this making it any cheaper.
Could always use a lookup table via unused constant parameters.
 