High Rez Textures and Texture Compression

Reverend
Reverend said:
The excitement (and focus) on shading is understandable but I have always found that high rez textures have a more immediate positive impact on how a game is perceived as having "great graphics".

1) Can you tell me what the IHVs and MS are doing wrt compression techniques?
2) Games are getting bigger and bigger in terms of texture/memory requirements but this doesn't necessarily have to apply to the entire game; we can have specific scenes where high rez textures can be used and other scenes where they aren't "necessary". Why haven't there been such games, taking into account (1) above?

Tim Sweeney said:
I wish the industry was doing a lot more to improve texture compression. There is definitely a market failure here (with the standardization of the API by Microsoft, no company really has a huge incentive to innovate at the moment). DXT1 offers 6:1 compression, while JPEG 2000 offers 100:1 compression at comparable quality levels. Obviously JPEG 2000 isn't practical in realtime, but I have a hard time believing that there isn't something in between DXT1 and JPEG 2000 that would offer a fairly huge breakthrough in the visual quality and memory tradeoffs.

But I don't have time to do anything about that.

-Tim

John Carmack said:
Reverend said:
The excitement (and focus) on shading is understandable but I have always found that high rez textures have a more immediate positive impact on how a game is perceived as having "great graphics".
I agree, and I have a solution. :)

John Carmack

Don't ask me to ask John what he means (his cheeky reply suggests he'd have told me if he wanted to) - at least we know he's doing something.

Any thoughts about texture compression?
 
Frankly I'd suggest it's basically a matter of adding detail maps or using Wang tiles or whatever. I reckon you'd be pretty hard pressed to save much bandwidth trying to use shaders to do some kind of compression; you might cut down on the size if you're clever, maybe.
 
DXTC is hard to improve upon because it's tuned for overall utility in realtime graphics.

The algorithms used in highly efficient compression methods almost always make use of entropy reduction algorithms that are tough to implement fast in hardware - if you think about the nature of say a Huffman decode it's not something that can be (easily? at all?) implemented on a single-digit-clock throughput pipeline.

Once you take those away, there are good reasons why it's hard to do better than 4 bits per pixel in a fixed-blocksize texture compression method - roughly, because there are few images in which no part needs about 4 bits per pixel to compress adequately, given the limitations of 'fixed function' mathematical decompression. There's always some small but important subset that looks a bit nasty.
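
For reference, that 4 bits per pixel is exactly what the standard DXT1 block layout works out to - two RGB565 endpoints plus a 2-bit index per texel. The struct below is just to show the arithmetic; real hardware obviously doesn't need a C struct:

Code:

// Standard DXT1 bit budget for one 4x4 block.
#include <cstdint>

struct Dxt1Block {
    uint16_t colour0;   // RGB565 endpoint A        -> 16 bits
    uint16_t colour1;   // RGB565 endpoint B        -> 16 bits
    uint32_t indices;   // 16 texels x 2-bit index  -> 32 bits
};                      // total: 64 bits for 16 texels = 4 bits per texel

static_assert(sizeof(Dxt1Block) == 8, "64 bits per 4x4 block");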
 
Would it make sense for the shader pipeline to compute the decompression of a texture and "ping-pong" it back into memory, for subsequent, normal, texturing?

I'm thinking of something that's along the lines of adaptive tessellation. If we're getting close to a future in which geometry is "multi-passed", could the same apply to textures?

Jawed
 
Jawed said:
Would it make sense for the shader pipeline to compute the decompression of a texture and "ping-pong" it back into memory, for subsequent, normal, texturing?
Jawed
IMHO, the main purpose of texture compression is not to save memory but to save memory bandwidth.
 
I used to agree, but nowadays I'm hearing a lot of clamour from the content side that they want more memory for textures. Certainly, though, if bandwidth is the real reason, then I don't think there's that much pressure to drop below 4 bits per pixel average - there'd be much more gain by increasing the percentage of textures that are stored in DXTC in the first place.

The ping-pong between memories is much like what PS2 does AFAIK - textures are decompressed as they travel between 'less local' and 'more local' memories. It's capable of being horribly inefficient: in the naive implementation I'm imagining, a single pixel in the scene needing that texture means you take the whole hit of decompression, and I'd imagine it forces very strict rendering-order restrictions on the hardware and so inhibits other render-order optimisations like front-to-back. It's potentially very bad when things go to multitexture with more than 2 textures, too.
 
I was thinking of a ping-pong rendering phase, rather than trying to do on-demand decompression.

For example in Xenos, with its predicated tiling, it "sorts" triangles into N bins, one bin per tile. That's, effectively, a rendering phase. Similarly with its adaptive tessellation, it seems to create a buffer containing tessellation factors and vertex data, which exists in between the first and second passes of tessellation.

While doing these passes, Xenos could also (I'm not suggesting it does - thinking of some future) identify the textures that need to be decompressed. It would then decompress textures from, say, JPEG, writing them to a DXTC texture buffer in memory.

Plainly there's still the issue of how much texture data is required to render any single frame. If that busts the 256MB/512MB limit then ping-ponged texture decompression isn't going to work.

I'm thinking more of the total texture data that's in the game - rather than the total texture data in a frame.
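
Roughly the kind of flow I have in mind, as a purely hypothetical sketch - this is not how Xenos works, and every name below is made up for illustration:

Code:

// Hypothetical ping-ponged decompression phase: a pre-pass records which
// textures the frame touches, a middle phase expands them from an offline
// format into DXTC in video memory, and the normal render pass samples them.
#include <cstdint>
#include <set>
#include <vector>

struct Texture {
    bool residentAsDxtc = false;     // already expanded into the DXTC pool?
    std::vector<uint8_t> packed;     // offline-compressed bits (e.g. a JPEG-style stream)
};

// Stand-ins for the interesting parts (stubbed out here).
void decodeToDxtc(Texture& t) { t.residentAsDxtc = true; }   // would write a DXTC buffer
void drawScene(const std::vector<Texture*>& scene) { (void)scene; }

void renderFrame(std::vector<Texture*>& scene)
{
    // Phase 1: the binning/tiling pre-pass also records the frame's texture working set.
    std::set<Texture*> workingSet(scene.begin(), scene.end());

    // Phase 2: expand anything not yet resident from its offline format into DXTC.
    for (Texture* t : workingSet)
        if (!t->residentAsDxtc)
            decodeToDxtc(*t);

    // Phase 3: ordinary rendering, sampling the DXTC copies as usual.
    drawScene(scene);
}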

The other issue has got to be "HDR" textures - textures with more than 8 bits per channel.

In theory Lost Coast should be a test case for how current cards are coping. It's for "256MB cards only" isn't it?

Jawed
 
Another thing is virtual memory support coming in DX10 - so that textures are only sent to the graphics card as they're needed.

Jawed
 
It'll take time until we see real on-chip adaptive tessellation. DX10 might be a step in the right direction when it comes to advanced HOS, but it's still not entirely where it could have been. IMHLO as always.

I personally would like to see not only more advanced forms of texture compression but also more advanced forms of geometry compression.
 
geo said:
Is there anything even on MS roadmaps for beyond DX10 at this point?

Pfffff ROFL....considering it's aiming to last at least as long as DX9.0, it's way too early to talk about any possible proposals from any side.
 
DXT and most common image compression schemes (jpeg, jpeg2000, png etc.) are fundamentally different. The main reason behind texture compression in realtime rendering is saving bandwidth (smaller memory usage is just a nice side effect :)) and you need constant time random texel access. That is why DXT is a block based compression with a fixed compression ratio (so you can have fast random lookup).

Traditional compression schemes are mostly for reducing image size, and the compression ratio in various portions of the image is different.

Then of course, you have intermediate things like VQ...
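
To illustrate the random-lookup point: with a fixed-rate block format, the address of the block holding any texel is a pure function of the coordinates, so the texture unit can fetch it in constant time with no per-texel index. A variable-rate stream (JPEG and friends) can't offer that. A simple sketch, ignoring the swizzling/tiling real hardware adds:

Code:

// Constant-time block addressing for a 4bpp fixed-rate format (8-byte 4x4 blocks).
#include <cstddef>
#include <cstdint>

size_t blockOffset(uint32_t x, uint32_t y, uint32_t texWidth)
{
    const uint32_t blocksPerRow = (texWidth + 3) / 4;      // 4x4-texel blocks
    const uint32_t blockX = x / 4;
    const uint32_t blockY = y / 4;
    return (size_t(blockY) * blocksPerRow + blockX) * 8;   // 8 bytes per block
}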
 
Ailuros said:
Pfffff ROFL....considering it's aiming to last at least as long as DX9.0, it's way too early to talk about any possible proposals from any side.

That's a pity. The limitations of a given DX rev were a lot less ouchful when it was a one year, or even two year thing. Now that we are 3+ years it is a bit on the painful side when this goodie or that gets deleted, yes?
 
geo said:
That's a pity. The limitations of a given DX rev were a lot less ouchful when it was a one year, or even two year thing. Now that we are 3+ years it is a bit on the painful side when this goodie or that gets deleted, yes?

The problem with the shorter life-cycles must have been that developers not only had to adjust to changes at short notice, they also didn't have a reasonably "safe prediction" of what was coming next. With DX9.0 they knew from day one that they'd have SM2.0, SM2.0+ and SM3.0 and that it would last about 4 years. I don't think DX10 has different shader models like DX9.0 did, probably due to the unlimited resources. IHVs will most likely (as usual) support all required functionality and, as time goes by, concentrate more and more on performance.

Besides, DX8.x especially was an absolute mess, with all the "my extension" / "your extension" crap between IHVs and the corresponding API updates that came along.

From what I've been reading, the initial plans for DX10 were way more ambitious when they started out; as usual I suppose different IHVs were screaming that they wouldn't have enough space to implement it all, and that's where the various reductions came from. If developers themselves didn't ask or wish for more functionality, the whole story would be of less interest. I'd expect to finally see the whole advanced-HOS whizz-bang with "DX Next Next", though, exactly because of what developers would like to see/have.
 
Imagine a compression algorithm that works in such a way that extracting area averages is a constant-time operation (like a summap, i.e. a summed-area table). It wouldn't need to be capable of arbitrary areas; emulating ripmaps would be more than enough IMO.
I'm wondering if such a compression technique would be useful. What do you think?
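
For reference, this is the constant-time area query a summed-area table gives you on uncompressed data - the open question is whether a compressed representation could keep that property. Just a sketch:

Code:

// Constant-time area average via a summed-area table (four lookups per query).
#include <cstdint>
#include <vector>

struct SummedAreaTable {
    uint32_t width = 0, height = 0;
    std::vector<uint64_t> sum;   // sum[y*width + x] = sum of all texels with xi <= x, yi <= y

    // Average over the inclusive rectangle [x0,x1] x [y0,y1].
    double average(uint32_t x0, uint32_t y0, uint32_t x1, uint32_t y1) const
    {
        auto S = [&](int x, int y) -> uint64_t {
            return (x < 0 || y < 0) ? 0 : sum[size_t(y) * width + size_t(x)];
        };
        const uint64_t total = S(x1, y1) - S(int(x0) - 1, y1)
                             - S(x1, int(y0) - 1) + S(int(x0) - 1, int(y0) - 1);
        const uint64_t area  = uint64_t(x1 - x0 + 1) * (y1 - y0 + 1);
        return double(total) / double(area);
    }
};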
 
bloodbob said:
Frankly I'd suggest it's basically a matter of adding detail maps or using Wang tiles or whatever. I reckon you'd be pretty hard pressed to save much bandwidth trying to use shaders to do some kind of compression; you might cut down on the size if you're clever, maybe.

I agree 100% with that, but being clever takes time (for most of us :) ), and time is money! Procedural textures are full of great opportunities, but they are a pain to integrate properly into a production pipeline. Where are the Tools!!!!

... Well, as I think about it, there is one. Not dedicated to real-time I guess, but still interesting enough:
http://www.allegorithmic.com/v2/zone_gallery_1.htm
Don't know much about it though...
 
Are people looking for better PSNR (quality) at a given compression ratio, or better ratios?

One possibility is to use a modified DXTC or VQ on YUV data instead of RGB data. Human beings are much more sensitive to luminance artifacts than to chroma artifacts.

For example, you could encode a 384-bit RGB block (not dealing with alpha here) at 4:1 (96 bits) like so:

Store two 8-bit Y (luminance) values, and use 3 bits per texel to select among 8 interpolated luminance levels across the block. This takes 64 bits.

For the chroma, store two U4:V4 values and use a 4:2:0 sampling format, so that instead of storing 16 chroma interpolants we store 4 U and 4 V samples at 2 bits each. This yields 16 bits + 8*2 = 32 bits.

The entire 384-bit 4x4 block can be compressed to 96 bits (4:1). Slightly worse than DXT1's 6:1, but perhaps with less visible artifacts?
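
Just to show the bit budget adds up, here's one possible packing of that block - my own arbitrary layout, purely illustrative:

Code:

// 96-bit YUV block for a 4x4 tile of 24-bit RGB texels (4:1).
#include <cstdint>

struct YuvBlock96 {
    uint8_t y0, y1;          // two 8-bit luma endpoints           -> 16 bits
    uint8_t yIndices[6];     // 16 texels x 3-bit luma index       -> 48 bits
    uint8_t uv0, uv1;        // two packed U4:V4 chroma endpoints  -> 16 bits
    uint8_t chromaIdx[2];    // 4 U + 4 V samples x 2-bit index    -> 16 bits
};                           // total: 96 bits = 12 bytes per 384-bit source block

static_assert(sizeof(YuvBlock96) == 12, "96 bits per 4x4 block");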


Another option is to look for different mathematical representations for the interpolation. DXTC stores information as a linear parameterization. That is, it stores 2 endpoints of a line, and then stores positions along that line. One could look for non-linear representations, or a change in coordinate system.


Also, one might look to implement entropy coding by using a windowed algorithm on larger block sizes. Today, 4x4 block sizes are used, but an 8x8 blocksize would yield 256 values to play with, making entropy encoding much more useful.

Another possibility is an entropy encoding algorithm that provides a compact index allowing the GPU to efficiently calculate which memory locations to fetch for decoding. Some sort of hybrid windowed approach which synchronizes the encoded stream to block boundaries every so often (giving up some efficiency) would have to be used to keep the index compact.
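
As a very rough sketch of what that compact index could look like - every name and number below is made up:

Code:

// Hypothetical sync-point index for an entropy-coded texture: the stream is
// resynchronised to a block boundary every RESYNC_INTERVAL blocks, and the
// index stores one bit offset per sync point. To reach block n, the decoder
// seeks to the nearest preceding sync point and decodes forward from there.
#include <cstdint>
#include <vector>

constexpr uint32_t RESYNC_INTERVAL = 16;   // blocks between sync points (arbitrary)

struct CompressedTexture {
    std::vector<uint8_t>  bitstream;       // entropy-coded block data
    std::vector<uint32_t> syncBitOffset;   // one entry per RESYNC_INTERVAL blocks
};

struct SeekPoint { uint32_t bitOffset; uint32_t blocksToSkip; };

SeekPoint seekToBlock(const CompressedTexture& tex, uint32_t blockIndex)
{
    const uint32_t sync = blockIndex / RESYNC_INTERVAL;
    return { tex.syncBitOffset[sync], blockIndex % RESYNC_INTERVAL };
}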
 
Any thoughts on Tim's comment about IHVs not having any real incentive to innovate due to the DX standardization? Can't really be true because ATI must've invested quite a bit to come up with 3Dc.
 
Actually, I think 3Dc proves the rule. It's a minor tweak and hence not a significant risk or investment.

However, I'm not sure it is the standardization aspect that is delaying investment, since that doesn't stop IHVs from adding extensions in other areas that DX had already standardized.

I think the biggest impediment is that DXTC is perceived as "good enough" and that the complexity needed to achieve any significant improvement is substantial. To wit, if people want an order-of-magnitude increase in compression ratio, or a huge increase in quality at a given bitrate, a much more complex algorithm must be used.

Perhaps a two-level compression technique is best: DXTC for decompression out of video RAM, and a different compression format to make PCIe uploads much faster (say, JPEG2000-compressed in main memory, decompressed by the GPU into video RAM on texture upload).

Either way, it's an enormous complexity increase, and of course compressed storage in main system memory fights against virtualization of textures.
 