Lately I have been optimizing the vertex and texture inputs of our deferred rendering geometry pass. When rendering the geometry buffer, bandwidth is usually the limiting factor while the ALUs sit almost idle. Because of this, extra ALU work has almost no performance impact, so the inputs should be stored as compactly as possible and decompressed with ALU instructions.
This subject has not been discussed much in publications yet, so I thought it would be best to ask for some educated advice. I have around 10 years of experience in 3d graphics programming, but this area is fairly new to me.
This is my current geometry vertex format in our DX9 render path:
Position:
- I started optimizing from one 32 bit float 3d vector (12 bytes).
- Now: One 16 bit float 4d vector (8 bytes).
- Compression ratio: 1.5x
- Notes: The w-channel is currently unused, so something useful still needs to be stored in it!
- Quality: All our object vertices are centered around the object center point, and all objects are of manageable size (object based culling also needs this to work at best performance). Because of these factors, the 16 bit float precision has not shown any visible artifacts in our testing.
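To put rough numbers on why object-centered vertices and manageable object sizes matter here, the following is a minimal sketch that measures the gap between neighbouring 16 bit floats at a given distance from the object center. The helper names (`to_half`, `half_spacing`, `pack_position`) and the `w=0.0` placeholder are my own assumptions for illustration, not part of the actual vertex pipeline.

```python
import math
import struct

def to_half(x):
    """Round x to the nearest IEEE 754 half float, returned as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def half_spacing(x):
    """Gap between neighbouring half floats around x (x must be finite and in range)."""
    if x == 0.0:
        return 2.0 ** -24                 # smallest subnormal half
    _, e = math.frexp(abs(x))             # abs(x) = m * 2**e with 0.5 <= m < 1
    return 2.0 ** (max(e - 1, -14) - 10)  # 10 explicit mantissa bits

def pack_position(x, y, z, w=0.0):
    """Pack an object-space position into four half floats (8 bytes).
    w is only a placeholder here (assumption); its contents are still open."""
    return struct.pack("<4e", x, y, z, w)

# The spacing grows with the distance from the object center.
for distance in (1.0, 10.0, 100.0):
    print(distance, half_spacing(distance))
```

With these helpers, a coordinate 100 units from the center can only land on values 0.0625 units apart, while a coordinate 1 unit from the center is resolved to about 0.001 units. Whether that is acceptable depends on the world scale and how large the objects get.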
Texture coordinates:
- I started optimizing from one 32 bit float 2d vector (8 bytes).
- Now: One 16 bit float 2d vector (4 bytes).
- Compression ratio: 2.0x
- Quality: No quality degradation can be seen with our content.
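As a rough way to judge when 16 bit float texture coordinates might start to show, this sketch converts the smallest representable UV step near a coordinate into texels. The function names and the example texture size of 2048 are assumptions for illustration only. It handles non-negative, finite coordinates.

```python
import struct

def half_round(u):
    """Round u to the nearest half float, returned as a Python float."""
    return struct.unpack("<e", struct.pack("<e", u))[0]

def next_half_up(u):
    """Next representable half float above the half nearest to u (u >= 0)."""
    bits = struct.unpack("<H", struct.pack("<e", u))[0]
    # For non-negative halfs, adding 1 to the bit pattern gives the next value up.
    return struct.unpack("<e", struct.pack("<H", bits + 1))[0]

def uv_step_in_texels(u, texture_size):
    """Smallest UV increment near u, expressed in texels of the given texture."""
    return (next_half_up(u) - half_round(u)) * texture_size

for u in (0.25, 0.75, 1.5):
    print(u, uv_step_in_texels(u, 2048))
```

Under these assumptions, coordinates in the upper half of the 0..1 range can only be placed in whole-texel steps on a 2048 texture, and coordinates above 1.0 (tiling) get coarser still. That may be worth keeping in mind for large atlases.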
Normal, binormal, tangent:
- I started optimizing from three 32 bit float 3d vectors (36 bytes).
- Now: One signed 16 bit normalized 4d vector (SHORT4N) (8 bytes). I store normal x,y in the x,y channels and tangent x,y in the z,w channels. I reserve one bit of each component for a sign flag.
- I reconstruct normal.z and tangent.z from the unit length constraint, z = sqrt(1 - x^2 - y^2), and multiply them by the sign bits (stored in one bit of the x and z channels).
- I calculate the binormal as the cross product of the normal and tangent and multiply it by the sign bit (stored in one bit of the w channel).
- Compression ratio: 4.5x
- Quality: No quality degradation can be seen. After decompression you basically have all three vectors reconstructed at 15-15-15 bit precision. The precision is actually too good...
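The packing side of this scheme could look roughly like the sketch below. The exact bit layout is an assumption on my part: each 16 bit short holds a 15 bit signed value in its upper bits and the flag in its lowest bit, and the function and constant names are hypothetical. The flag in the y channel is left unused, since only three flags are needed.

```python
import struct

Q_MAX = 16383  # largest magnitude of a 15 bit signed value (assumed layout)

def _pack_component(value, flag):
    """Quantize value in [-1, 1] to 15 bits and put the flag in the lowest bit."""
    q = round(max(-1.0, min(1.0, value)) * Q_MAX)
    return (q << 1) | (1 if flag else 0)

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pack_tangent_frame(normal, tangent, binormal):
    """Pack unit normal, tangent and binormal into four signed shorts (8 bytes)."""
    c = _cross(normal, tangent)
    # Flag set when the stored binormal points against cross(normal, tangent).
    flip = (c[0] * binormal[0] + c[1] * binormal[1] + c[2] * binormal[2]) < 0.0
    shorts = (
        _pack_component(normal[0], normal[2] < 0.0),    # x: normal.x + sign of normal.z
        _pack_component(normal[1], False),              # y: normal.y, flag unused
        _pack_component(tangent[0], tangent[2] < 0.0),  # z: tangent.x + sign of tangent.z
        _pack_component(tangent[1], flip),              # w: tangent.y + binormal sign
    )
    return struct.pack("<4h", *shorts)

packed = pack_tangent_frame((0.0, 0.0, -1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(struct.unpack("<4h", packed))
```

Putting the flag in the lowest bit keeps the quantized value in the high bits, so a decoder that ignored the flags would still read a value that is off by at most one step.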
Total:
Original vertex with a single set of texture coordinates and tangents: 56 bytes
Compressed vertex with a single set of texture coordinates and tangents: 20 bytes (8 + 4 + 8)
Current compression ratio: 2.8x
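As a cross-check, the per-attribute sizes listed above can be summed with Python's `struct.calcsize`. The two format strings are my shorthand for the layouts described in this post (assuming tightly packed attributes with no padding), not actual vertex declarations.

```python
import struct

# "<" means no alignment padding; f = 32 bit float, e = 16 bit float, h = signed 16 bit int
ORIGINAL = "<3f2f9f"    # float3 position, float2 UV, three float3 basis vectors
COMPRESSED = "<4e2e4h"  # half4 position, half2 UV, SHORT4N normal/tangent data

original_size = struct.calcsize(ORIGINAL)
compressed_size = struct.calcsize(COMPRESSED)
ratio = original_size / compressed_size

print(original_size, compressed_size, round(ratio, 2))  # prints: 56 20 2.8
```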
Extra instructions for decompression:
- No extra calculations for position and texture coordinates.
- For normal/binormal/tangent I have to separate the sign bits, calculate normal and tangent z-components (sqrt(1 - x^2 - y^2) * sign), and calculate binormal (cross(normal, tangent) * sign).
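The decompression steps above can be sketched on the CPU like this. The bit layout is an assumption (15 bit signed value in the upper bits of each short, flag in the lowest bit), and the function names are hypothetical. A DX9 shader has no integer bit operations, so it would have to do the equivalent split with floating point math.

```python
import math
import struct

Q_MAX = 16383  # largest magnitude of a 15 bit signed value (assumed layout)

def _split(short):
    """Return (value in [-1, 1], flag) from one packed signed 16 bit component."""
    return (short >> 1) / Q_MAX, short & 1

def _reconstruct_z(x, y, negative):
    """z from the unit length constraint; max() guards against quantization error."""
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return -z if negative else z

def unpack_tangent_frame(packed):
    """Rebuild (normal, tangent, binormal) from 8 bytes of packed shorts."""
    sx, sy, sz, sw = struct.unpack("<4h", packed)
    nx, nz_negative = _split(sx)
    ny, _unused = _split(sy)
    tx, tz_negative = _split(sz)
    ty, flip = _split(sw)
    normal = (nx, ny, _reconstruct_z(nx, ny, nz_negative))
    tangent = (tx, ty, _reconstruct_z(tx, ty, tz_negative))
    sign = -1.0 if flip else 1.0
    # binormal = cross(normal, tangent) * sign
    binormal = (sign * (normal[1] * tangent[2] - normal[2] * tangent[1]),
                sign * (normal[2] * tangent[0] - normal[0] * tangent[2]),
                sign * (normal[0] * tangent[1] - normal[1] * tangent[0]))
    return normal, tangent, binormal

# Hand-built input: normal (0, 0, -1), tangent (1, 0, 0), flipped binormal.
print(unpack_tangent_frame(struct.pack("<4h", 1, 0, 32766, 1)))
```

Per vertex this costs two square roots, one cross product and a few multiplies, which matches the instruction list above.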
Currently I am pretty happy with the normal/tangent/binormal compression, but the texture coordinate and especially the position compression still need some work. What kind of vertex compression have you used in your projects, or have uncompressed floats served your projects just fine?