Do I understand correctly that texture filtering is now done by the shaders?
So, for example, for trilinear filtering 8 texels need to be fed to the shaders, which then filter them down to one value...
Obviously this saves some filtering ALUs. It probably also increases filtering accuracy, as the fixed-function filtering was only about 8 bits accurate.
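To make the ALU cost concrete, here's a minimal sketch of trilinear filtering in Python (not actual shader code; the lerp/trilinear names and argument layout are made up for illustration). It's just seven lerps per channel over the 8 fetched texels, 4 from each of two adjacent mip levels:

def lerp(a, b, t):
    # linear interpolation between a and b by fraction t
    return a + (b - a) * t

def trilinear(t, fx, fy, fz):
    # t: 8 texel values, t[0:4] from mip N, t[4:8] from mip N+1
    # fx, fy: bilinear fractions within each mip; fz: fraction between mips
    x00 = lerp(t[0], t[1], fx)
    x01 = lerp(t[2], t[3], fx)
    x10 = lerp(t[4], t[5], fx)
    x11 = lerp(t[6], t[7], fx)
    lo = lerp(x00, x01, fy)   # bilinear result on mip N
    hi = lerp(x10, x11, fy)   # bilinear result on mip N+1
    return lerp(lo, hi, fz)   # blend between the two mips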
Since a lot more data needs to be passed to the shaders, it seems the bandwidth from the L1 texture caches to the shaders has doubled.
Total L1 bandwidth is claimed to be around 1 TB/s.
For bilinear 8-bit RGBA you need 4 × 4 = 16 bytes per sample, so up to 1 TB/s / 16 B ≈ 60 GTex/s is possible.
Per SIMD cluster the L1 bandwidth is thus 1 TB/s / 20 = 50 GB/s.
The bus width from L1 to a SIMD should thus be 50 GB/s / 850 MHz ≈ 59 bytes.
Almost certainly this is really 64 bytes, making the L1 bandwidth 20 × 64 B × 850 MHz = 1.088 TB/s and the bilinear filter rate 68 GTex/s, exactly as claimed.
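A quick back-of-envelope check of the numbers above in Python (20 SIMDs, 64-byte bus and 850 MHz clock are the estimates from this post, not official specs):

simds = 20
clock = 850e6                        # Hz
bus = 64                             # bytes per SIMD per clock (rounded up from ~59)
l1_bw = simds * bus * clock          # total L1 bandwidth in bytes/s
bilinear_cost = 4 * 4                # 4 texels x 4 bytes (RGBA8) per bilinear sample
print(l1_bw / 1e12)                  # 1.088 (TB/s)
print(l1_bw / bilinear_cost / 1e9)   # 68.0 (GTex/s)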