Gubbi said:
Your Z buffer has to be the same resolution as your backbuffer, otherwise you won't be able to determine which subpixels to cull.
That is also true... I forgot about that... lol
Ok, let's recalculate... that would mean 37.5 MB for the back-buffer and Z-buffer and ~1.17 MB for the front-buffer.
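Just to make the arithmetic explicit, here is a quick back-of-the-envelope sketch in Python ( assuming the 640x480 target, 16 samples/pixel, 32 bpp color and a 32-bit Z-buffer discussed above ):

Code:
MB = 1024 * 1024
samples = 640 * 480 * 16                  # 4,915,200 subsamples at 16x supersampling

back_buffer  = samples * 4 / MB           # 18.75 MB of 32 bpp color
z_buffer     = samples * 4 / MB           # 18.75 MB of 32-bit Z
front_buffer = 640 * 480 * 4 / MB         # ~1.17 MB resolved frame

print(back_buffer + z_buffer)             # 37.5 MB for back-buffer + Z-buffer
print(front_buffer)                       # ~1.17 MB for the front-buffer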
I doubt that the Visualizer will only have 32 MB of e-DRAM ( I think 64 MB for the Visualizer would be a fair assumption [I would expect only 32 MB, at most, on the Broadband Engine] ).
I think 16 bits for the Z-buffer might be enough though ( we might do W-buffering to compensate for the depth-range distribution problem that a Z-buffer natively has, a problem that only gets worse with just 16 bits ), especially since we resolved lots of Z-fighting by doing Deferred Shading ( models tagged with displacement mapping would be sent regardless, though )...
In the case of a 24-bit Z-buffer...
14.06 MB + 18.75 MB = 32.81 MB, and the front-buffer is only ~1.17 MB
In the case of a 16-bit W-buffer/Z-buffer...
9.375 MB ( W/Z-buffer ) + 18.75 MB + 1.17 MB = ~29.3 MB
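Same sketch with the Z/W-buffer precision as a parameter, so the three cases can be compared directly ( still just a rough estimate ):

Code:
MB = 1024 * 1024
samples = 640 * 480 * 16                  # subsamples at 16x supersampling
back_buffer  = samples * 4 / MB           # 18.75 MB of 32 bpp color
front_buffer = 640 * 480 * 4 / MB         # ~1.17 MB resolved frame

for z_bits in (32, 24, 16):               # Z/W-buffer precision
    z_buffer = samples * (z_bits // 8) / MB
    print(z_bits, z_buffer, back_buffer + z_buffer, back_buffer + z_buffer + front_buffer)
# 32-bit Z: 18.75 MB  -> 37.5  MB with color, ~38.7 MB including the front-buffer
# 24-bit Z: 14.06 MB  -> 32.81 MB with color, ~34.0 MB including the front-buffer
# 16-bit Z:  9.375 MB -> 28.13 MB with color, ~29.3 MB including the front-buffer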
If the e-DRAM were only 32 MB this would leave ~2.7 MB for Texture Space, plus you would have the 2 MB of Local Storage in the APUs and the Image Caches to play with...
Not a problem as the Visualizer is not the only one that samples textures: texture sampling is mainly done in the Shading phase ( except for displacement maps ) and that will be distributed across the Broadband Engine and the Visualizer...
Also, we can still stream ( compressed ) textures from the external XDR... 25.6 GB/s is nothing to laugh at...
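As a rough upper bound ( this ignores everything else fighting for that bus, so it is purely illustrative ), 25.6 GB/s split per frame would look like this:

Code:
XDR_BW = 25.6e9                           # bytes/s of external XDR bandwidth (assumed)

for fps in (60, 30):
    per_frame = XDR_BW / fps / (1024 * 1024)
    print(fps, per_frame)                 # ~407 MB/frame at 60 fps, ~814 MB/frame at 30 fps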
Having ~2.7 MB in both the Visualizer's and the Broadband Engine's e-DRAM would mean a total of ~5.4 MB of compressed textures per frame ( not counting Texture Streaming [Virtual Texturing] from the external XDR )...
If we can de-compress VQ/S3TC compressed textures in real-time with the APUs ( achieving 6:1-8:1 compression ratios ) this would mean ~32.4-43.2 MB of uncompressed textures per frame...
This would mean up to ~8-11 "full" 1,024x1,024 32 bpp textures, mip-mapping excluded.
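Spelling out that texture-budget chain ( a rough sketch: the ~2.7 MB per e-DRAM pool, the 6:1-8:1 VQ/S3TC ratios and the 4 MB per 1,024x1,024 32 bpp texture are the assumptions from above ):

Code:
MB = 1024 * 1024
pool = 2.7                                # MB of texture space per e-DRAM pool
compressed = pool * 2                     # Visualizer + Broadband Engine = 5.4 MB
texture = 1024 * 1024 * 4 / MB            # 4 MB per 1,024x1,024 32 bpp texture, no mip-maps

for ratio in (6, 8):                      # VQ/S3TC-style compression ratios
    uncompressed = compressed * ratio
    print(ratio, uncompressed, uncompressed / texture)
# 6:1 -> 32.4 MB of texels, ~8 full textures
# 8:1 -> 43.2 MB of texels, ~10-11 full textures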
Not much, it would seem ( still not too bad ), but we are forgetting that we will be using a few procedurally generated textures ( perfect for ground textures as well ) and that we will not need to store full textures if we upload only the visible texels ( again taking some ideas from a Virtual Texturing approach ).
Please correct the wrong assumptions you think I made...
Also with 16x supersampling you get a 16x increase in number of micropolys. The whole idea of micropolys is that you don't have to scan-convert them (so they have to be smaller than your subpixels).
That is fine, I said we would be Shading limited more than fill-rate limited...
As long as the Shading part can provide us with enough micro-polygons we should be all set...
640 * 480 * 4 ( each micro-polygon is 1/4th of a pixel ) * 16 = ~19.6 M micro-polygons/frame, which at 60 fps would mean ~1.18 Billion micro-polygons/s...
I recognize that ~19.6 M micro-polygons/frame might be a bit on the high side, so I would not advise using this approach for a racing game or a fighter...
With the 16x AA and high-quality motion blur that a REYES approach would provide, lots of games could target a stable 30 fps, and that would mean ~590 M micro-polygons/s, which is achievable by the Broadband Engine and the Visualizer... Or you could go for 60 fps with 8x AA ( the same ~590 M micro-polygons/s )... the decision is yours...
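Just to put the micro-polygon throughput side by side ( a rough sketch using the same 4 micro-polygons per pixel figure from above, with the sample count scaling the total as pointed out earlier ):

Code:
# Micro-polygon throughput for a 640x480 target.
# Assumes 4 micro-polygons per pixel, scaled by the supersampling factor.
pixels = 640 * 480

for samples, fps in ((16, 60), (16, 30), (8, 60)):
    per_frame  = pixels * 4 * samples
    per_second = per_frame * fps
    print(samples, fps, per_frame / 1e6, per_second / 1e9)
# 16x @ 60 fps: ~19.66 M/frame -> ~1.18 Billion micro-polygons/s
# 16x @ 30 fps: ~19.66 M/frame -> ~0.59 Billion micro-polygons/s
#  8x @ 60 fps:  ~9.83 M/frame -> ~0.59 Billion micro-polygons/s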