Infinisearch
If you have time, could you sort back to front and tell me your performance differential, if any?
front to back
back to front
and random
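For reference, a minimal CPU-side sketch of how the three orderings could be set up; the DrawItem fields and function names are made up for illustration, and the random case is just a shuffle of the same list.

#include <algorithm>
#include <random>
#include <vector>

// Hypothetical per-draw record: a view-space depth plus whatever is needed
// to actually issue the draw call later.
struct DrawItem {
    float    viewDepth;   // distance along the view direction, larger = farther away
    unsigned firstIndex;
    unsigned indexCount;
};

// Front to back: nearest draws first, which maximizes early-Z rejection.
void sortFrontToBack(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.viewDepth < b.viewDepth; });
}

// Back to front: farthest draws first, the worst case for opaque overdraw.
void sortBackToFront(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.viewDepth > b.viewDepth; });
}

// Random order, as a baseline for the comparison asked for above.
void shuffleDraws(std::vector<DrawItem>& items, unsigned seed) {
    std::mt19937 rng(seed);
    std::shuffle(items.begin(), items.end(), rng);
}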
In that example it seems quad overshading could be a concern, and sadly there is nothing you can really do about it. You could measure it precisely, though.
Also, you can sort by screen-space position. This improves ROP cache utilization.
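A sketch of one way to get that screen-space locality, assuming you sort draws by a Morton (Z-order) code of their projected centers; the layout and names are illustrative, not something from this thread.

#include <algorithm>
#include <cstdint>
#include <vector>

// Spread the bits of a 16-bit value so they can be interleaved with another axis.
static std::uint32_t part1By1(std::uint32_t x) {
    x &= 0x0000ffff;
    x = (x | (x << 8)) & 0x00ff00ff;
    x = (x | (x << 4)) & 0x0f0f0f0f;
    x = (x | (x << 2)) & 0x33333333;
    x = (x | (x << 1)) & 0x55555555;
    return x;
}

// Morton (Z-order) code: items that are close on screen get nearby keys.
static std::uint32_t mortonCode(std::uint32_t x, std::uint32_t y) {
    return (part1By1(y) << 1) | part1By1(x);
}

struct ScreenDrawItem {
    std::uint32_t screenX, screenY; // projected object center in pixels (illustrative)
    std::uint32_t sortKey;
};

// Sort so draws that land near each other on screen are also issued near each
// other in time, which keeps the ROPs working on the same tiles.
void sortByScreenLocality(std::vector<ScreenDrawItem>& items) {
    for (ScreenDrawItem& d : items)
        d.sortKey = mortonCode(d.screenX, d.screenY);
    std::sort(items.begin(), items.end(),
              [](const ScreenDrawItem& a, const ScreenDrawItem& b) { return a.sortKey < b.sortKey; });
}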
OpenGL 4.4 has tessellation, indirect draw, compute shaders and even MultiDrawIndirect (to replace ExecuteIndirect). If he needs to stick with OpenGL 3.3, then these improvements are not possible.

@sebbbi He's on OpenGL, not DirectX. It might drive him a little crazy to achieve what you suggest. Additive note: a tessellation factor of 0 is also good for culling.
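To illustrate the tessellation-factor-of-0 point: a tessellation control shader can zero the levels for patches it decides to cull, and the fixed-function tessellator then discards them. The GLSL below is only a sketch (wrapped in a C++ string literal so the examples here stay in one language); it assumes the vertex shader already outputs clip-space positions and uses a trivial "everything behind the near plane" test, where a real implementation would test a bounding volume.

// Sketch only: a GL 4.x tessellation control shader that culls whole patches
// by emitting a tessellation level of 0.
static const char* kCullingTCS = R"GLSL(
#version 440 core
layout(vertices = 3) out;

void main() {
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    if (gl_InvocationID == 0) {
        // Trivial example test: if every control point is behind the near
        // plane (clip-space z < -w), the patch cannot be visible.
        bool allBehind = true;
        for (int i = 0; i < 3; ++i)
            allBehind = allBehind && (gl_in[i].gl_Position.z < -gl_in[i].gl_Position.w);

        // A level of 0 tells the fixed-function tessellator to discard the patch.
        float level = allBehind ? 0.0 : 1.0;
        gl_TessLevelOuter[0] = level;
        gl_TessLevelOuter[1] = level;
        gl_TessLevelOuter[2] = level;
        gl_TessLevelInner[0] = level;
    }
}
)GLSL";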
This is a good idea if you only need spheres. Simple to implement, and it saves half of the work.

You could render viewer-facing hemispheres to avoid processing (most) backfaces. If the mesh is meant to be dense enough to get smooth silhouettes, then you probably won't notice the rotation, either.
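If you go the hemisphere route, the per-instance orientation is just a basis whose pole points at the camera. A small sketch using glm, assuming the hemisphere mesh is authored with its pole along local +Z; the names are mine, not from the thread.

#include <cmath>
#include <glm/glm.hpp>

// Build a rotation whose third column (local +Z, assumed to be the hemisphere's
// pole) points from the sphere center toward the camera.
glm::mat3 hemisphereOrientation(const glm::vec3& sphereCenter, const glm::vec3& cameraPos) {
    glm::vec3 toCamera = glm::normalize(cameraPos - sphereCenter);

    // Any vector not parallel to toCamera works as a helper for building the basis.
    glm::vec3 helper = (std::fabs(toCamera.y) < 0.99f) ? glm::vec3(0.0f, 1.0f, 0.0f)
                                                       : glm::vec3(1.0f, 0.0f, 0.0f);
    glm::vec3 right = glm::normalize(glm::cross(helper, toCamera));
    glm::vec3 up    = glm::cross(toCamera, right);

    // Columns are the hemisphere's local X, Y and Z axes expressed in world space.
    return glm::mat3(right, up, toCamera);
}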
All the math you need:

Take care. Spheres are not circles in screen space under perspective projection.
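Since the original link didn't survive the quote, here is one standard way to bound the projected sphere per axis with the tangent-angle construction (not necessarily what was linked). It assumes an OpenGL-style view space looking down -Z and a sphere entirely in front of the near plane; all names are illustrative.

#include <cmath>
#include <optional>

struct AxisBounds { float ndcMin, ndcMax; };

// Conservative NDC extent of a view-space sphere along one screen axis.
// 'centerAxis' is the sphere center's x (or y) in view space, 'centerZ' its
// view-space z (negative in front of the camera), and 'projScale' the matching
// projection diagonal (P[0][0] for x, P[1][1] for y).
std::optional<AxisBounds> sphereAxisBounds(float centerAxis, float centerZ,
                                           float radius, float projScale) {
    const float forward = -centerZ;  // distance along the view direction
    const float d = std::sqrt(centerAxis * centerAxis + forward * forward);
    if (d <= radius)
        return std::nullopt;         // camera is inside the sphere's slab, no finite bound

    const float theta = std::atan2(centerAxis, forward); // angle of the center off the view axis
    const float phi   = std::asin(radius / d);           // half-angle subtended by the sphere
    // The two tangent directions project to tan(theta -/+ phi) on the image
    // plane. Note the extents are not symmetric around the projected center,
    // which is exactly why the silhouette is not a circle.
    return AxisBounds{ projScale * std::tan(theta - phi),
                       projScale * std::tan(theta + phi) };
}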
Oh, I did that in 2008 with a fragment shader, that was fun ^^

You could also render screen-space bounded quads and calculate the sphere intersection analytically. Use discard (clip) when the pixel misses the sphere. The normal vector at a sphere's surface pixel is (pixel.position - sphere.position). If you want spheres to intersect properly with other scene geometry, you should output depth from the pixel shader. This method should be fully pixel shader (or fill) bound. You don't need to implement LODs, either.
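A minimal GLSL sketch of that analytic impostor, again wrapped in a C++ string so the examples stay in one language. The interface (uSphere, uProj, vRayDir) and the assumption that a vertex shader covers the sphere with a bounding quad and passes a view-space ray direction are mine, not from the post.

// Fragment shader for an analytic sphere impostor: discard on miss, derive the
// normal from the hit point, and write depth so it composites with real geometry.
static const char* kSphereImpostorFS = R"GLSL(
#version 330 core

in  vec3 vRayDir;      // view-space ray through this fragment (from the quad's vertex shader)
out vec4 fragColor;

uniform vec4 uSphere;  // xyz = view-space sphere center, w = radius
uniform mat4 uProj;    // projection matrix, needed to output a correct depth

void main() {
    vec3  d = normalize(vRayDir);            // ray direction; the eye is at the view-space origin
    vec3  c = uSphere.xyz;
    float r = uSphere.w;

    // Solve |t*d - c|^2 = r^2 for the nearest positive t.
    float b    = dot(d, c);
    float disc = b * b - dot(c, c) + r * r;
    if (disc < 0.0)
        discard;                             // this pixel misses the sphere
    float t = b - sqrt(disc);
    if (t < 0.0)
        discard;                             // sphere is behind the camera

    vec3 p = t * d;                          // view-space hit position
    vec3 n = normalize(p - c);               // normal straight from the sphere geometry

    // Output depth so the impostor intersects other scene geometry correctly
    // (assumes the default glDepthRange of [0, 1]).
    vec4 clip = uProj * vec4(p, 1.0);
    gl_FragDepth = clip.z / clip.w * 0.5 + 0.5;

    fragColor = vec4(n * 0.5 + 0.5, 1.0);    // visualize the normal; replace with real shading
}
)GLSL";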
If you want me to test on a GTX 970 to see how many spheres it can handle, just let me know.
I just remembered one last thing you can try. Make sure your vertices won't cross a cache line boundary by making them 16, 32 or 64 bytes per vertex per stream. Add padding to reach the goal. It can increase memory consumption, but that's not much of an issue for your current use case. Also, are you using 16-bit indices?
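As an illustration of the padding idea (the field choice is made up, only the sizes matter): pick a layout that lands exactly on 16, 32 or 64 bytes and assert it so it can't silently drift.

#include <cstdint>

// One possible 32-byte vertex: 12 + 12 + 4 + 4 bytes, so a vertex never
// straddles a cache line when the buffer itself is cache-line aligned.
struct Vertex {
    float         position[3];  // 12 bytes
    float         normal[3];    // 12 bytes
    std::uint16_t uv[2];        //  4 bytes (quantized UVs)
    std::uint32_t pad;          //  4 bytes of padding to reach 32
};
static_assert(sizeof(Vertex) == 32, "keep the vertex a power-of-two size");

// 16-bit indices halve index bandwidth, as long as each mesh stays under 65,536 vertices.
using Index = std::uint16_t;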
Your advice is correct. 16 bytes and 32 bytes are perfect targets for vertex size optimizations (and even 64, if you really need more data and your vertex shader is complex enough to hide the memory latency). Cache-line aligning your vertex data is always a good idea.

You should listen to sebbbi's size advice before mine... mine was unconfirmed from about 15 years ago, so I don't know if it's still applicable. Oh, and do me a favor and check the front-to-back rendering again on the new machines; I'm curious if it's the card in question (if you have time).