Techniques awful in their intended use, but brilliantly repurposed

inlimbo

Newcomer
I've got one example on my mind: pixel art scaling algorithms like 2xSaI and hqx.

At best they offer a half-there solution to what was never really a problem in the first place. But lately developers have found a way to use these algorithms in a pretty genius way, as a means of smoothing very low-res shadow maps. (Arkham Knight is one notable example, and the first time I noticed it.)

It's a brilliant repurposing of a graphical technique with only limited previous utility, as it lets developers throw a bunch of low-res shadow maps at a scene without a care, saving on performance and even delivering appropriately blobby shadows in those instances where you'd expect them.
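To make the idea concrete, here is a minimal sketch of edge-aware 2x upscaling applied to a tiny binary shadow mask. It uses Scale2x (EPX) as a simpler stand-in for 2xSaI/hqx, and the mask and function name are made up for illustration. It is not what Arkham Knight or any other shipped game is confirmed to do.

```python
def scale2x(img):
    """Upscale a 2D grid 2x with the Scale2x (EPX) rule.

    Stand-in for 2xSaI/hqx; borders are handled by clamping (an assumption).
    """
    h, w = len(img), len(img[0])
    out = [[0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            e = img[y][x]
            up = img[max(y - 1, 0)][x]
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, w - 1)]
            down = img[min(y + 1, h - 1)][x]
            e0 = e1 = e2 = e3 = e
            # Only reshape the pixel where a diagonal edge passes through it.
            if up != down and left != right:
                if left == up:
                    e0 = left
                if up == right:
                    e1 = right
                if left == down:
                    e2 = left
                if down == right:
                    e3 = right
            out[2 * y][2 * x] = e0
            out[2 * y][2 * x + 1] = e1
            out[2 * y + 1][2 * x] = e2
            out[2 * y + 1][2 * x + 1] = e3
    return out


# Hypothetical 3x3 shadow mask: 1 = in shadow, 0 = lit, with a diagonal edge.
mask = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
smoothed = scale2x(mask)
```

The centre texel is lit, but its top-left output quadrant gets pulled into the shadow, which turns the staircase into something closer to a diagonal line instead of plain blocky magnification.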

But that's the only example I've got offhand. Can you think of any others?
 
LOVE the thinking behind this thread. Hopefully it won't be forgotten all too soon. :)
 
What about sebbbi's MSAA trick? That seems to work wonders at least on the consoles.
 
Guerrilla Games took advantage of 2xMSAA Quincunx to increase shadow sampling at the time. Was quincunx awful? :p

Arkham Knight is one notable example, and the first time I noticed it.

Not sure if the particular devs did the same exactly, but I've seen blobby shadows in Serious Sam 3 and the player shadow in Halo 4.
 
I can't be sure it's quite the same technique, but it's gotta be roughly the same idea, right? I mean especially the projected shadows used in interiors in Arkham Knight.
 
Guerrilla Games took advantage of 2xMSAA Quincunx to increase shadow sampling at the time. Was quincunx awful? :p
They actually used a 2xMSAA buffer to supersample lighting, then used half the number of shadow samples for each MSAA subsample. (Thus 1x samples per pixel, and half per surface when the MSAA subsamples covered different surfaces.)
Quincunx filtering is done a lot later, so it didn't really affect shadows.
 
You know, now that I'm watching Zelda Wii U gameplay it occurs to me that Arkham Knight isn't the first time I noticed that kind of filtering on shadow maps - Wind Waker HD is. And it makes great use of it.
 
Yeah, quincunx could be nice for cheap blurring of VSM/EVSM shadow maps. We used 4xMSAA on Xbox 360 in our EVSM shadow map rendering implementation (MSAA was free on Xbox 360, as long as the RT was small enough to fit in EDRAM at once).
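A rough sketch of why an MSAA resolve can act as a shadow blur: variance shadow map moments are linearly filterable, so averaging the subsamples of a pixel (a box resolve) gives a soft edge. This uses plain VSM rather than EVSM to keep it short, and the function names and the `min_variance` value are assumptions for illustration.

```python
def resolve_moments(subsample_depths):
    """Box-resolve one pixel: average depth and depth^2 over its MSAA subsamples."""
    n = len(subsample_depths)
    m1 = sum(subsample_depths) / n
    m2 = sum(d * d for d in subsample_depths) / n
    return m1, m2


def chebyshev_upper_bound(m1, m2, receiver_depth, min_variance=1e-6):
    """Standard VSM visibility estimate from the two resolved moments."""
    if receiver_depth <= m1:
        return 1.0  # receiver is in front of the average occluder: fully lit
    variance = max(m2 - m1 * m1, min_variance)
    d = receiver_depth - m1
    return variance / (variance + d * d)


# Hypothetical 4xMSAA shadow map pixel on an occluder edge:
# two subsamples hit the occluder (0.2), two hit the receiver plane (0.8).
m1, m2 = resolve_moments([0.2, 0.2, 0.8, 0.8])
lit = chebyshev_upper_bound(m1, m2, receiver_depth=0.8)
```

With half of the subsamples occluded, the estimate comes out at roughly half lit, which is the partial coverage you would want on that edge.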

Modern GPUs have programmable sampling patterns, so the flipquad pattern is preferable to the quincunx pattern, as it produces much better coverage. Cost is identical. http://fileadmin.cs.lth.se/cs/Personal/Tomas_Akenine-Moller/pubs/flipquad_tr.pdf. But like quincunx it also blurs the image a bit, so it is not good enough for visible geometry (shadow antialiasing would work fine). Some devs have tried to use flipquad for visible geometry with a smart sharpening filter, but I personally don't like first blurring and then sharpening (as it is information loss after all).
 
Some low level stuff (also works on PC):
- Abusing MSAA hardware for variable rate shading.
- Abusing MSAA hardware replication (and compression) for per triangle constant data, such as triangle/material id.
- Abusing MSAA resolve for blurring shadows. Especially on Xbox 360, as the MSAA was free, and MSAA resolve cost was identical to standard resolve (no extra BW cost).
- Abusing DDX/DDY fine+coarse (derivatives) to share data between 2x2 quad pixel shader threads.
- Future: Abusing delta compression ("fast clear bit") for sparse data structures
- Future: Run "compute" tasks as pixel/vertex shaders concurrently with rendering on DX12/Vulkan. Works also on Nvidia/Intel hardware. See DDX/DDY trick above for data sharing (as LDS is not available).
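For the DDX/DDY item in the list above, here is a small CPU-side simulation of the idea, assuming the usual fine-derivative behaviour where each lane sees the difference between the two pixels in its own row or column of the 2x2 quad. The function names are made up, and the coarse variants are left out.

```python
def ddx_fine(quad, x, y):
    """Simulated fine x-derivative: right minus left within this lane's row."""
    return quad[y][1] - quad[y][0]


def ddy_fine(quad, x, y):
    """Simulated fine y-derivative: bottom minus top within this lane's column."""
    return quad[1][x] - quad[0][x]


def horizontal_neighbor(quad, x, y):
    """Recover the other lane's value in the same row from own value + derivative."""
    own = quad[y][x]
    d = ddx_fine(quad, x, y)
    return own + d if x == 0 else own - d


def vertical_neighbor(quad, x, y):
    """Recover the other lane's value in the same column."""
    own = quad[y][x]
    d = ddy_fine(quad, x, y)
    return own + d if y == 0 else own - d


# Hypothetical per-lane values held by the four threads of one 2x2 quad.
# Integers are used so the add/subtract round trip is exact; with floats
# the reconstruction can pick up rounding error.
quad = [
    [10, 20],
    [30, 40],
]
```

Each lane only ever reads its own value plus a derivative, which is what makes this usable as a poor man's data exchange where LDS is not available.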

Many engines have temporal AA. The same reprojection / caching pipeline is also nowadays used for stochastic sampling (of AO, reflections, transparency, decals, etc, etc).
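A toy sketch of the caching half of that pipeline: one noisy sample per frame blended into a history value with an exponential moving average. Reprojection is skipped entirely (static camera assumed), and the blend factor, the 0.7 visibility value and the function names are all assumptions for illustration.

```python
import random


def accumulate(history, sample, alpha):
    """Blend the new frame's sample into the history (exponential moving average)."""
    return history + alpha * (sample - history)


def noisy_sample(rng, true_value=0.7):
    """One stochastic sample per frame: 1.0 with probability true_value, else 0.0."""
    return 1.0 if rng.random() < true_value else 0.0


def run(frames=2000, alpha=0.02, seed=1):
    """Accumulate one noisy sample per frame for a single static pixel."""
    rng = random.Random(seed)
    history = 0.0
    for _ in range(frames):
        history = accumulate(history, noisy_sample(rng), alpha)
    return history


estimate = run()
```

After enough frames the history settles near the true value, even though every individual frame only contributes a single black-or-white sample. A smaller `alpha` gives less noise but more ghosting once things start moving.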

Update: Most of these techniques are not "awful" for their intended purpose. 4xMSAA was almost impossible to use for its intended purpose on Xbox 360 as the 10 MB EDRAM was too small to contain a 4xMSAA main render target. MSAA popularity (for antialiasing) has also decreased a lot because deferred shading does not combine with it well. MSAA hardware is still very useful for these tricks. Compute on pixel/vertex shaders is not awful, but it is not as good as real compute shaders. Unfortunately it is the only way to achieve concurrent execution on DX12/Vulkan on Nvidia / Intel, meaning that "hacky" vertex/pixel shader compute will (again) be popular.
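The EDRAM point is easy to check with back-of-the-envelope arithmetic. The sketch below assumes a 1280x720 main render target with 32-bit color plus 32-bit depth per sample, and treats 10 MB as 10 * 1024 * 1024 bytes. The 512x512 shadow target is a made-up example size, not a figure from the post.

```python
EDRAM_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" taken as 10 MiB


def render_target_bytes(width, height, msaa_samples, bytes_color=4, bytes_depth=4):
    """Size of color + depth for a render target, counting every MSAA sample."""
    return width * height * msaa_samples * (bytes_color + bytes_depth)


def fits_in_edram(width, height, msaa_samples):
    """True if the whole target fits in EDRAM at once (no tiling needed)."""
    return render_target_bytes(width, height, msaa_samples) <= EDRAM_BYTES


main_1x = render_target_bytes(1280, 720, 1)   # 7,372,800 bytes
main_2x = render_target_bytes(1280, 720, 2)   # 14,745,600 bytes
main_4x = render_target_bytes(1280, 720, 4)   # 29,491,200 bytes
shadow_4x = render_target_bytes(512, 512, 4)  # 8,388,608 bytes
```

Under these assumptions a 720p target only fits without MSAA, while a small 4xMSAA shadow target still fits in one go, which is the case where the MSAA comes for free.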
 
Yup, good old MSAA hardware still has a lot to offer, it seems.
Hoping for 16x support in hardware; it would be fantastic to have a full 4x4 sample grid.

I've been wondering about rotated grid MSAA and reconstruction to a bigger resolution for a while.
Something like using rotated grid 4xMSAA at 1080p to get a 4K frame with 2x subpixel information for long vertical/horizontal edges. (8x for 4.)
Not sure if it is feasible to do, or whether it is just too much work for the cost. (Small detail might look silly.)
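A quick sanity check of the "2x subpixel information" claim, as a sketch. It assumes a rotated-grid 4x pattern with the sample positions commonly documented for the standard D3D 4x pattern, expressed in [0, 1) pixel coordinates, and compares it against an ordered 2x2 grid. The function names are made up.

```python
from fractions import Fraction as F

# Assumed rotated-grid 4x sample positions inside one 1080p pixel.
ROTATED_4X = [(F(3, 8), F(1, 8)), (F(7, 8), F(3, 8)),
              (F(1, 8), F(5, 8)), (F(5, 8), F(7, 8))]
# Ordered 2x2 grid for comparison.
ORDERED_4X = [(F(1, 4), F(1, 4)), (F(3, 4), F(1, 4)),
              (F(1, 4), F(3, 4)), (F(3, 4), F(3, 4))]


def samples_per_output_pixel(pattern, scale=2):
    """How many samples land in each output (e.g. 4K) pixel of a scale x scale split."""
    counts = {}
    for x, y in pattern:
        key = (int(x * scale), int(y * scale))
        counts[key] = counts.get(key, 0) + 1
    return counts


def distinct_offsets_per_column(pattern, scale=2):
    """Distinct horizontal sample positions inside each output pixel column."""
    cols = [set() for _ in range(scale)]
    for x, _ in pattern:
        cols[int(x * scale)].add(x)
    return [len(c) for c in cols]


rotated_cols = distinct_offsets_per_column(ROTATED_4X)  # [2, 2]
ordered_cols = distinct_offsets_per_column(ORDERED_4X)  # [1, 1]
```

Both patterns put exactly one sample in each 4K pixel, but the rotated grid has two distinct horizontal positions per 4K pixel column (and the same vertically), so a long near-vertical or near-horizontal edge gets twice the positional resolution.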

If rotated or 8 rooks patterns are a no-go, using 8x MSAA to reconstruct to an ordered 3x3 grid with a single 'hallucinated' sample might be interesting as well. (Mid sample, or perhaps changing location each frame.)
Rendering into a '640x360 buffer' nowadays sounds so perversely fun and old school.
 